Hi!
I added a standalone Redis instance to our Thanos setup to act as a query/index/bucket cache. The instance crashes almost daily, usually while a heavy metrics query is running.
Crash report
1:M 15 Jan 2023 18:37:25.910 * Background saving terminated with success
1:M 15 Jan 2023 18:42:26.089 * 100 changes in 300 seconds. Saving...
1:M 15 Jan 2023 18:42:26.120 * Background saving started by pid 149964
149964:C 15 Jan 2023 18:42:43.756 * DB saved on disk
149964:C 15 Jan 2023 18:42:43.796 * RDB: 30 MB of memory used by copy-on-write
1:M 15 Jan 2023 18:42:43.878 * Background saving terminated with success
1:M 15 Jan 2023 18:45:12.482 * 10000 changes in 60 seconds. Saving...
1:M 15 Jan 2023 18:45:12.514 * Background saving started by pid 150704
150704:C 15 Jan 2023 18:45:44.463 * DB saved on disk
150704:C 15 Jan 2023 18:45:44.506 * RDB: 264 MB of memory used by copy-on-write
1:M 15 Jan 2023 18:45:44.582 * Background saving terminated with success
=== REDIS BUG REPORT START: Cut & paste starting from here ===
1:M 15 Jan 2023 18:46:35.846 # === ASSERTION FAILED ===
1:M 15 Jan 2023 18:46:35.846 # ==> server.c:2442 'listLength(server.tracking_pending_keys) == 0' is not true
------ STACK TRACE ------
Backtrace:
redis-server *:6379(+0x4b364)[0x55e37ab75364]
redis-server *:6379(aeProcessEvents+0x1b0)[0x55e37ab71f10]
redis-server *:6379(aeMain+0x25)[0x55e37ab72285]
redis-server *:6379(main+0x326)[0x55e37ab6e2e6]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea)[0x7fdddb82bd0a]
redis-server *:6379(_start+0x2a)[0x55e37ab6e7ca]
------ INFO OUTPUT ------
# Server
redis_version:6.2.8
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:98494b835603ba45
redis_mode:standalone
os:Linux 5.10.135 x86_64
arch_bits:64
monotonic_clock:POSIX clock_gettime
multiplexing_api:epoll
atomicvar_api:c11-builtin
gcc_version:10.2.1
process_id:1
process_supervised:no
run_id:e2fb880485c1dbf8fc8c3f5ecdc72e4a6d20cfa7
tcp_port:6379
server_time_usec:1673808395841143
uptime_in_seconds:33794
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:12864011
executable:/redis-server
config_file:
io_threads_active:0
# Clients
connected_clients:68
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:1139
client_recent_max_output_buffer:1742840
blocked_clients:0
tracking_clients:20
clients_in_timeout_table:0
# Memory
used_memory:3650389304
used_memory_human:3.40G
used_memory_rss:4392181760
used_memory_rss_human:4.09G
used_memory_peak:3911698760
used_memory_peak_human:3.64G
used_memory_peak_perc:93.32%
used_memory_overhead:178157435
used_memory_startup:812000
used_memory_dataset:3472231869
used_memory_dataset_perc:95.14%
allocator_allocated:3650570168
allocator_active:4395503616
allocator_resident:4460855296
total_system_memory:33112240128
total_system_memory_human:30.84G
used_memory_lua:30720
used_memory_lua_human:30.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:16567500800
maxmemory_human:15.43G
maxmemory_policy:noeviction
allocator_frag_ratio:1.20
allocator_frag_bytes:744933448
allocator_rss_ratio:1.01
allocator_rss_bytes:65351680
rss_overhead_ratio:0.98
rss_overhead_bytes:-68673536
mem_fragmentation_ratio:1.20
mem_fragmentation_bytes:741676360
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:4081307
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:422745
rdb_bgsave_in_progress:0
rdb_last_save_time:1673808344
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:32
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:277131264
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
# Stats
total_connections_received:15902
total_commands_processed:6482917
instantaneous_ops_per_sec:147284
total_net_input_bytes:33956439657
total_net_output_bytes:34097392051
instantaneous_input_kbps:16382.47
instantaneous_output_kbps:29372.94
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:728821
expired_stale_perc:0.71
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:6585
evicted_keys:0
keyspace_hits:4091058
keyspace_misses:2367888
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:31658
total_forks:115
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:1003824
tracking_total_items:1908813
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:0
dump_payload_sanitizations:0
total_reads_processed:4683151
total_writes_processed:2556987
io_threaded_reads_processed:0
io_threaded_writes_processed:0
# Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:d9dc7936975ebdf775073000b2d961ae7567c0ea
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:98.411621
used_cpu_user:66.121712
used_cpu_sys_children:196.935224
used_cpu_user_children:1582.288453
used_cpu_sys_main_thread:94.889139
used_cpu_user_main_thread:66.041100
# Modules
# Commandstats
cmdstat_mget:calls=404558,usec=18771204,usec_per_call=46.40,rejected_calls=0,failed_calls=0
cmdstat_ping:calls=349214,usec=268704,usec_per_call=0.77,rejected_calls=0,failed_calls=0
cmdstat_config:calls=2253,usec=102889,usec_per_call=45.67,rejected_calls=0,failed_calls=0
cmdstat_info:calls=2253,usec=156118,usec_per_call=69.29,rejected_calls=0,failed_calls=0
cmdstat_latency:calls=2253,usec=3588,usec_per_call=1.59,rejected_calls=0,failed_calls=0
cmdstat_setex:calls=365534,usec=1880596,usec_per_call=5.14,rejected_calls=0,failed_calls=0
cmdstat_multi:calls=401960,usec=29502,usec_per_call=0.07,rejected_calls=0,failed_calls=0
cmdstat_select:calls=131,usec=73,usec_per_call=0.56,rejected_calls=0,failed_calls=0
cmdstat_hello:calls=20,usec=71,usec_per_call=3.55,rejected_calls=0,failed_calls=0
cmdstat_set:calls=925213,usec=4013269,usec_per_call=4.34,rejected_calls=0,failed_calls=0
cmdstat_slowlog:calls=4506,usec=12320,usec_per_call=2.73,rejected_calls=0,failed_calls=0
cmdstat_client:calls=398007,usec=141382,usec_per_call=0.36,rejected_calls=0,failed_calls=0
cmdstat_pttl:calls=3225059,usec=3944318,usec_per_call=1.22,rejected_calls=0,failed_calls=0
cmdstat_exec:calls=401956,usec=13672789,usec_per_call=34.02,rejected_calls=0,failed_calls=0
# Errorstats
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=1937917,expires=1937917,avg_ttl=47987770
db1:keys=176090,expires=176090,avg_ttl=60195014
db2:keys=2397,expires=2397,avg_ttl=70262785
------ CLIENT LIST OUTPUT ------
id=44 addr=10.140.84.24:42166 laddr=10.140.86.124:6379 fd=32 name= age=33746 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=0
id=33 addr=10.140.80.32:57870 laddr=10.140.86.124:6379 fd=23 name= age=33746 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=0
id=15820 addr=10.140.80.28:40420 laddr=10.140.86.124:6379 fd=33 name= age=89 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=34 addr=10.140.80.32:57886 laddr=10.140.86.124:6379 fd=24 name= age=33746 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=0
id=35 addr=10.140.80.32:57892 laddr=10.140.86.124:6379 fd=25 name= age=33746 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=0
id=8 addr=10.140.79.163:51988 laddr=10.140.86.124:6379 fd=10 name= age=33770 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=0
id=9 addr=10.140.79.163:39484 laddr=10.140.86.124:6379 fd=11 name= age=33770 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=0
id=15815 addr=10.140.76.252:40158 laddr=10.140.86.124:6379 fd=18 name= age=95 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15816 addr=10.140.76.252:40152 laddr=10.140.86.124:6379 fd=19 name= age=95 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=18 addr=10.140.80.32:40822 laddr=10.140.86.124:6379 fd=12 name= age=33756 idle=0 flags=xt db=0 sub=0 psub=0 multi=10000 qbuf=9 qbuf-free=40945 argv-mem=124302 obl=16358 oll=10 omem=205040 tot-mem=472702 events=r cmd=pttl user=default redir=0
id=19 addr=10.140.80.32:40832 laddr=10.140.86.124:6379 fd=13 name= age=33756 idle=0 flags=xt db=0 sub=0 psub=0 multi=10000 qbuf=22 qbuf-free=40932 argv-mem=13608 obl=16367 oll=7 omem=143528 tot-mem=300496 events=r cmd=pttl user=default redir=0
id=28 addr=10.140.84.24:42110 laddr=10.140.86.124:6379 fd=20 name= age=33750 idle=0 flags=t db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=16381 oll=12 omem=246048 tot-mem=307528 events=r cmd=set user=default redir=0
id=29 addr=10.140.84.24:42116 laddr=10.140.86.124:6379 fd=21 name= age=33749 idle=0 flags=xt db=0 sub=0 psub=0 multi=8928 qbuf=2 qbuf-free=40952 argv-mem=184486 obl=16358 oll=7 omem=143528 tot-mem=471374 events=r cmd=pttl user=default redir=0
id=20 addr=10.140.80.32:40808 laddr=10.140.86.124:6379 fd=14 name= age=33756 idle=0 flags=xt db=0 sub=0 psub=0 multi=8664 qbuf=29 qbuf-free=40925 argv-mem=224890 obl=16358 oll=8 omem=164032 tot-mem=532282 events=r cmd=pttl user=default redir=0
id=21 addr=10.140.80.32:40830 laddr=10.140.86.124:6379 fd=15 name= age=33756 idle=0 flags=t db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=16358 oll=7 omem=143528 tot-mem=164056 events=r cmd=set user=default redir=0
id=6 addr=10.140.79.163:51960 laddr=10.140.86.124:6379 fd=8 name= age=33771 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=0
id=7 addr=10.140.79.163:39468 laddr=10.140.86.124:6379 fd=9 name= age=33771 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=0
id=15817 addr=10.140.80.28:40404 laddr=10.140.86.124:6379 fd=26 name= age=92 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=32 addr=10.140.80.32:57868 laddr=10.140.86.124:6379 fd=22 name= age=33746 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=0
id=39 addr=10.140.84.24:42134 laddr=10.140.86.124:6379 fd=27 name= age=33746 idle=0 flags=t db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=16358 oll=9 omem=184536 tot-mem=246016 events=r cmd=set user=default redir=0
id=15823 addr=10.140.80.28:40444 laddr=10.140.86.124:6379 fd=34 name= age=83 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15824 addr=10.140.80.28:40434 laddr=10.140.86.124:6379 fd=35 name= age=83 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15825 addr=10.140.80.28:40496 laddr=10.140.86.124:6379 fd=36 name= age=83 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15826 addr=10.140.80.28:40470 laddr=10.140.86.124:6379 fd=37 name= age=83 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15827 addr=10.140.80.28:40504 laddr=10.140.86.124:6379 fd=38 name= age=83 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15828 addr=10.140.80.28:40486 laddr=10.140.86.124:6379 fd=39 name= age=83 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15829 addr=10.140.80.28:40430 laddr=10.140.86.124:6379 fd=40 name= age=83 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15830 addr=10.140.80.28:40466 laddr=10.140.86.124:6379 fd=41 name= age=83 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15831 addr=10.140.80.28:40450 laddr=10.140.86.124:6379 fd=42 name= age=83 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15832 addr=10.140.80.28:40508 laddr=10.140.86.124:6379 fd=43 name= age=83 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15833 addr=10.140.80.28:40476 laddr=10.140.86.124:6379 fd=44 name= age=83 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15834 addr=10.140.80.28:40480 laddr=10.140.86.124:6379 fd=45 name= age=83 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15835 addr=10.140.76.252:45972 laddr=10.140.86.124:6379 fd=46 name= age=83 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15836 addr=10.140.76.252:45982 laddr=10.140.86.124:6379 fd=47 name= age=83 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15837 addr=10.140.76.252:46004 laddr=10.140.86.124:6379 fd=48 name= age=83 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15838 addr=10.140.76.252:46010 laddr=10.140.86.124:6379 fd=49 name= age=83 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15839 addr=10.140.76.252:46024 laddr=10.140.86.124:6379 fd=50 name= age=83 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15840 addr=10.140.76.252:46034 laddr=10.140.86.124:6379 fd=51 name= age=83 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15841 addr=10.140.76.252:46012 laddr=10.140.86.124:6379 fd=52 name= age=83 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15849 addr=10.140.76.252:50612 laddr=10.140.86.124:6379 fd=58 name= age=76 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15850 addr=10.140.76.252:50614 laddr=10.140.86.124:6379 fd=59 name= age=76 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15851 addr=10.140.76.252:50626 laddr=10.140.86.124:6379 fd=60 name= age=76 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15852 addr=10.140.76.252:50636 laddr=10.140.86.124:6379 fd=61 name= age=76 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15853 addr=10.140.76.252:50652 laddr=10.140.86.124:6379 fd=62 name= age=76 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15854 addr=10.140.76.252:50632 laddr=10.140.86.124:6379 fd=63 name= age=76 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15855 addr=10.140.76.252:50664 laddr=10.140.86.124:6379 fd=64 name= age=76 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15856 addr=10.140.76.252:50692 laddr=10.140.86.124:6379 fd=65 name= age=76 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15857 addr=10.140.76.252:50688 laddr=10.140.86.124:6379 fd=66 name= age=76 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15858 addr=10.140.76.252:50704 laddr=10.140.86.124:6379 fd=67 name= age=76 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15859 addr=10.140.76.252:50642 laddr=10.140.86.124:6379 fd=68 name= age=76 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15860 addr=10.140.76.252:50674 laddr=10.140.86.124:6379 fd=69 name= age=76 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15861 addr=10.140.76.252:50710 laddr=10.140.86.124:6379 fd=70 name= age=76 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
id=15862 addr=10.140.76.252:50702 laddr=10.140.86.124:6379 fd=71 name= age=76 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15863 addr=10.140.76.252:50718 laddr=10.140.86.124:6379 fd=72 name= age=76 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15864 addr=10.140.76.252:50714 laddr=10.140.86.124:6379 fd=73 name= age=76 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15842 addr=10.140.76.252:45956 laddr=10.140.86.124:6379 fd=53 name= age=83 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15843 addr=10.140.76.252:46014 laddr=10.140.86.124:6379 fd=54 name= age=83 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15844 addr=10.140.76.252:45992 laddr=10.140.86.124:6379 fd=55 name= age=83 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=40 addr=10.140.84.24:42130 laddr=10.140.86.124:6379 fd=28 name= age=33746 idle=0 flags=t db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=16358 oll=6 omem=123024 tot-mem=184504 events=r cmd=set user=default redir=0
id=41 addr=10.140.84.24:42150 laddr=10.140.86.124:6379 fd=29 name= age=33746 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61448 events=r cmd=exec user=default redir=0
id=42 addr=10.140.84.24:42160 laddr=10.140.86.124:6379 fd=30 name= age=33746 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61448 events=r cmd=exec user=default redir=0
id=43 addr=10.140.84.24:42164 laddr=10.140.86.124:6379 fd=31 name= age=33746 idle=0 flags=t db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61448 events=r cmd=exec user=default redir=0
id=15887 addr=10.140.76.252:55852 laddr=10.140.86.124:6379 fd=56 name= age=28 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15888 addr=10.140.76.252:55868 laddr=10.140.86.124:6379 fd=57 name= age=28 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15889 addr=10.140.76.252:55874 laddr=10.140.86.124:6379 fd=74 name= age=28 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15890 addr=10.140.76.252:55866 laddr=10.140.86.124:6379 fd=75 name= age=28 idle=27 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15809 addr=10.140.76.252:40146 laddr=10.140.86.124:6379 fd=16 name= age=102 idle=1 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=mget user=default redir=-1
id=15408 addr=10.140.76.252:58504 laddr=10.140.86.124:6379 fd=17 name= age=961 idle=28 flags=N db=2 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=exec user=default redir=-1
------ MODULES INFO OUTPUT ------
------ FAST MEMORY TEST ------
1:M 15 Jan 2023 18:46:35.847 # Bio thread for job type #0 terminated
1:M 15 Jan 2023 18:46:35.847 # Bio thread for job type #1 terminated
1:M 15 Jan 2023 18:46:35.847 # Bio thread for job type #2 terminated
*** Preparing to test memory region 55e37ad3d000 (2281472 bytes)
*** Preparing to test memory region 55e37c806000 (270336 bytes)
*** Preparing to test memory region 7fdc7c400000 (5437915136 bytes)
*** Preparing to test memory region 7fddc07fc000 (411041792 bytes)
*** Preparing to test memory region 7fddd8ffd000 (8388608 bytes)
*** Preparing to test memory region 7fddd97fe000 (8388608 bytes)
*** Preparing to test memory region 7fddd9fff000 (8388608 bytes)
*** Preparing to test memory region 7fddda800000 (8388608 bytes)
*** Preparing to test memory region 7fdddb000000 (8388608 bytes)
*** Preparing to test memory region 7fdddb802000 (24576 bytes)
*** Preparing to test memory region 7fdddb9d9000 (16384 bytes)
*** Preparing to test memory region 7fdddb9fb000 (16384 bytes)
*** Preparing to test memory region 7fdddbcef000 (16384 bytes)
*** Preparing to test memory region 7fdddbed0000 (8192 bytes)
*** Preparing to test memory region 7fdddbf06000 (4096 bytes)
.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O
Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.
=== REDIS BUG REPORT END. Make sure to include from START to END. ===
Additional information
OS distribution and version
- Worker OS: Bottlerocket OS 1.10.1 (aws-k8s-1.21)
Steps to reproduce (if any)
- Use a standalone instance (16 GB RAM / 1 CPU) as a query / index / bucket cache for Thanos (three databases).
- Run a heavy query with Grafana, which causes Thanos components to both read from and write to the cache.
Comment From: oranagra
I can't figure it out. I tried comparing 6.2.8 to 7.0 to see whether there's a bug that's already been fixed. I also looked at sensitive areas like blocking commands, but this deployment doesn't use them (and no modules are in use). I suspected it might somehow be related to the use of multi-exec with multiple databases, but I don't see how that could cause this issue. @madolson @soloestoy @guybe7 @huangzhw maybe you can think of something.
Comment From: yossigo
@oranagra Could it be related to key expiration in an unexpected or uncommon flow?
Comment From: ranshid
I think I have a theory. Some clients have a multi queue of 10K commands, so maybe that made trackingLimitUsedSlots (which limits the size of the tracking table) push invalidations, after which processCommand bailed out without calling after_command. I will try to reproduce it. I am just having a hard time understanding what could cause processCommand to bail.
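The observation about multi queue sizes can be checked against the CLIENT LIST section of the crash report. Below is a minimal Python sketch, assuming the lines are available as plain text. `parse_client_line` and `tracked_clients_in_multi` are hypothetical helper names, not part of Redis or any client library, and the sample lines are abbreviated copies of three lines from the report.

```python
def parse_client_line(line):
    """Split one CLIENT LIST line into a dict of field -> string value."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def tracked_clients_in_multi(lines, min_queued=1000):
    """Return (id, queued) for tracking clients with a large MULTI queue.

    In CLIENT LIST, flag 't' marks a client with tracking enabled, flag 'x'
    marks a client inside MULTI/EXEC, and 'multi' is the number of queued
    commands (-1 when the client is not in a transaction).
    """
    result = []
    for line in lines:
        c = parse_client_line(line)
        flags = c.get("flags", "")
        queued = int(c.get("multi", "-1"))
        if "t" in flags and "x" in flags and queued >= min_queued:
            result.append((int(c["id"]), queued))
    return result


# Abbreviated lines from the CLIENT LIST OUTPUT section of the report.
sample = [
    "id=18 addr=10.140.80.32:40822 fd=12 name= age=33756 idle=0 flags=xt db=0 multi=10000 cmd=pttl",
    "id=28 addr=10.140.84.24:42110 fd=20 name= age=33750 idle=0 flags=t db=0 multi=-1 cmd=set",
    "id=29 addr=10.140.84.24:42116 fd=21 name= age=33749 idle=0 flags=xt db=0 multi=8928 cmd=pttl",
]
print(tracked_clients_in_multi(sample))  # [(18, 10000), (29, 8928)]
```

The `min_queued` threshold of 1000 is an arbitrary choice for illustration. All clients it flags in the report are on db0 and have tracking enabled.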
Comment From: madolson
Also, just to reinforce that theory: the default max tracked items is 1 million, and this node crashed with tracking_total_keys:1003824, which is just over the limit.
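The arithmetic can be reproduced from the INFO section of the report. This is a rough Python sketch. It assumes the instance ran with the default `tracking-table-max-keys` of 1000000, which the report does not show (config_file is empty). `parse_info` and `tracking_overshoot` are hypothetical helper names.

```python
# Default value of tracking-table-max-keys in redis.conf; assumed to be in
# effect here because the crash report does not include the configuration.
DEFAULT_TRACKING_TABLE_MAX_KEYS = 1_000_000


def parse_info(text):
    """Parse 'key:value' lines from INFO output, skipping '# Section' headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def tracking_overshoot(info, limit=DEFAULT_TRACKING_TABLE_MAX_KEYS):
    """Number of tracked keys above the limit (negative means below it)."""
    return int(info["tracking_total_keys"]) - limit


# Values copied from the INFO OUTPUT section of the report.
report = """
# Clients
tracking_clients:20
# Stats
tracking_total_keys:1003824
tracking_total_items:1908813
"""
info = parse_info(report)
print(tracking_overshoot(info))  # 3824
```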
Comment From: madolson
Also, unless I'm really off, won't just a regular queued command in a multi-exec cause this?
Comment From: madolson
~~I don't have an answer, but~~ I managed to corrupt the reply protocol:
1. Reduce the maximum number of tracked items to 5.
2. On client one, execute:
client tracking on
mget foo1 foo2 foo3 foo4 foo6
multi
3. On client two, execute:
client tracking on
mget bar1 bar2 bar3 bar4 bar5
4. Then, back on client one, execute:
ping
exec
The multi-response gets interleaved with the invalidations:
127.0.0.1:6379> client tracking on
OK
127.0.0.1:6379> mget foo1 foo2 foo3 foo4 foo6
1) "5"
2) (nil)
3) (nil)
4) (nil)
5) (nil)
127.0.0.1:6379> multi
OK
127.0.0.1:6379(TX)> ping
1) "invalidate"
2) 1) "foo3"
127.0.0.1:6379(TX)> exec
1) "invalidate"
2) 1) "foo2"
Note that these have to be in the same event loop, so I disabled the serverCron while running this test. EDIT: This can also cause it to crash. Let me write a TCL case to reproduce this.
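For anyone who wants to script the sequence from this comment, here is a hedged Python sketch that only builds the RESP payloads in the order described and does not connect to a server. Two parts are assumptions and not from the thread: that `tracking-table-max-keys` is the setting reduced to 5, and that both clients first send `HELLO 3` so invalidations arrive as push messages on the same connection. Since the interleaving was only observed with serverCron disabled, replaying these bytes against a stock server may not reproduce it.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


# (connection label, command) pairs in the order given in the comment.
# The labels are hypothetical; "setup" can be any connection.
STEPS = [
    ("setup", ("CONFIG", "SET", "tracking-table-max-keys", "5")),  # assumed setting
    ("one", ("HELLO", "3")),  # assumption: RESP3 for in-band invalidations
    ("one", ("CLIENT", "TRACKING", "ON")),
    ("one", ("MGET", "foo1", "foo2", "foo3", "foo4", "foo6")),
    ("one", ("MULTI",)),
    ("two", ("HELLO", "3")),  # assumption, as above
    ("two", ("CLIENT", "TRACKING", "ON")),
    ("two", ("MGET", "bar1", "bar2", "bar3", "bar4", "bar5")),
    ("one", ("PING",)),
    ("one", ("EXEC",)),
]


def payloads(steps):
    """Return (connection label, encoded bytes) for each step, in order."""
    return [(client, encode_command(*cmd)) for client, cmd in steps]


for client, payload in payloads(STEPS):
    print(client, payload)
```

Each payload would be written to the socket of the connection named by its label. Reading and checking the replies is left out of this sketch.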