Crash report
Paste the complete crash log between the quotes below. Please include a few lines from the log preceding the crash report to provide some context.
4107:M 15 Oct 2021 15:50:53.052 * 10000 changes in 60 seconds. Saving...
4107:M 15 Oct 2021 15:50:54.259 * Background saving started by pid 4208
4208:C 15 Oct 2021 15:54:38.954 * DB saved on disk
4107:M 15 Oct 2021 15:54:39.897 * Background saving terminated with success
4107:M 15 Oct 2021 15:55:40.011 * 10000 changes in 60 seconds. Saving...
4107:M 15 Oct 2021 15:55:41.430 * Background saving started by pid 4209
4209:C 15 Oct 2021 15:59:00.674 * DB saved on disk
4107:M 15 Oct 2021 15:59:01.429 * Background saving terminated with success
4107:M 15 Oct 2021 16:00:02.052 * 10000 changes in 60 seconds. Saving...
4107:M 15 Oct 2021 16:00:03.216 * Background saving started by pid 4210
=== REDIS BUG REPORT START: Cut & paste starting from here ===
4210:C 15 Oct 2021 16:01:19.714 # Redis 6.2.6 crashed by signal: 11, si_code: 2
4210:C 15 Oct 2021 16:01:19.733 # Accessing address: 0x7f23d688c000
4210:C 15 Oct 2021 16:01:19.734 # Crashed running the instruction at: 0x7f2914245ff2
------ STACK TRACE ------
EIP:
redis-rdb-bgsave 127.0.0.1:6379(lzf_compress+0x192)[0x7f2914245ff2]
Backtrace:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7f2913c123c0]
redis-rdb-bgsave 127.0.0.1:6379(lzf_compress+0x192)[0x7f2914245ff2]
redis-rdb-bgsave 127.0.0.1:6379(+0x7dbb9)[0x7f2914268bb9]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveRawString+0x105)[0x7f2914268fe5]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveObject+0x1cd)[0x7f2914269ffd]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveKeyValuePair+0x89)[0x7f291426a759]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveRio+0x248)[0x7f291426aed8]
redis-rdb-bgsave 127.0.0.1:6379(rdbSave+0xc3)[0x7f291426d933]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveBackground+0xc3)[0x7f291426dc53]
redis-rdb-bgsave 127.0.0.1:6379(serverCron+0x5e0)[0x7f2914241ed0]
redis-rdb-bgsave 127.0.0.1:6379(aeProcessEvents+0x129)[0x7f2914233cb9]
redis-rdb-bgsave 127.0.0.1:6379(aeMain+0x1d)[0x7f29142340ed]
redis-rdb-bgsave 127.0.0.1:6379(main+0x337)[0x7f29142304b7]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f2913a170b3]
redis-rdb-bgsave 127.0.0.1:6379(_start+0x2e)[0x7f29142309fe]
------ REGISTERS ------
4210:C 15 Oct 2021 16:01:20.316 #
RAX:00007f23cbb20a33 RBX:00007f23ce92f773
RCX:0000000000000003 RDX:00007f23d4f09b00
RDI:00007f23c8eb14c9 RSI:000000000000049e
RBP:00007fffef0f1ef0 RSP:00007fffef0f1ee0
R8 :00007f23d688c000 R9 :00007f23d688c001
R10:00007f23da987da8 R11:00007f23ce92f775
R12:0000000000000004 R13:0000000000000024
R14:0000000000000108 R15:0000000000000002
RIP:00007f2914245ff2 EFL:0000000000010206
CSGSFS:00000053002b0033
4210:C 15 Oct 2021 16:01:20.316 # (00007fffef0f1eef) -> 02aa9f4e02bf9b9d
4210:C 15 Oct 2021 16:01:20.317 # (00007fffef0f1eee) -> 02ae8a0802c523fe
4210:C 15 Oct 2021 16:01:20.317 # (00007fffef0f1eed) -> 02b4ebe602c64c86
4210:C 15 Oct 2021 16:01:20.317 # (00007fffef0f1eec) -> 02b50c5602b42443
4210:C 15 Oct 2021 16:01:20.318 # (00007fffef0f1eeb) -> 02c5acc502c5836b
4210:C 15 Oct 2021 16:01:20.318 # (00007fffef0f1eea) -> 02c2e1df02bf85ea
4210:C 15 Oct 2021 16:01:20.318 # (00007fffef0f1ee9) -> 02b6678f02c4b561
4210:C 15 Oct 2021 16:01:20.319 # (00007fffef0f1ee8) -> 02c5f79d02c3ec8c
4210:C 15 Oct 2021 16:01:20.319 # (00007fffef0f1ee7) -> 02bfa8ed02c4e6ca
4210:C 15 Oct 2021 16:01:20.320 # (00007fffef0f1ee6) -> 0292308402bc30c5
4210:C 15 Oct 2021 16:01:20.320 # (00007fffef0f1ee5) -> 02a3b66f02bcc717
4210:C 15 Oct 2021 16:01:20.320 # (00007fffef0f1ee4) -> 02c2b62f02b577b7
4210:C 15 Oct 2021 16:01:20.320 # (00007fffef0f1ee3) -> 02bfdb9a02bb3a8b
4210:C 15 Oct 2021 16:01:20.321 # (00007fffef0f1ee2) -> 02bf9ba302c0ad76
4210:C 15 Oct 2021 16:01:20.321 # (00007fffef0f1ee1) -> 00000000000013f6
4210:C 15 Oct 2021 16:01:20.321 # (00007fffef0f1ee0) -> 0000000000000000
------ INFO OUTPUT ------
# Server
redis_version:6.2.6
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:e15ab29abece34d
redis_mode:standalone
os:Linux 4.4.0-17763-Microsoft x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:c11-builtin
gcc_version:9.3.0
process_id:4210
process_supervised:no
run_id:a6ae669ee37e7fa9ee420f247ef96fb227773317
tcp_port:6379
server_time_usec:1634313602052831
uptime_in_seconds:27493
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:6923650
executable:/mnt/d/redis/redis-server
config_file:/mnt/d/redis/./redis.conf
io_threads_active:0
# Clients
connected_clients:4
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:458792
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
# Memory
used_memory:16528657104
used_memory_human:15.39G
used_memory_rss:17111629824
used_memory_rss_human:15.94G
used_memory_peak:16528657104
used_memory_peak_human:15.39G
used_memory_peak_perc:100.09%
used_memory_overhead:231737568
used_memory_startup:810024
used_memory_dataset:16296919536
used_memory_dataset_perc:98.60%
allocator_allocated:16428135384
allocator_active:18791809024
allocator_resident:18955481088
total_system_memory:68541775872
total_system_memory_human:63.83G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.14
allocator_frag_bytes:2363673640
allocator_rss_ratio:1.01
allocator_rss_bytes:163672064
rss_overhead_ratio:0.90
rss_overhead_bytes:-1843851264
mem_fragmentation_ratio:1.04
mem_fragmentation_bytes:683637824
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:123000
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:148893
rdb_bgsave_in_progress:0
rdb_last_save_time:1634313541
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:200
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:0
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
# Stats
total_connections_received:4
total_commands_processed:13481499
instantaneous_ops_per_sec:703
total_net_input_bytes:44096183651
total_net_output_bytes:4183144606
instantaneous_input_kbps:698.84
instantaneous_output_kbps:178.73
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:22389
expired_stale_perc:0.49
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:2140
evicted_keys:0
keyspace_hits:189119
keyspace_misses:35371
pubsub_channels:1
pubsub_patterns:0
latest_fork_usec:1418434
total_forks:106
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:4
dump_payload_sanitizations:0
total_reads_processed:13079322
total_writes_processed:11862000
io_threaded_reads_processed:0
io_threaded_writes_processed:0
# Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:2bb0ae2281d18746402afefb05a66bc6eb7385a1
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:8.953125
used_cpu_user:12.265625
used_cpu_sys_children:0.000000
used_cpu_user_children:0.000000
used_cpu_sys_main_thread:8.953125
used_cpu_user_main_thread:12.265625
# Modules
# Commandstats
cmdstat_subscribe:calls=2,usec=9,usec_per_call=4.50,rejected_calls=0,failed_calls=0
cmdstat_setex:calls=13199980,usec=132908817,usec_per_call=10.07,rejected_calls=0,failed_calls=0
cmdstat_exists:calls=127278,usec=499418,usec_per_call=3.92,rejected_calls=0,failed_calls=0
cmdstat_expire:calls=3713,usec=18233,usec_per_call=4.91,rejected_calls=0,failed_calls=0
cmdstat_config:calls=6,usec=88,usec_per_call=14.67,rejected_calls=0,failed_calls=0
cmdstat_cluster:calls=2,usec=6,usec_per_call=3.00,rejected_calls=0,failed_calls=2
cmdstat_pttl:calls=276,usec=820,usec_per_call=2.97,rejected_calls=0,failed_calls=0
cmdstat_ping:calls=908,usec=1559,usec_per_call=1.72,rejected_calls=0,failed_calls=0
cmdstat_client:calls=4,usec=12,usec_per_call=3.00,rejected_calls=0,failed_calls=0
cmdstat_incrby:calls=9,usec=48,usec_per_call=5.33,rejected_calls=0,failed_calls=0
cmdstat_pexpireat:calls=41,usec=141,usec_per_call=3.44,rejected_calls=0,failed_calls=0
cmdstat_pfadd:calls=22,usec=158,usec_per_call=7.18,rejected_calls=0,failed_calls=0
cmdstat_zadd:calls=9,usec=19766,usec_per_call=2196.22,rejected_calls=0,failed_calls=0
cmdstat_incr:calls=44437,usec=352023,usec_per_call=7.92,rejected_calls=0,failed_calls=0
cmdstat_info:calls=910,usec=23106,usec_per_call=25.39,rejected_calls=0,failed_calls=0
cmdstat_psetex:calls=18,usec=188,usec_per_call=10.44,rejected_calls=0,failed_calls=0
cmdstat_set:calls=35,usec=382,usec_per_call=10.91,rejected_calls=0,failed_calls=0
cmdstat_zrank:calls=18,usec=57,usec_per_call=3.17,rejected_calls=0,failed_calls=0
cmdstat_zcard:calls=18,usec=37,usec_per_call=2.06,rejected_calls=0,failed_calls=0
cmdstat_pfcount:calls=63,usec=31947,usec_per_call=507.10,rejected_calls=0,failed_calls=0
cmdstat_zscore:calls=22,usec=17011,usec_per_call=773.23,rejected_calls=0,failed_calls=0
cmdstat_zrevrange:calls=7,usec=20706,usec_per_call=2958.00,rejected_calls=0,failed_calls=0
cmdstat_unlink:calls=6846,usec=34093,usec_per_call=4.98,rejected_calls=0,failed_calls=0
cmdstat_echo:calls=4,usec=5,usec_per_call=1.25,rejected_calls=0,failed_calls=0
cmdstat_get:calls=96871,usec=3046990,usec_per_call=31.45,rejected_calls=0,failed_calls=0
# Errorstats
errorstat_ERR:count=4
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=2557844,expires=2557580,avg_ttl=5724072069
------ CLIENT LIST OUTPUT ------
id=3 addr=127.0.0.1:52701 laddr=127.0.0.1:6379 fd=8 name=89FE2A3 age=27449 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61480 events=r cmd=setex user=default redir=-1
id=4 addr=127.0.0.1:52702 laddr=127.0.0.1:6379 fd=9 name=89FE2A3 age=27449 idle=5 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=-1
id=5 addr=127.0.0.1:52711 laddr=127.0.0.1:6379 fd=10 name=89FE2A3 age=27415 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20528 events=r cmd=setex user=default redir=-1
id=6 addr=127.0.0.1:52714 laddr=127.0.0.1:6379 fd=11 name=89FE2A3 age=27415 idle=43 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=-1
------ MODULES INFO OUTPUT ------
------ FAST MEMORY TEST ------
*** Preparing to test memory region 7f2393000000 (23280484352 bytes)
*** Preparing to test memory region 7f28feb6a000 (4096 bytes)
*** Preparing to test memory region 7f28feb70000 (301989888 bytes)
*** Preparing to test memory region 7f2910b71000 (8388608 bytes)
*** Preparing to test memory region 7f2911381000 (8388608 bytes)
*** Preparing to test memory region 7f2911b91000 (8388608 bytes)
*** Preparing to test memory region 7f29123a1000 (8388608 bytes)
*** Preparing to test memory region 7f2913000000 (8404992 bytes)
*** Preparing to test memory region 7f2913810000 (8192 bytes)
*** Preparing to test memory region 7f2913842000 (4096 bytes)
*** Preparing to test memory region 7f2913850000 (8192 bytes)
*** Preparing to test memory region 7f2913979000 (20480 bytes)
*** Preparing to test memory region 7f29139a0000 (4096 bytes)
*** Preparing to test memory region 7f29139d6000 (4096 bytes)
*** Preparing to test memory region 7f29139ea000 (4096 bytes)
*** Preparing to test memory region 7f2913bdb000 (12288 bytes)
*** Preparing to test memory region 7f2913bde000 (16384 bytes)
*** Preparing to test memory region 7f2913bf0000 (8192 bytes)
*** Preparing to test memory region 7f2913c1b000 (4096 bytes)
*** Preparing to test memory region 7f2913c1c000 (16384 bytes)
*** Preparing to test memory region 7f2913ef0000 (8192 bytes)
*** Preparing to test memory region 7f2913ef2000 (16384 bytes)
*** Preparing to test memory region 7f2913f8f000 (16384 bytes)
*** Preparing to test memory region 7f291404b000 (4096 bytes)
*** Preparing to test memory region 7f291404c000 (4096 bytes)
*** Preparing to test memory region 7f2914055000 (4096 bytes)
*** Preparing to test memory region 7f29141af000 (4096 bytes)
*** Preparing to test memory region 7f29141de000 (4096 bytes)
*** Preparing to test memory region 7f29141df000 (12288 bytes)
*** Preparing to test memory region 7f29143f1000 (49152 bytes)
*** Preparing to test memory region 7f29143fd000 (2281472 bytes)
*** Preparing to test memory region 7fffe838c000 (270336 bytes)
.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O
Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.
------ DUMPING CODE AROUND EIP ------
Symbol: lzf_compress (base: 0x7f2914245e60)
Module: redis-rdb-bgsave 127.0.0.1:6379 (base 0x7f29141eb000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x7f2914245e60 -D -b binary -m i386:x86-64 /tmp/dump.bin
------
4210:C 15 Oct 2021 16:04:20.807 # dump of function (hexdump of 416 bytes):
f30f1efa415741564155415455534c8d9c240000fcff4881ec0010000048830c24004c39dc75ef4883ec2864488b042528000000488984241800040031c085f60f84da01000085c90f84d2010000440fb72789f689c94c8d42014c8d1c374c8d140a6641c1c408498d5bfe450fb7e44839fb0f867f0200004889f84531ed488d6c241041be08010000eb390f1f4400004d39c20f8687010000488d70010fb6004183c5014d8d48014188004183fd200f849b0100004839f30f86aa0100004d89c84889f04489e1440fb660024989c7c1e1084929ff4589e14109ccc1e908438d34a429f10fb7c9488d748d008b0e44893e4889c64801f94829ce4883ee014881feff1f000077894839cf7384443a49020f857affffff440fb708664439090f856cffffff4d89d94d8d60044929c14183e9024d39e20f86cd0000004589ec458d7dff41f7dc4d63e447887c20ff4531e44585ed410f94c44d29e04183f9100f86c6010000440fb67803443879030f8463010000b90300000041bf0200000041bd200000004989f44d8d480149c1ec084501e5458828418831488d34084983c102
=== REDIS BUG REPORT END. Make sure to include from START to END. ===
Additional information
- OS distribution and version
- Windows Server 2019 version 1809 build 17763.2213 running Redis on WSL1
- Output of
uname -afrom WSL:Linux 89FE2A3 4.4.0-17763-Microsoft #2183-Microsoft Sat Sep 11 15:52:00 PST 2021 x86_64 x86_64 x86_64 GNU/Linux - Steps to reproduce (if any) - very difficult to reproduce, happens after about 10 hours of runtime.
Comment From: oranagra
@paviad do you have any idea how to to reproduce that crash? Did redis used to work for you in that configuration (WSL) in the past? Did you build redis yourself or downloaded it in binary form?
Comment From: paviad
- I do not know how to reproduce the crash - it happens randomly, and rarely
- Yes I had redis 5.0.7 working on WSL1 on my previous machine with the same operation profile for about 2 years, with absolutely no issues. After we've migrated to the new machine which is identical to the previous, except smaller disk, (1TB vs 4TB) and less memory (64GB vs 192GB) we encountered these crashes on 5.0.7, so we upgrades to 6.0.6 and then to 6.2.6 with the same phenomenon (rare crashes). The old machine had more memory because it also used to host our MySQL database, but this new machine does not, we have a separate machine for that now.
- I did not build myself, I used ubuntu's "apt" to get version 6.2.6 following the instructions on the redis website.
On Sat, Oct 16, 2021 at 6:18 PM Oran Agra @.***> wrote:
@paviad https://github.com/paviad do you have any idea how to to reproduce that crash? Did redis used to work for you in that configuration (WSL) in the past? Did you build redis yourself or downloaded it in binary form?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/redis/redis/issues/9635#issuecomment-944931468, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJNV2H4GUWRHCUZGRVWKQLUHGJTNANCNFSM5GCLGTXQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
Comment From: oranagra
Thank you for the info. The process that crashes is the BGSAVE child, I'm curious to know if the next bgsave succeeds? or is that parent process doomed from that moment on?
Comment From: paviad
One instance was the bg save, and in that instance, redis did continue. However other instances are not the bg save but rather the main threads, and in those cases redis crashed and stayed down.
On Sat, Oct 16, 2021 at 10:29 PM Oran Agra @.***> wrote:
Thank you for the info. The process that crashes is the BGSAVE child, I'm curious to know if the next bgsave succeeds? or is that parent process doomed from that moment on?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/redis/redis/issues/9635#issuecomment-944991784, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJNV2B2M7K4QCBT5X2VA2TUHHHB7ANCNFSM5GCLGTXQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
Comment From: paviad
Here's the start of another crash (not the bg thread) - this is from 6.0.6 tho:
=== REDIS BUG REPORT START: Cut & paste starting from here ===
2029:M 14 Oct 2021 19:15:03.083 # Redis 6.0.6 crashed by signal: 11
2029:M 14 Oct 2021 19:15:03.083 # Crashed running the instruction at: 0x7fbe13f6e533
2029:M 14 Oct 2021 19:15:03.084 # Accessing address: 0x7fb9122f3000
2029:M 14 Oct 2021 19:15:03.084 # Failed assertion: <no assertion failed> (<no file>:0)
------ STACK TRACE ------
EIP:
/lib/x86_64-linux-gnu/libc.so.6(+0x18e533)[0x7fbe13f6e533]
Backtrace:
redis-server 127.0.0.1:6379(logStackTrace+0x4f)[0x7fbe145aebcf]
redis-server 127.0.0.1:6379(sigsegvHandler+0xb5)[0x7fbe145af3d5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fbe13ff23c0]
/lib/x86_64-linux-gnu/libc.so.6(+0x18e533)[0x7fbe13f6e533]
redis-server 127.0.0.1:6379(sdsMakeRoomFor+0xef)[0x7fbe1456c24f]
redis-server 127.0.0.1:6379(processMultibulkBuffer+0x484)[0x7fbe14577ad4]
redis-server 127.0.0.1:6379(processInputBuffer+0x16a)[0x7fbe1457a37a]
redis-server 127.0.0.1:6379(+0xe10ac)[0x7fbe145ff0ac]
redis-server 127.0.0.1:6379(aeProcessEvents+0x303)[0x7fbe1455fb83]
redis-server 127.0.0.1:6379(aeMain+0x1d)[0x7fbe1455febd]
redis-server 127.0.0.1:6379(main+0x4e5)[0x7fbe1455c3d5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fbe13e070b3]
redis-server 127.0.0.1:6379(_start+0x2e)[0x7fbe1455c6ae]
------ INFO OUTPUT ------
...
Comment From: paviad
Another bg save crash just happened:
=== REDIS BUG REPORT START: Cut & paste starting from here ===
5186:C 16 Oct 2021 20:10:29.221 # Redis 6.2.6 crashed by signal: 11, si_code: 2
5186:C 16 Oct 2021 20:10:29.225 # Accessing address: 0x7f8434d46000
5186:C 16 Oct 2021 20:10:29.225 # Crashed running the instruction at: 0x7f8969f68fa8
------ STACK TRACE ------
EIP:
redis-rdb-bgsave 127.0.0.1:6379(lzf_compress+0x148)[0x7f8969f68fa8]
Backtrace:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7f89699323c0]
redis-rdb-bgsave 127.0.0.1:6379(lzf_compress+0x148)[0x7f8969f68fa8]
redis-rdb-bgsave 127.0.0.1:6379(+0x7dbb9)[0x7f8969f8bbb9]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveRawString+0x105)[0x7f8969f8bfe5]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveObject+0x1cd)[0x7f8969f8cffd]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveKeyValuePair+0x89)[0x7f8969f8d759]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveRio+0x248)[0x7f8969f8ded8]
redis-rdb-bgsave 127.0.0.1:6379(rdbSave+0xc3)[0x7f8969f90933]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveBackground+0xc3)[0x7f8969f90c53]
redis-rdb-bgsave 127.0.0.1:6379(serverCron+0x5e0)[0x7f8969f64ed0]
redis-rdb-bgsave 127.0.0.1:6379(aeProcessEvents+0x129)[0x7f8969f56cb9]
redis-rdb-bgsave 127.0.0.1:6379(aeMain+0x1d)[0x7f8969f570ed]
redis-rdb-bgsave 127.0.0.1:6379(main+0x337)[0x7f8969f534b7]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f89697370b3]
redis-rdb-bgsave 127.0.0.1:6379(_start+0x2e)[0x7f8969f539fe]
------ REGISTERS ------
5186:C 16 Oct 2021 20:10:30.005 #
RAX:00007f843478f6b6 RBX:00007f8434908dd1
RCX:00007f843478f57f RDX:00007f8434cc8480
RDI:00007f8434649b89 RSI:0000000000000136
RBP:00007fffda67ed70 RSP:00007fffda67ed60
R8 :00007f8434d46001 R9 :000000000017971b
R10:00007f8434f876c6 R11:00007f8434908dd3
R12:0000000000000000 R13:0000000000000000
R14:0000000000000108 R15:00000000ffffffff
RIP:00007f8969f68fa8 EFL:0000000000010246
CSGSFS:00000053002b0033
5186:C 16 Oct 2021 20:10:30.005 # (00007fffda67ed6f) -> 00000d0200000ed6
5186:C 16 Oct 2021 20:10:30.005 # (00007fffda67ed6e) -> 0002ca260000050a
5186:C 16 Oct 2021 20:10:30.006 # (00007fffda67ed6d) -> 000006c000000922
5186:C 16 Oct 2021 20:10:30.006 # (00007fffda67ed6c) -> 0000067800000442
5186:C 16 Oct 2021 20:10:30.006 # (00007fffda67ed6b) -> 000019dd000003d7
5186:C 16 Oct 2021 20:10:30.006 # (00007fffda67ed6a) -> 00000d760000019d
5186:C 16 Oct 2021 20:10:30.007 # (00007fffda67ed69) -> 0000092000001879
5186:C 16 Oct 2021 20:10:30.007 # (00007fffda67ed68) -> 000008ae0000188e
5186:C 16 Oct 2021 20:10:30.008 # (00007fffda67ed67) -> 00000b33000016b0
5186:C 16 Oct 2021 20:10:30.008 # (00007fffda67ed66) -> 000018fd00000cd2
5186:C 16 Oct 2021 20:10:30.008 # (00007fffda67ed65) -> 000000820000075b
5186:C 16 Oct 2021 20:10:30.009 # (00007fffda67ed64) -> 00062938000018f7
5186:C 16 Oct 2021 20:10:30.009 # (00007fffda67ed63) -> 000001830000160f
5186:C 16 Oct 2021 20:10:30.009 # (00007fffda67ed62) -> 000006ae00000ba0
5186:C 16 Oct 2021 20:10:30.010 # (00007fffda67ed61) -> 000000000000039f
5186:C 16 Oct 2021 20:10:30.010 # (00007fffda67ed60) -> 0000000000000000
------ INFO OUTPUT ------
# Server
redis_version:6.2.6
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:e15ab29abece34d
redis_mode:standalone
os:Linux 4.4.0-17763-Microsoft x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:c11-builtin
gcc_version:9.3.0
process_id:5186
process_supervised:no
run_id:dcdffab5dbf1d5cd919405b5c36c46848c583285
tcp_port:6379
server_time_usec:1634414840060874
uptime_in_seconds:52570
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:7024888
executable:/mnt/d/redis/redis-server
config_file:/mnt/d/redis/./redis.conf
io_threads_active:0
# Clients
connected_clients:4
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:81936
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
# Memory
used_memory:16387230680
used_memory_human:15.26G
used_memory_rss:12710768640
used_memory_rss_human:11.84G
used_memory_peak:16591241368
used_memory_peak_human:15.45G
used_memory_peak_perc:98.77%
used_memory_overhead:231644400
used_memory_startup:810024
used_memory_dataset:16155586280
used_memory_dataset_perc:98.59%
allocator_allocated:16384211384
allocator_active:18759921664
allocator_resident:18923114496
total_system_memory:68541775872
total_system_memory_human:63.83G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.14
allocator_frag_bytes:2375710280
allocator_rss_ratio:1.01
allocator_rss_bytes:163192832
rss_overhead_ratio:0.67
rss_overhead_bytes:-6212345856
mem_fragmentation_ratio:0.78
mem_fragmentation_bytes:-3673314504
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:122976
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:204323
rdb_bgsave_in_progress:0
rdb_last_save_time:1634414539
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:226
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:0
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
# Stats
total_connections_received:21
total_commands_processed:25190490
instantaneous_ops_per_sec:446
total_net_input_bytes:87120470809
total_net_output_bytes:9803387365
instantaneous_input_kbps:769.47
instantaneous_output_kbps:2.17
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:45214
expired_stale_perc:0.55
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:4142
evicted_keys:0
keyspace_hits:439060
keyspace_misses:74292
pubsub_channels:1
pubsub_patterns:0
latest_fork_usec:1672717
total_forks:100
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:33
dump_payload_sanitizations:0
total_reads_processed:24860279
total_writes_processed:22233680
io_threaded_reads_processed:0
io_threaded_writes_processed:0
# Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:148084b54627a7904663bacff598f3a41455676a
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:20.781250
used_cpu_user:39.546875
used_cpu_sys_children:0.000000
used_cpu_user_children:0.000000
used_cpu_sys_main_thread:20.781250
used_cpu_user_main_thread:39.546875
# Modules
# Commandstats
cmdstat_config:calls=31,usec=327,usec_per_call=10.55,rejected_calls=0,failed_calls=0
cmdstat_setex:calls=24570910,usec=252413332,usec_per_call=10.27,rejected_calls=0,failed_calls=0
cmdstat_exists:calls=289096,usec=1275125,usec_per_call=4.41,rejected_calls=0,failed_calls=0
cmdstat_info:calls=1750,usec=53563,usec_per_call=30.61,rejected_calls=0,failed_calls=0
cmdstat_unlink:calls=13744,usec=93923,usec_per_call=6.83,rejected_calls=0,failed_calls=0
cmdstat_echo:calls=6,usec=5,usec_per_call=0.83,rejected_calls=13,failed_calls=0
cmdstat_expire:calls=7591,usec=42253,usec_per_call=5.57,rejected_calls=0,failed_calls=0
cmdstat_pttl:calls=8,usec=37,usec_per_call=4.62,rejected_calls=0,failed_calls=0
cmdstat_subscribe:calls=10,usec=20,usec_per_call=2.00,rejected_calls=0,failed_calls=0
cmdstat_zrank:calls=9,usec=39,usec_per_call=4.33,rejected_calls=0,failed_calls=0
cmdstat_pfcount:calls=27,usec=158,usec_per_call=5.85,rejected_calls=0,failed_calls=0
cmdstat_psetex:calls=5,usec=64,usec_per_call=12.80,rejected_calls=0,failed_calls=0
cmdstat_zscore:calls=18,usec=202,usec_per_call=11.22,rejected_calls=0,failed_calls=0
cmdstat_set:calls=124,usec=988,usec_per_call=7.97,rejected_calls=0,failed_calls=0
cmdstat_incr:calls=81176,usec=719041,usec_per_call=8.86,rejected_calls=0,failed_calls=0
cmdstat_client:calls=20,usec=35,usec_per_call=1.75,rejected_calls=0,failed_calls=0
cmdstat_cluster:calls=4,usec=24,usec_per_call=6.00,rejected_calls=6,failed_calls=4
cmdstat_zrevrange:calls=9,usec=224,usec_per_call=24.89,rejected_calls=0,failed_calls=0
cmdstat_command:calls=1,usec=496,usec_per_call=496.00,rejected_calls=0,failed_calls=0
cmdstat_ping:calls=1739,usec=8090,usec_per_call=4.65,rejected_calls=0,failed_calls=0
cmdstat_get:calls=224203,usec=7055525,usec_per_call=31.47,rejected_calls=0,failed_calls=0
cmdstat_zcard:calls=9,usec=23,usec_per_call=2.56,rejected_calls=0,failed_calls=0
# Errorstats
errorstat_ERR:count=14
errorstat_LOADING:count=19
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=2556389,expires=2556124,avg_ttl=5722189146
------ CLIENT LIST OUTPUT ------
id=18 addr=127.0.0.1:55395 laddr=127.0.0.1:6379 fd=9 name=89FE2A3 age=52484 idle=18 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=-1
id=19 addr=127.0.0.1:55418 laddr=127.0.0.1:6379 fd=10 name=89FE2A3 age=52469 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=5 oll=0 omem=0 tot-mem=61480 events=r cmd=setex user=default redir=-1
id=20 addr=127.0.0.1:55419 laddr=127.0.0.1:6379 fd=11 name=89FE2A3 age=52469 idle=7 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=-1
id=23 addr=127.0.0.1:57165 laddr=127.0.0.1:6379 fd=8 name=89FE2A3 age=829 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=unlink user=default redir=-1
------ MODULES INFO OUTPUT ------
------ FAST MEMORY TEST ------
*** Preparing to test memory region 7f83e8c00000 (22928162816 bytes)
*** Preparing to test memory region 7f893f770000 (654311424 bytes)
*** Preparing to test memory region 7f8966771000 (8388608 bytes)
*** Preparing to test memory region 7f8966f81000 (8388608 bytes)
*** Preparing to test memory region 7f8967791000 (8388608 bytes)
*** Preparing to test memory region 7f8967fa1000 (8388608 bytes)
*** Preparing to test memory region 7f8968c00000 (8388608 bytes)
*** Preparing to test memory region 7f896951a000 (4096 bytes)
*** Preparing to test memory region 7f8969520000 (16384 bytes)
*** Preparing to test memory region 7f8969530000 (8192 bytes)
*** Preparing to test memory region 7f8969562000 (4096 bytes)
*** Preparing to test memory region 7f8969570000 (8192 bytes)
*** Preparing to test memory region 7f8969699000 (20480 bytes)
*** Preparing to test memory region 7f89696c0000 (4096 bytes)
*** Preparing to test memory region 7f89696f6000 (4096 bytes)
*** Preparing to test memory region 7f896970a000 (4096 bytes)
*** Preparing to test memory region 7f89698fb000 (12288 bytes)
*** Preparing to test memory region 7f89698fe000 (16384 bytes)
*** Preparing to test memory region 7f8969910000 (8192 bytes)
*** Preparing to test memory region 7f896993b000 (4096 bytes)
*** Preparing to test memory region 7f896993c000 (16384 bytes)
*** Preparing to test memory region 7f8969c10000 (8192 bytes)
*** Preparing to test memory region 7f8969c12000 (16384 bytes)
*** Preparing to test memory region 7f8969caf000 (16384 bytes)
*** Preparing to test memory region 7f8969d6b000 (4096 bytes)
*** Preparing to test memory region 7f8969d6c000 (4096 bytes)
*** Preparing to test memory region 7f8969d75000 (4096 bytes)
*** Preparing to test memory region 7f8969ecf000 (4096 bytes)
*** Preparing to test memory region 7f8969efe000 (4096 bytes)
*** Preparing to test memory region 7f8969eff000 (12288 bytes)
*** Preparing to test memory region 7f896a114000 (49152 bytes)
*** Preparing to test memory region 7f896a120000 (2281472 bytes)
*** Preparing to test memory region 7fffd23d3000 (270336 bytes)
.5055:M 16 Oct 2021 20:10:50.879 # Background saving terminated by signal 11
5055:M 16 Oct 2021 20:10:51.309 * 10 changes in 300 seconds. Saving...
5055:M 16 Oct 2021 20:10:53.046 * Background saving started by pid 5187
5187:C 16 Oct 2021 20:15:11.478 * DB saved on disk
5055:M 16 Oct 2021 20:15:12.500 * Background saving terminated with success
Comment From: paviad
Not sure if related, but I am also occasionally seeing lines such as this in the log:
5055:M 16 Oct 2021 18:44:38.778 # Can't save in background: fork: Invalid argument
But these lines are not accompanied by crashes
Comment From: oranagra
I know that when an instance crashes in bgsave the instance keeps running. I'm asking if the next time that instance does bgsave, it crashes again, or does it succeed?
Comment From: paviad
It does succeed, redis apparently recovers from these crashes.
Here's another dump that just happened, at a different EIP:
=== REDIS BUG REPORT START: Cut & paste starting from here ===
5193:C 16 Oct 2021 21:30:33.971 # Redis 6.2.6 crashed by signal: 11, si_code: 2
5193:C 16 Oct 2021 21:30:33.972 # Accessing address: 0x7f858b630470
5193:C 16 Oct 2021 21:30:33.972 # Crashed running the instruction at: 0x7f896a075ae1
------ STACK TRACE ------
EIP:
redis-rdb-bgsave 127.0.0.1:6379(je_extent_heap_insert+0x31)[0x7f896a075ae1]
Backtrace:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7f89699323c0]
redis-rdb-bgsave 127.0.0.1:6379(je_extent_heap_insert+0x31)[0x7f896a075ae1]
redis-rdb-bgsave 127.0.0.1:6379(+0x1375f9)[0x7f896a0455f9]
redis-rdb-bgsave 127.0.0.1:6379(+0x139370)[0x7f896a047370]
redis-rdb-bgsave 127.0.0.1:6379(je_tcache_bin_flush_small+0x535)[0x7f896a09c6d5]
redis-rdb-bgsave 127.0.0.1:6379(je_tcache_event_hard+0x4e)[0x7f896a09dc8e]
redis-rdb-bgsave 127.0.0.1:6379(je_malloc+0x2d5)[0x7f896a03d9d5]
redis-rdb-bgsave 127.0.0.1:6379(ztrymalloc_usable+0x1f)[0x7f8969f6854f]
redis-rdb-bgsave 127.0.0.1:6379(zmalloc+0x13)[0x7f8969f68593]
redis-rdb-bgsave 127.0.0.1:6379(+0x7db9e)[0x7f8969f8bb9e]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveRawString+0x105)[0x7f8969f8bfe5]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveObject+0x1cd)[0x7f8969f8cffd]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveKeyValuePair+0x89)[0x7f8969f8d759]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveRio+0x248)[0x7f8969f8ded8]
redis-rdb-bgsave 127.0.0.1:6379(rdbSave+0xc3)[0x7f8969f90933]
redis-rdb-bgsave 127.0.0.1:6379(rdbSaveBackground+0xc3)[0x7f8969f90c53]
redis-rdb-bgsave 127.0.0.1:6379(serverCron+0x5e0)[0x7f8969f64ed0]
redis-rdb-bgsave 127.0.0.1:6379(aeProcessEvents+0x129)[0x7f8969f56cb9]
redis-rdb-bgsave 127.0.0.1:6379(aeMain+0x1d)[0x7f8969f570ed]
redis-rdb-bgsave 127.0.0.1:6379(main+0x337)[0x7f8969f534b7]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f89697370b3]
redis-rdb-bgsave 127.0.0.1:6379(_start+0x2e)[0x7f8969f539fe]
------ REGISTERS ------
5193:C 16 Oct 2021 21:30:34.374 #
RAX:00007f858b630440 RBX:00007f8969205700
RCX:0000000000006001 RDX:0000000000000000
RDI:00007f8969205770 RSI:00007f862d5a42c0
RBP:00007f8969208540 RSP:00007fffda6bea58
R8 :0000000000000018 R9 :00007f896a0cf420
R10:000000000434f000 R11:0000000000000001
R12:00007f8969208540 R13:0000000000000d68
R14:00000000000001ad R15:00007f8969205700
RIP:00007f896a075ae1 EFL:0000000000010246
CSGSFS:00000053002b0033
5193:C 16 Oct 2021 21:30:34.374 # (00007fffda6bea67) -> 000000e669202638
5193:C 16 Oct 2021 21:30:34.374 # (00007fffda6bea66) -> 00007f8969208540
5193:C 16 Oct 2021 21:30:34.375 # (00007fffda6bea65) -> 00007f862d5a42c0
5193:C 16 Oct 2021 21:30:34.375 # (00007fffda6bea64) -> 00007f896a09c6d5
5193:C 16 Oct 2021 21:30:34.375 # (00007fffda6bea63) -> 00007fffda6beac0
5193:C 16 Oct 2021 21:30:34.376 # (00007fffda6bea62) -> 00007f89692009c0
5193:C 16 Oct 2021 21:30:34.376 # (00007fffda6bea61) -> 00007f8969520ab8
5193:C 16 Oct 2021 21:30:34.376 # (00007fffda6bea60) -> 0000000000000000
5193:C 16 Oct 2021 21:30:34.376 # (00007fffda6bea5f) -> 00007fffda6becf0
5193:C 16 Oct 2021 21:30:34.377 # (00007fffda6bea5e) -> 0000000000000010
5193:C 16 Oct 2021 21:30:34.377 # (00007fffda6bea5d) -> 00001000da6bead0
5193:C 16 Oct 2021 21:30:34.377 # (00007fffda6bea5c) -> 00007f896a047370
5193:C 16 Oct 2021 21:30:34.377 # (00007fffda6bea5b) -> 00007f89692009c0
5193:C 16 Oct 2021 21:30:34.378 # (00007fffda6bea5a) -> 000000000000000d
5193:C 16 Oct 2021 21:30:34.378 # (00007fffda6bea59) -> 00007f89692009c0
5193:C 16 Oct 2021 21:30:34.378 # (00007fffda6bea58) -> 00007f896a0455f9
------ INFO OUTPUT ------
# Server
redis_version:6.2.6
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:e15ab29abece34d
redis_mode:standalone
os:Linux 4.4.0-17763-Microsoft x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:c11-builtin
gcc_version:9.3.0
process_id:5193
process_supervised:no
run_id:dcdffab5dbf1d5cd919405b5c36c46848c583285
tcp_port:6379
server_time_usec:1634419829054708
uptime_in_seconds:57559
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:7029877
executable:/mnt/d/redis/redis-server
config_file:/mnt/d/redis/./redis.conf
io_threads_active:0
# Clients
connected_clients:4
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:81960
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
# Memory
used_memory:16431026512
used_memory_human:15.30G
used_memory_rss:12620914688
used_memory_rss_human:11.75G
used_memory_peak:16591241368
used_memory_peak_human:15.45G
used_memory_peak_perc:99.03%
used_memory_overhead:231683272
used_memory_startup:810024
used_memory_dataset:16199343240
used_memory_dataset_perc:98.59%
allocator_allocated:16431327160
allocator_active:18808102912
allocator_resident:18973483008
total_system_memory:68541775872
total_system_memory_human:63.83G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.14
allocator_frag_bytes:2376775752
allocator_rss_ratio:1.01
allocator_rss_bytes:165380096
rss_overhead_ratio:0.67
rss_overhead_bytes:-6352568320
mem_fragmentation_ratio:0.77
mem_fragmentation_bytes:-3810110016
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:123000
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:1118676
rdb_bgsave_in_progress:0
rdb_last_save_time:1634419528
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:2323
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:0
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
# Stats
total_connections_received:21
total_commands_processed:27346677
instantaneous_ops_per_sec:89
total_net_input_bytes:94897109892
total_net_output_bytes:11052171097
instantaneous_input_kbps:423.47
instantaneous_output_kbps:44.16
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:49849
expired_stale_perc:0.24
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:4861
evicted_keys:0
keyspace_hits:491827
keyspace_misses:81891
pubsub_channels:1
pubsub_patterns:0
latest_fork_usec:1167978
total_forks:107
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:33
dump_payload_sanitizations:0
total_reads_processed:26887319
total_writes_processed:24021352
io_threaded_reads_processed:0
io_threaded_writes_processed:0
# Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:148084b54627a7904663bacff598f3a41455676a
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:1.078125
used_cpu_user:0.203125
used_cpu_sys_children:0.000000
used_cpu_user_children:0.000000
used_cpu_sys_main_thread:1.078125
used_cpu_user_main_thread:0.203125
# Modules
# Commandstats
cmdstat_config:calls=31,usec=327,usec_per_call=10.55,rejected_calls=0,failed_calls=0
cmdstat_setex:calls=26655065,usec=280693224,usec_per_call=10.53,rejected_calls=0,failed_calls=0
cmdstat_exists:calls=322645,usec=1425910,usec_per_call=4.42,rejected_calls=0,failed_calls=0
cmdstat_info:calls=1914,usec=57975,usec_per_call=30.29,rejected_calls=0,failed_calls=0
cmdstat_unlink:calls=14987,usec=103452,usec_per_call=6.90,rejected_calls=0,failed_calls=0
cmdstat_echo:calls=6,usec=5,usec_per_call=0.83,rejected_calls=13,failed_calls=0
cmdstat_expire:calls=8527,usec=48692,usec_per_call=5.71,rejected_calls=0,failed_calls=0
cmdstat_pttl:calls=8,usec=37,usec_per_call=4.62,rejected_calls=0,failed_calls=0
cmdstat_subscribe:calls=10,usec=20,usec_per_call=2.00,rejected_calls=0,failed_calls=0
cmdstat_zrank:calls=9,usec=39,usec_per_call=4.33,rejected_calls=0,failed_calls=0
cmdstat_pfcount:calls=27,usec=158,usec_per_call=5.85,rejected_calls=0,failed_calls=0
cmdstat_psetex:calls=5,usec=64,usec_per_call=12.80,rejected_calls=0,failed_calls=0
cmdstat_zscore:calls=18,usec=202,usec_per_call=11.22,rejected_calls=0,failed_calls=0
cmdstat_set:calls=137,usec=1103,usec_per_call=8.05,rejected_calls=0,failed_calls=0
cmdstat_incr:calls=90322,usec=786877,usec_per_call=8.71,rejected_calls=0,failed_calls=0
cmdstat_client:calls=20,usec=35,usec_per_call=1.75,rejected_calls=0,failed_calls=0
cmdstat_cluster:calls=4,usec=24,usec_per_call=6.00,rejected_calls=6,failed_calls=4
cmdstat_zrevrange:calls=9,usec=224,usec_per_call=24.89,rejected_calls=0,failed_calls=0
cmdstat_command:calls=1,usec=496,usec_per_call=496.00,rejected_calls=0,failed_calls=0
cmdstat_ping:calls=1903,usec=8413,usec_per_call=4.42,rejected_calls=0,failed_calls=0
cmdstat_get:calls=251020,usec=8139769,usec_per_call=32.43,rejected_calls=0,failed_calls=0
cmdstat_zcard:calls=9,usec=23,usec_per_call=2.56,rejected_calls=0,failed_calls=0
# Errorstats
errorstat_ERR:count=14
errorstat_LOADING:count=19
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=2556996,expires=2556731,avg_ttl=5694183420
------ CLIENT LIST OUTPUT ------
id=18 addr=127.0.0.1:55395 laddr=127.0.0.1:6379 fd=9 name=89FE2A3 age=57473 idle=41 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=-1
id=19 addr=127.0.0.1:55418 laddr=127.0.0.1:6379 fd=10 name=89FE2A3 age=57458 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61480 events=r cmd=setex user=default redir=-1
id=20 addr=127.0.0.1:55419 laddr=127.0.0.1:6379 fd=11 name=89FE2A3 age=57458 idle=46 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=ping user=default redir=-1
id=23 addr=127.0.0.1:57165 laddr=127.0.0.1:6379 fd=8 name=89FE2A3 age=5818 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20528 events=r cmd=setex user=default redir=-1
------ MODULES INFO OUTPUT ------
------ FAST MEMORY TEST ------
*** Preparing to test memory region 7f83e4000000 (135168 bytes)
*** Preparing to test memory region 7f83e81f1000 (8388608 bytes)
*** Preparing to test memory region 7f83e8a00000 (22930259968 bytes)
*** Preparing to test memory region 7f893f770000 (654311424 bytes)
*** Preparing to test memory region 7f8966771000 (8388608 bytes)
*** Preparing to test memory region 7f8966f81000 (8388608 bytes)
*** Preparing to test memory region 7f8967791000 (8388608 bytes)
*** Preparing to test memory region 7f8967fa1000 (8388608 bytes)
*** Preparing to test memory region 7f8968c00000 (8388608 bytes)
*** Preparing to test memory region 7f896951a000 (4096 bytes)
*** Preparing to test memory region 7f8969520000 (16384 bytes)
*** Preparing to test memory region 7f8969530000 (8192 bytes)
*** Preparing to test memory region 7f8969562000 (4096 bytes)
*** Preparing to test memory region 7f8969570000 (8192 bytes)
*** Preparing to test memory region 7f8969699000 (20480 bytes)
*** Preparing to test memory region 7f89696c0000 (4096 bytes)
*** Preparing to test memory region 7f89696f6000 (4096 bytes)
*** Preparing to test memory region 7f896970a000 (4096 bytes)
*** Preparing to test memory region 7f89698fb000 (12288 bytes)
*** Preparing to test memory region 7f89698fe000 (16384 bytes)
*** Preparing to test memory region 7f8969910000 (8192 bytes)
*** Preparing to test memory region 7f896993b000 (4096 bytes)
*** Preparing to test memory region 7f896993c000 (16384 bytes)
*** Preparing to test memory region 7f8969c10000 (8192 bytes)
*** Preparing to test memory region 7f8969c12000 (16384 bytes)
*** Preparing to test memory region 7f8969caf000 (16384 bytes)
*** Preparing to test memory region 7f8969d6b000 (4096 bytes)
*** Preparing to test memory region 7f8969d6c000 (4096 bytes)
*** Preparing to test memory region 7f8969d75000 (4096 bytes)
*** Preparing to test memory region 7f8969ecf000 (4096 bytes)
*** Preparing to test memory region 7f8969efe000 (4096 bytes)
*** Preparing to test memory region 7f8969eff000 (12288 bytes)
*** Preparing to test memory region 7f896a114000 (49152 bytes)
*** Preparing to test memory region 7f896a120000 (2281472 bytes)
*** Preparing to test memory region 7fffd23d3000 (270336 bytes)
.O.O.5055:M 16 Oct 2021 21:30:49.401 # Background saving terminated by signal 11
5055:M 16 Oct 2021 21:30:49.521 * 10 changes in 300 seconds. Saving...
5055:M 16 Oct 2021 21:30:50.337 # Can't save in background: fork: Invalid argument
5055:M 16 Oct 2021 21:30:55.048 * 10 changes in 300 seconds. Saving...
5055:M 16 Oct 2021 21:30:56.644 * Background saving started by pid 5194
5194:C 16 Oct 2021 21:46:16.377 * DB saved on disk
Comment From: oranagra
There seems to be some kind of heap corruption that's causing all of these. It's very odd to me that you say it was running ok for years with 5.0.7, and started happening when you changed hardware. I would have suggested that maybe the old HW had ECC and the one doesn't, but the high rate of errors doesn't seem likely to me to be caused by such issues to me.
The other possible explanation of heap corruption is.... heap corruption (or other type of data corruption). i.e. even if the code is bug free, if the data (specifically ziplists) has some corruption it can cause redis to corrupt it's own memory. so maybe the data experienced some kind of corruption on the upgrade?
bottom line i think the best chance we have to figure this out is understand what's the difference between the time all was good and the time when these issues started.
Comment From: paviad
I will do the following:
1. I will run a process that reads all keys and writes them again to another instance
2. I will let redis-server --test-memory run for about 2 hours
I will report here with the results. Regarding the fork message - is that anything to worry about?
Comment From: paviad
Ok some results - the memory test was run on 40GB and completed 50 passes without issue
The fresh store is now running about 21 hours, it is half the size of the original (so there was a lot of dead weight there) - and for the first 20 hours it ran without issue, then these started appearing:
111:M 19 Oct 2021 03:42:01.391 # Can't save in background: fork: Invalid argument
111:M 19 Oct 2021 03:56:08.413 # Can't save in background: fork: Invalid argument
111:M 19 Oct 2021 04:25:20.496 # Can't save in background: fork: Cannot allocate memory
Redis recovered from them without a crash or a dump in the log, and does manage to save again after these.
Coincidentally, these happened very close to when the new RDB passed 2GB of size.
This is the memory utilization picture on the node around the time of the last line:
Comment From: oranagra
i see the fork man doesn't list EINVAL as a possible outcome, so maybe we managed to corrupt errno before printing it.
can you please check if your overcommit setting is enabled in the kernel?
cat /proc/sys/vm/overcommit_memory
in any case these don't seem to be related in any way to the crashes and memory corruptions you had.
Comment From: paviad
Overcommit is not enabled... should it be enabled?
Comment From: oranagra
yes. certainly!
without that, as soon as redis eats 50% of your RAM, you're not able to do fork (which uses s CopyOnWrite mechanism, not really allocating more physical RAM).
see the warning you get in the log at startup.
Comment From: paviad
I enabled it just now using sysctl vm.overcommit_memory=1 and added it to the redis startup script (adding it to sysctl.conf has no effect in WSL on windows)
Comment From: pnthai88
These are 3 commands i added before start redis server. You should try it
sysctl vm.overcommit_memory=1
sysctl vm.max_map_count=262144
echo never > /sys/kernel/mm/transparent_hugepage/enabled
Tried with 8 millions keys with no problem
Comment From: paviad
sysctl vm.max_map_count=262144echo never > /sys/kernel/mm/transparent_hugepage/enabled
These 2 don't exist on WSL 1 on Windows tho, just a heads up for whoever is watching
Comment From: paviad
I just got this:
111:M 19 Oct 2021 18:08:28.029 # Background saving terminated by signal 11
No additional dump or messages, and it recovered and continued saving successfully later.
Comment From: oranagra
That's it? no crash log with stack trace?
Comment From: paviad
Here's the context:
111:M 19 Oct 2021 18:01:29.043 * 10 changes in 300 seconds. Saving...
111:M 19 Oct 2021 18:01:29.984 * Background saving started by pid 578
578:C 19 Oct 2021 18:03:07.129 * DB saved on disk
111:M 19 Oct 2021 18:03:07.712 * Background saving terminated with success
111:M 19 Oct 2021 18:08:08.058 * 10 changes in 300 seconds. Saving...
111:M 19 Oct 2021 18:08:08.922 * Background saving started by pid 579
111:M 19 Oct 2021 18:08:28.029 # Background saving terminated by signal 11
111:M 19 Oct 2021 18:08:28.135 * 10 changes in 300 seconds. Saving...
111:M 19 Oct 2021 18:08:28.925 * Background saving started by pid 581
581:C 19 Oct 2021 18:10:11.807 * DB saved on disk
111:M 19 Oct 2021 18:10:12.337 * Background saving terminated with success
111:M 19 Oct 2021 18:15:13.030 * 10 changes in 300 seconds. Saving...
111:M 19 Oct 2021 18:15:13.900 * Background saving started by pid 582
582:C 19 Oct 2021 18:16:55.535 * DB saved on disk
111:M 19 Oct 2021 18:16:56.029 * Background saving terminated with success
Comment From: oranagra
Ohh, that's what you wrote.. (didn't notice the bottom part). Well I have no explanation on how that could happen. That message is printed by the parent process when it detects the child exited. If the child would be killed by signal 9, then this is understandable, but for signal 11 the signal handler should have caught it, same as it did in your opening post.
Comment From: paviad
I think I was careless with my memory management, I just noticed that w3wp.exe was taking up quite a lot of RAM and the physical memory utilization hit 100%, at that point I got a few "Invalid argument" errors. I changed the RAM limit of that process and will keep you posted.
Comment From: paviad
No problems in past 18 hours.
I think we can identify two problems that had a part in this issue:
- The RDB was corrupt in some way and might have caused Redis to corrupt memory when unzipping - clearing the RDB and repopulating it from scratch resolved all the hard crashes.
- Memory consumption on the machine as a whole was not managed properly. Need to make sure
vm.overcommit_memory=1as well as to make sure memory consumption does not top out at 100% on the machine while Redis is trying to save in the background (or at all for that matter)
UPDATE 16 more hours and still absolutely no issues - I think we can close this.
Comment From: madolson
Sounds good!