Discovered during investigation of issue 13205
Crash report
=== REDIS BUG REPORT START: Cut & paste starting from here ===
19766:M 29 May 2024 14:29:44.134 # Redis 7.2.3 crashed by signal: 11, si_code: 1
19766:M 29 May 2024 14:29:44.134 # Accessing address: 0x48
19766:M 29 May 2024 14:29:44.134 # Crashed running the instruction at: 0x48a9bc
------ STACK TRACE ------
EIP:
redis-server *:63791 [cluster](defragLaterStep+0x4c)[0x48a9bc]
Backtrace:
/lib64/libpthread.so.0(+0x12cf0)[0x7f11c991fcf0]
redis-server *:63791 [cluster](defragLaterStep+0x4c)[0x48a9bc]
redis-server *:63791 [cluster](activeDefragCycle+0x3b4)[0x48b0f4]
redis-server *:63791 [cluster](databasesCron+0x6c)[0x5668ec]
redis-server *:63791 [cluster](serverCron+0x64a)[0x5691da]
redis-server *:63791 [cluster][0x56248d]
redis-server *:63791 [cluster](aeMain+0x1d8)[0x563c58]
redis-server *:63791 [cluster](main+0x39a)[0x450d4a]
/lib64/libc.so.6(__libc_start_main+0xe5)[0x7f11c9582d85]
redis-server *:63791 [cluster](_start+0x2e)[0x45147e]
------ REGISTERS ------
19766:M 29 May 2024 14:29:44.135 #
RAX:0000000000000000 RBX:0000000000000000
RCX:0000000000000000 RDX:0000000000000000
RDI:0000000000000000 RSI:0000000000000000
RBP:0000000000000000 RSP:00007ffe5c2adab0
R8 :000000000036c9c2 R9 :00007ffe5c34a080
R10:00007ffe5c2adb30 R11:0000000000000002
R12:00000000000002c0 R13:0006199bef354748
R14:0000000000000000 R15:0000000000000000
RIP:000000000048a9bc EFL:0000000000010246
CSGSFS:002b000000000033
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adabf) -> 0000000000000000
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adabe) -> 0006199bef354748
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adabd) -> 0000000000000000
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adabc) -> 0000000000000014
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adabb) -> 0000000000020d20
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adaba) -> 0000000066577418
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adab9) -> 0000000000481daf
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adab8) -> 00000000000002c0
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adab7) -> 0000000000000000
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adab6) -> 000000050a98c3d4
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adab5) -> 0000000000054e20
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adab4) -> 000000050a98c3d4
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adab3) -> 00000000004809c0
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adab2) -> 00000000004808d0
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adab1) -> 0000000000000001
19766:M 29 May 2024 14:29:44.135 # (00007ffe5c2adab0) -> 0000000000000000
...
------ DUMPING CODE AROUND EIP ------
Symbol: defragLaterStep (base: 0x48a970)
Module: redis-server *:63791 [cluster] (base 0x400000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x48a970 -D -b binary -m i386:x86-64 /tmp/dump.bin
------
19766:M 29 May 2024 14:29:44.272 # dump of function (hexdump of 204 bytes):
41574531ff41564989fe41554989f5415455534883ec68488b2d92734600f30f7e0d127c19004c8b25a3734600488b15844c46000f160d657b19000f294c2410488b35794c46004885d2755a498b7e48488b074885f6742f483970100f85510300004889c6e8d6870d00498b464848c7053f4c46000000000048c7053c4c460000000000488b004885c00f841c03000048c7051d4c460000000000488b70104889351a4c4600498b3ee852d00d004889c3488b05f87246004889442420e9b500000066662e0f1f8400000000
Function at 0x5631b0 is listDelNode
Function at 0x567a70 is dictFind
Additional information
- Running in cluster mode
- Loaded with simple string keys with random (short) TTLs
- While defrag is running, toggle it: CONFIG SET activedefrag no; CONFIG SET activedefrag yes
The fix is to reset expires_counter in the "disabled mid-run" if block of activeDefragCycle() in defrag.c, so that the next run starts from a clean state.
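To illustrate the failure pattern, here is a minimal toy model in C. It is not the Redis source: toy_db, toy_cycle, toy_cursor, toy_expires_state and toggle_mid_run are hypothetical names, and the control flow is an assumption about how a cycle with static iteration state could end up with a NULL db after being disabled mid-run. That reading is at least consistent with the log above (fault address 0x48, RDI and R14 both zero). toy_expires_state stands in for the state this report says must be reset.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a database; NOT the Redis redisDb struct. */
typedef struct { int id; } toy_db;

static toy_db toy_dbs[1] = { {0} };

/* Iteration state that survives between calls, like statics in a cron-driven cycle. */
static toy_db *toy_current = NULL;
static unsigned long toy_cursor = 0;
static unsigned long toy_expires_state = 0; /* stands in for the state named in the report */
static bool toy_running = false;

typedef enum { CYCLE_IDLE, CYCLE_OK, CYCLE_NULL_DB } cycle_result;

/* One invocation of the toy cycle. apply_fix selects whether the
 * "disabled mid-run" block also resets toy_expires_state. */
cycle_result toy_cycle(bool enabled, bool apply_fix) {
    if (!enabled) {
        if (toy_running) {
            /* Disabled mid-run: start fresh next time. */
            toy_running = false;
            toy_current = NULL;
            toy_cursor = 0;
            if (apply_fix) toy_expires_state = 0; /* the reset that was missing */
        }
        return CYCLE_IDLE;
    }
    toy_running = true;

    /* A db is only selected when no scan is considered in progress. */
    if (!toy_cursor && !toy_expires_state) toy_current = &toy_dbs[0];

    /* Real code would dereference the pointer here; the toy reports it instead. */
    if (toy_current == NULL) return CYCLE_NULL_DB;

    /* Pretend the main scan finished while the expires scan is still mid-way. */
    toy_cursor = 0;
    toy_expires_state = 7;
    return CYCLE_OK;
}

/* Run one cycle, disable mid-run, re-enable, and return the last result. */
cycle_result toggle_mid_run(bool apply_fix) {
    toy_current = NULL;
    toy_cursor = 0;
    toy_expires_state = 0;
    toy_running = false;

    toy_cycle(true, apply_fix);   /* scan starts, leaves expires state non-zero */
    toy_cycle(false, apply_fix);  /* disabled mid-run */
    return toy_cycle(true, apply_fix); /* re-enabled */
}
```

In this model, toggle_mid_run(false) ends with CYCLE_NULL_DB because the stale state makes the cycle skip db selection, while toggle_mid_run(true) ends with CYCLE_OK.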
Comment From: stevelipinski
Fix: https://github.com/stevelipinski/redis/commit/ecb7cd826022f5177f47474509c8ca23746d889c
Comment From: sundb
@stevelipinski thanks, you can create a PR instead of an issue to fix it.
Comment From: sundb
Fixed via #13315