=== REDIS BUG REPORT START: Cut & paste starting from here ===
1846:M 10 Jan 02:25:56.897 # === ASSERTION FAILED ===
1846:M 10 Jan 02:25:56.897 # ==> ziplist.c:411 'NULL' is not true
1846:M 10 Jan 02:25:56.897 # (forcing SIGSEGV to print the bug report.)
1846:M 10 Jan 02:25:56.897 # Redis 3.2.6 crashed by signal: 11
1846:M 10 Jan 02:25:56.897 # Crashed running the instuction at: 0x45cb1a
1846:M 10 Jan 02:25:56.897 # Accessing address: 0xffffffffffffffff
1846:M 10 Jan 02:25:56.897 # Failed assertion: NULL (ziplist.c:411)
------ STACK TRACE ------
EIP:
/usr/local/bin/redis-server *:6502 [cluster](_serverAssert+0x6a)[0x45cb1a]
Backtrace:
/usr/local/bin/redis-server *:6502 [cluster](logStackTrace+0x29)[0x45e7c9]
/usr/local/bin/redis-server *:6502 [cluster](sigsegvHandler+0xac)[0x45eecc]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f78d6c29330]
/usr/local/bin/redis-server *:6502 [cluster](_serverAssert+0x6a)[0x45cb1a]
/usr/local/bin/redis-server *:6502 [cluster][0x43183f]
/usr/local/bin/redis-server *:6502 [cluster](ziplistFind+0x24b)[0x4328db]
/usr/local/bin/redis-server *:6502 [cluster](hashTypeGetFromZiplist+0x90)[0x44ff40]
/usr/local/bin/redis-server *:6502 [cluster][0x45010c]
/usr/local/bin/redis-server *:6502 [cluster](hmgetCommand+0x72)[0x4515a2]
/usr/local/bin/redis-server *:6502 [cluster](call+0x85)[0x4270b5]
/usr/local/bin/redis-server *:6502 [cluster](processCommand+0x367)[0x42a1e7]
/usr/local/bin/redis-server *:6502 [cluster](processInputBuffer+0x105)[0x436e15]
/usr/local/bin/redis-server *:6502 [cluster](aeProcessEvents+0x218)[0x421488]
/usr/local/bin/redis-server *:6502 [cluster](aeMain+0x2b)[0x42173b]
/usr/local/bin/redis-server *:6502 [cluster](main+0x410)[0x41e6f0]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f78d6872f45]
/usr/local/bin/redis-server *:6502 [cluster][0x41e962]
------ INFO OUTPUT ------
# Server
redis_version:3.2.6
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:3786d918480e6d9c
redis_mode:cluster
os:Linux 4.4.0-34-generic x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.4
process_id:1846
run_id:9106514b69f19ec8a489b16fbc7c2a79a4a10caa
tcp_port:6502
uptime_in_seconds:651744
uptime_in_days:7
hz:10
lru_clock:7638148
executable:/usr/local/bin/redis-server
config_file:/etc/redis/redis.conf
# Clients
connected_clients:3
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
# Memory
used_memory:17997969880
used_memory_human:16.76G
used_memory_rss:19928453120
used_memory_rss_human:18.56G
used_memory_peak:18002086784
used_memory_peak_human:16.77G
total_system_memory:25282265088
total_system_memory_human:23.55G
used_memory_lua:37888
used_memory_lua_human:37.00K
maxmemory:18000000000
maxmemory_human:16.76G
maxmemory_policy:allkeys-lru
mem_fragmentation_ratio:1.11
mem_allocator:jemalloc-4.0.3
# Persistence
loading:0
rdb_changes_since_last_save:86471896
rdb_bgsave_in_progress:0
rdb_last_save_time:1483625502
rdb_last_bgsave_status:err
rdb_last_bgsave_time_sec:83
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
# Stats
total_connections_received:3
total_commands_processed:5255265
instantaneous_ops_per_sec:141
total_net_input_bytes:38725434125
total_net_output_bytes:133706554703
instantaneous_input_kbps:256.12
instantaneous_output_kbps:3990.56
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
evicted_keys:157814
keyspace_hits:2994957
keyspace_misses:187616
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
migrate_cached_sockets:0
# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:11699.52
used_cpu_user:23734.44
used_cpu_sys_children:14845.52
used_cpu_user_children:46542.33
# Commandstats
cmdstat_hmset:calls=1965461,usec=1227848193,usec_per_call=624.71
cmdstat_hmget:calls=3182572,usec=3912851526,usec_per_call=1229.46
cmdstat_ping:calls=3,usec=5,usec_per_call=1.67
cmdstat_info:calls=107225,usec=3398055,usec_per_call=31.69
cmdstat_config:calls=1,usec=17,usec_per_call=17.00
cmdstat_cluster:calls=1,usec=185,usec_per_call=185.00
cmdstat_client:calls=2,usec=22,usec_per_call=11.00
# Cluster
cluster_enabled:1
# Keyspace
db0:keys=35974,expires=0,avg_ttl=0
hash_init_value: 1483657279
------ CLIENT LIST OUTPUT ------
id=554 addr=10.172.2.8:49208 fd=15 name= age=40701 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=info
id=558 addr=10.70.165.164:47558 fd=12 name= age=40534 idle=40534 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
id=559 addr=10.70.165.164:47560 fd=14 name= age=40534 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=5 oll=0 omem=0 events=r cmd=hmget
------ CURRENT CLIENT INFO ------
id=559 addr=10.70.165.164:47560 fd=14 name= age=40534 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=5 oll=0 omem=0 events=r cmd=hmget
argv[0]: 'HMGET'
argv[1]: 'hs:170110|170117|s|ec|2|op_de-s_demotr-s_de'
argv[2]: '236034'
argv[3]: '485253'
argv[4]: '833800'
argv[5]: '374146'
argv[6]: '544001'
argv[7]: '872071'
argv[8]: '486157'
argv[9]: '363151'
argv[10]: '331529'
argv[11]: '331531'
argv[12]: '235916'
argv[13]: '331530'
argv[14]: '236044'
argv[15]: '236307'
argv[16]: '331541'
argv[17]: '374295'
argv[18]: '236181'
argv[19]: '236315'
argv[20]: '344733'
argv[21]: '236446'
argv[22]: '235934'
argv[23]: '830355'
argv[24]: '344615'
argv[25]: '236198'
argv[26]: '235940'
argv[27]: '236079'
argv[28]: '235949'
argv[29]: '296117'
argv[30]: '363446'
argv[31]: '344496'
argv[32]: '836025'
argv[33]: '236340'
argv[34]: '435633'
argv[35]: '235962'
argv[36]: '490686'
argv[37]: '483899'
argv[38]: '236094'
argv[39]: '376517'
argv[40]: '355911'
argv[41]: '354119'
argv[42]: '907845'
argv[43]: '490190'
argv[44]: '344655'
argv[45]: '458827'
argv[46]: '361936'
argv[47]: '235988'
argv[48]: '354654'
argv[49]: '380382'
argv[50]: '236383'
argv[51]: '363226'
argv[52]: '235996'
argv[53]: '379364'
argv[54]: '490853'
argv[55]: '236007'
argv[56]: '354656'
argv[57]: '236393'
argv[58]: '360430'
argv[59]: '331502'
argv[60]: '236398'
argv[61]: '331499'
argv[62]: '819938'
argv[63]: '780667'
argv[64]: '378742'
argv[65]: '331894'
argv[66]: '458355'
argv[67]: '378736'
argv[68]: '358387'
argv[69]: '362483'
argv[70]: '236020'
argv[71]: '331890'
argv[72]: '788341'
argv[73]: '236024'
argv[74]: '236408'
argv[75]: '312825'
1846:M 10 Jan 02:25:56.900 # key 'hs:170110|170117|s|ec|2|op_de-s_demotr-s_de' found in DB containing the following object:
1846:M 10 Jan 02:25:56.900 # Object type: 4
1846:M 10 Jan 02:25:56.900 # Object encoding: 5
1846:M 10 Jan 02:25:56.900 # Object refcount: 1
1846:M 10 Jan 02:25:56.900 # Hash size: 2764
------ REGISTERS ------
1846:M 10 Jan 02:25:56.900 #
RAX:0000000000000000 RBX:000000000000019b
RCX:00000000fbad000c RDX:0000000000000000
RDI:00007f78d6c0f760 RSI:0000000000000000
RBP:00000000004df8e8 RSP:00007fff0c191e80
R8 :00000000021d87a0 R9 :00007f78d6c0f7b8
R10:00007f78d6c0f7b8 R11:00007f78d6c0f7b0
R12:00000000004dfa97 R13:0000000000000000
R14:00007f73e2a302e4 R15:00000000000000ef
RIP:000000000045cb1a EFL:0000000000010202
CSGSFS:0000000000000033
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e8f) -> 00007fff0c191f5c
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e8e) -> 00007fff0c191f60
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e8d) -> 00007f73e2a00000
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e8c) -> 0000000a0d34372a
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e8b) -> 0000000000039a02
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e8a) -> f000000000000010
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e89) -> 00007f7459787373
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e88) -> 0000000557213a00
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e87) -> 00000000004328db
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e86) -> 0000000000000001
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e85) -> 055bec00e2a0000a
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e84) -> 00007f7457213a00
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e83) -> 000000000043183f
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e82) -> 0000000000000006
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e81) -> 0000000000000001
1846:M 10 Jan 02:25:56.900 # (00007fff0c191e80) -> 00007f73e2a302e2
------ FAST MEMORY TEST ------
1846:M 10 Jan 02:25:56.901 # Bio thread for job type #0 terminated
1846:M 10 Jan 02:25:56.901 # Bio thread for job type #1 terminated
*** Preparing to test memory region 724000 (94208 bytes)
*** Preparing to test memory region 21cb000 (135168 bytes)
*** Preparing to test memory region 7f7288600000 (27051163648 bytes)
*** Preparing to test memory region 7f78d4dff000 (8388608 bytes)
*** Preparing to test memory region 7f78d5600000 (14680064 bytes)
*** Preparing to test memory region 7f78d6400000 (4194304 bytes)
*** Preparing to test memory region 7f78d6c11000 (20480 bytes)
*** Preparing to test memory region 7f78d6e33000 (16384 bytes)
*** Preparing to test memory region 7f78d755f000 (8192 bytes)
*** Preparing to test memory region 7f78d7569000 (4096 bytes)
*** Preparing to test memory region 7f78d756a000 (4096 bytes)
*** Preparing to test memory region 7f78d756d000 (8192 bytes)
*** Preparing to test memory region 7f78d756f000 (8192 bytes)
Comment From: antirez
Hello, please can you tell me the value of the configuration parameter hash-max-ziplist-entries? Thanks.
Comment From: fgaule
Sure.
hash-max-ziplist-entries = 20000
hash-max-ziplist-value = 1024
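(For reference, in redis.conf these directives are written without the equals sign; a minimal sketch of the relevant lines, using the values above:

hash-max-ziplist-entries 20000
hash-max-ziplist-value 1024)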
Comment From: antirez
Thanks. Before going in depth with the debugging of the crash report, could you answer the following?
- Do you run in a cloud environment or on your own machines?
- If you are using your own hardware, are you using error-corrected memory modules?
- If you are using your own hardware and your RAM is not error corrected, could you run memtest86 on this box for some time?
Thanks.
Comment From: fgaule
- Do you run in a cloud environment or on your own machines? Openstack cloud environment (3 nodes).
- If you are using your own hardware, are you using error-corrected memory modules? I can find this answer with the devops team if you need.
- If you are using your own hardware and your RAM is not error corrected, could you run memtest86 on this box for some time? Not possible :(
Comment From: antirez
Could you kindly ask your Cloud vendor if they use ECC memory? I'm starting an investigation anyway, since this is the second bug report of a similar type we have received about ziplists. Even though the ziplist code has not changed significantly in later versions, it is possible that there are uncovered bugs. Thanks.
Comment From: antirez
Sorry, another question: I see that in your hashes you stored numerical keys. Are the values also numbers, or are there also strings? If they are numbers, what are their ranges? Thanks.
Comment From: fgaule
No problem. In that case, the numbers are stored as strings and their range is 200,000 to 999,000, with a maximum of 20,000 fields expected per key. (Before you think this is madness: we are benchmarking a key design idea which would match our use case.)
Feel free to ask me whatever you need ;)
Comment From: antirez
Thanks! I'm going to write a stress test for ziplists in order to try to replicate similar conditions. Normally I write ziplist stress testers targeting the Redis API; this time I'll write a C program that uses ziplist.c directly, in order to run many more operations per second and explore more states than otherwise possible. Btw, modeling problems with hashes composed of numbers is a good and very common design! Just a suggestion: once you use ziplists of, for example, 1000-3000 entries each, the key overhead is going to be small, very similar to using 20k items per ziplist. However, 20k items make operations pretty slow, so I suggest also trying a lower number of items per ziplist.
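(A minimal sketch of what such a direct ziplist.c stress test could look like, assuming the Redis 3.2 ziplist API — ziplistNew, ziplistPush, ziplistIndex, ziplistFind, ziplistDelete — and compilation inside the Redis source tree; everything beyond that API is illustrative:

/* ziplist_stress.c -- sketch of a direct ziplist.c stress test.
 * Build, roughly: gcc ziplist_stress.c ziplist.c zmalloc.c util.c ... */
#include <stdio.h>
#include <stdlib.h>
#include "ziplist.h"
#include "zmalloc.h"

int main(void) {
    srand(42);
    for (int iter = 0; iter < 1000; iter++) {
        unsigned char *zl = ziplistNew();
        char buf[32];
        int pairs = 1000 + rand() % 2000;

        /* Append field-value pairs of numeric strings, mimicking the
         * reported workload (fields in the 200000-999000 range). */
        for (int i = 0; i < pairs; i++) {
            int len = snprintf(buf, sizeof(buf), "%d", 200000 + rand() % 799000);
            zl = ziplistPush(zl, (unsigned char*)buf, len, ZIPLIST_TAIL);
            len = snprintf(buf, sizeof(buf), "%d", rand());
            zl = ziplistPush(zl, (unsigned char*)buf, len, ZIPLIST_TAIL);
        }

        /* Random lookups starting from the head entry, with skip=1 to
         * jump over the interleaved values, as hashTypeGetFromZiplist()
         * does for HMGET. */
        for (int i = 0; i < 100; i++) {
            int len = snprintf(buf, sizeof(buf), "%d", 200000 + rand() % 799000);
            unsigned char *p = ziplistIndex(zl, 0);
            if (p) ziplistFind(p, (unsigned char*)buf, len, 1);
        }

        /* Random deletions keep the ziplist mutating between searches. */
        for (int i = 0; i < 50; i++) {
            unsigned char *p = ziplistIndex(zl, rand() % (int)ziplistLen(zl));
            if (p) zl = ziplistDelete(zl, &p);
        }
        zfree(zl);
    }
    printf("stress iterations completed\n");
    return 0;
})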
Comment From: fgaule
"Could you kindly ask your Cloud vendor if they use ECC memory? " I dont know if it is a good o bad news but they are currently using 'Advanced ECC'
PD: Thanks for the advice, benchmarks shows that having that big number of hash fields was pretty slow in terms of performance and cpu burner. I'm going to split my keys in buckets having 1000-3000 entries to check it out!
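(A hypothetical sketch of such a bucketing scheme — the key layout and bucket size here are illustrative, not from the report: each numeric field ID maps to a sub-hash of roughly 2000 fields, keeping every ziplist far below the 20k-entry size that proved slow:

/* bucketing.c -- hypothetical key-bucketing sketch: field IDs in the
 * 200000-999000 range are grouped into hashes of ~2000 fields each. */
#include <stdio.h>

int main(void) {
    int ids[] = {236034, 485253, 833800};
    for (int i = 0; i < 3; i++) {
        char key[64];
        /* bucket = id / 2000 keeps each hash in the 1000-3000 entry
         * range suggested above. */
        snprintf(key, sizeof(key), "hs:170110|170117:%d", ids[i] / 2000);
        printf("HSET %s %d <value>\n", key, ids[i]);
    }
    return 0;
})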
Comment From: antirez
Thanks @fgaule, the fact that they run error-corrected memory modules makes it more likely this is a Redis bug, which in theory is bad, but it actually makes it clearer where to look, which is better :-) So I'm happy to hear that. It means that the effort spent reproducing this issue is well spent. About your use case, please feel free to comment here if you need any additional hints on ziplist tradeoffs or other ways to model your problem. Bug reports and fast follow-ups deserve this and more :-) Thanks.
Comment From: fgaule
Thanks @antirez, I appreciate it!
Comment From: fgaule
Just in case: the virtual machines are running on "HP ProLiant BL465c Gen8" hardware.
Comment From: antirez
Thank you; it's cool that the provider is offering assistance and details, btw.
Comment From: antirez
Hello again, quick update: the investigation is in progress. So far nothing has been found, but this activity will take a few days... In case the crash happens again, a note here would be extremely useful. Thanks!
Comment From: antirez
Dear @fgaule, after your bug report I started an activity, ongoing since the original report and taking days, to audit the ziplist.c file. Today I finally reached the first interesting result, analytically finding a bug that the stress testers I wrote could not find. The bug can be reproduced via the Redis API using List type commands. It is not yet clear if Hash commands could trigger it. Even though I'm still in the middle of the investigation, I wanted to ping you about that, since this bug corrupting the ziplist is evidence that there are issues, even if they belong to states that are very hard to trigger. This also makes it a lot less likely that what you hit was just a memory error rather than an actual bug. AFAIK the bug I found, which may or may not be the same one that crashed your server (but potentially yes, or very related), has been there since the introduction of ziplist.c, so it is a 5-year-old bug more or less. I'll update this issue once I have further details. Thanks.
Comment From: fgaule
@antirez good to hear you haven't slept looking for this bug :smile: I can confirm I haven't been playing around with List commands, so it is possible that Hash commands also reproduce the bug. I have checked my code; the commands we use are:
- hget
- hgetall
- hmget
- hset
- hmset
Ping me if you need more details. Thanks!!
Comment From: antirez
Thanks Federico :-)
Comment From: antirez
Dear @fgaule, I wonder if you could attach the Redis server binary that caused the crash here, renamed as PNG or JPG. Thanks!
Comment From: fgaule
@antirez here we go!
It's a tar.gz renamed as .jpg.
Comment From: antirez
Cool thank you! CC: @oranagra
Comment From: antirez
Sorry to bother you again @fgaule; I was interested in the actual binary file produced by your compilation, because if we compile it ourselves, the offsets in the crash report do not match the ones of your specific GCC/configuration. So if you could please send us the redis-server binary you were using, or at least one compiled on the exact same system, that would be very useful. Thanks.
Comment From: fgaule
@antirez sorry for misunderstanding you, here is the redis-server file.
Comment From: antirez
Thanks!
Comment From: antirez
Good Monday @fgaule, assuming there is such a thing ;-)
We are near the end of our investigation. We found a bug, but apparently not the one that made your instance crash; however, we are improving the ziplist.c crash logs and upgrading other components in Redis, such as jemalloc, in order to put ourselves in a defensive position. In recent years we had something like 4 or 5 crash reports about ziplists. A few were tracked to a quicklist bug that has now been fixed. Another strange one is probably due to the bug we found during this investigation. The bug you experienced and one other have no explanation; however, the other user did not use error-corrected memory modules, and a very large number of the crash reports we receive when no error-corrected modules are used are just memory errors.
In order to make sure we investigated everything that was possible to investigate, and just to verify I understood correctly, I want to ask if you used, in all your tests, numbers both as keys and values, and if the ranges of both the key and value numbers are the ones you reported earlier. We also tested different kinds of encodings and values, large and small, during this investigation, but we put a special focus on keys and values both composed of numbers. If instead the values were different and only the keys were as described, could you please describe the layout of your values? We would like to run a few more simulations using the specific layout of your values.
Thank you again!
Comment From: alon-redis
@fgaule - we are still investigating the issue.
I saw that the RDB snapshot failed with "err" in your environment:
# Persistence
loading:0
rdb_changes_since_last_save:86471896
rdb_bgsave_in_progress:0
rdb_last_save_time:1483625502
rdb_last_bgsave_status:err
rdb_last_bgsave_time_sec:83
rdb_current_bgsave_time_sec:-1
Can you provide us more details on why it failed? It may give us a hint about the crash.
Comment From: antirez
@alon-redis @fgaule this is a good question. One way the ziplist could get corrupted is indeed via the disk. However, RDB files have a 64-bit CRC checksum. Such a checksum can be disabled by the user via redis.conf; I did not think of asking about this, but maybe it can be helpful. The option is:
rdbchecksum yes
On a related note, @fgaule was testing Redis Cluster, so it is also possible he was MIGRATE-ing keys via redis-trib or any other means, so the key here could have moved to this instance from a different one. However, MIGRATE also features the same CRC64 checksum as RDB files.
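(To check whether a dump was corrupted on disk, the CRC64 trailer can be verified directly. A minimal sketch, assuming the table-based crc64.c/crc64.h shipped with Redis 3.2 — newer versions need crc64_init() first — and a little-endian host:

/* rdb_crc_check.c -- verify the CRC64 trailer of an RDB file. The RDB
 * format stores the checksum in the last 8 bytes, little-endian, computed
 * over everything that precedes it; an all-zero trailer means the
 * checksum was disabled with "rdbchecksum no". */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include "crc64.h"

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s dump.rdb\n", argv[0]); return 1; }
    FILE *fp = fopen(argv[1], "rb");
    if (!fp) { perror("fopen"); return 1; }
    fseek(fp, 0, SEEK_END);
    long size = ftell(fp);
    rewind(fp);
    if (size < 8) { fprintf(stderr, "file too short\n"); return 1; }

    unsigned char *buf = malloc(size);
    if (!buf || fread(buf, 1, (size_t)size, fp) != (size_t)size) {
        fprintf(stderr, "read error\n");
        return 1;
    }
    fclose(fp);

    uint64_t stored;
    memcpy(&stored, buf + size - 8, 8); /* little-endian host assumed */
    if (stored == 0) { printf("checksum disabled (rdbchecksum no)\n"); return 0; }

    uint64_t computed = crc64(0, buf, size - 8);
    printf("stored=%016llx computed=%016llx -> %s\n",
           (unsigned long long)stored, (unsigned long long)computed,
           stored == computed ? "OK" : "MISMATCH");
    free(buf);
    return stored == computed ? 0 : 2;
})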
Comment From: fgaule
@alon-redis Sadly I have no access to the Redis log, because the node (machine) where it failed doesn't exist anymore. But I remember there was a time when Redis could not take its snapshot because it ran out of disk space. By mistake I had configured maxmemory to be 85% of all the RAM and had forgotten to disable persistence :cry: @antirez I did not migrate keys using redis-trib or any other command.
Hope it helps; it's all I have right now.
Comment From: antirez
@fgaule thank you a lot. If you could also confirm that the values were numerical, that would be great. I promise this is the last question :-) We are near the end of the investigation.
Comment From: fgaule
@antirez no problem, I want this resolved as much as you do, so ask me whatever you need. There were 2 benchmarks: the first one saved an empty string ("") as the value and the other saved a 1 KB string. I'm pretty sure the first benchmark was the one associated with the crash.
Comment From: antirez
@fgaule oh! That's very important. In the models I used I have empty strings, but they are only used from time to time. Perhaps there is an odd edge condition that can be triggered only when all the values are empty. I'll try to both investigate this analytically and modify the fuzzing to use only empty values. Thanks.
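(A tiny self-contained probe of that edge case, under the same Redis 3.2 ziplist API assumption as the stress-test sketch above:

/* empty_values.c -- push a field followed by an empty value (slen == 0 is
 * a legal zero-length string entry), then look the field up the way
 * hashTypeGetFromZiplist() would, with skip=1 over the values. */
#include <stdio.h>
#include "ziplist.h"
#include "zmalloc.h"

int main(void) {
    unsigned char *zl = ziplistNew();
    zl = ziplistPush(zl, (unsigned char *)"236034", 6, ZIPLIST_TAIL); /* field */
    zl = ziplistPush(zl, (unsigned char *)"", 0, ZIPLIST_TAIL);       /* empty value */
    unsigned char *head = ziplistIndex(zl, 0);
    unsigned char *p = head ? ziplistFind(head, (unsigned char *)"236034", 6, 1) : NULL;
    printf("field %s\n", p ? "found" : "missing");
    zfree(zl);
    return 0;
})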
Comment From: stutiredboy
=== REDIS BUG REPORT START: Cut & paste starting from here ===
10357:C 29 Jun 16:52:39.404 # === ASSERTION FAILED ===
10357:C 29 Jun 16:52:39.404 # ==> ziplist.c:411 'NULL' is not true
10357:C 29 Jun 16:52:39.404 # (forcing SIGSEGV to print the bug report.)
10357:C 29 Jun 16:52:39.404 # Redis 3.2.12 crashed by signal: 11
10357:C 29 Jun 16:52:39.404 # Crashed running the instuction at: 0x565169ab4cdb
10357:C 29 Jun 16:52:39.404 # Accessing address: 0xffffffffffffffff
10357:C 29 Jun 16:52:39.404 # Failed assertion: NULL (ziplist.c:411)
------ STACK TRACE ------
EIP:
redis-aof-rewrite 127.0.0.1:9736(_serverAssert+0x6b)[0x565169ab4cdb]
Backtrace:
redis-aof-rewrite 127.0.0.1:9736(logStackTrace+0x32)[0x565169ab6932]
redis-aof-rewrite 127.0.0.1:9736(sigsegvHandler+0x9e)[0x565169ab700e]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7fd40f127730]
redis-aof-rewrite 127.0.0.1:9736(_serverAssert+0x6b)[0x565169ab4cdb]
redis-aof-rewrite 127.0.0.1:9736(zipLoadInteger+0x9b)[0x565169a87d6b]
redis-aof-rewrite 127.0.0.1:9736(ziplistGet+0x97)[0x565169a88af7]
redis-aof-rewrite 127.0.0.1:9736(quicklistNext+0x17b)[0x565169a75fcb]
redis-aof-rewrite 127.0.0.1:9736(rewriteListObject+0x8f)[0x565169ab131f]
redis-aof-rewrite 127.0.0.1:9736(rewriteAppendOnlyFile+0x5c8)[0x565169ab2088]
redis-aof-rewrite 127.0.0.1:9736(rewriteAppendOnlyFileBackground+0x163)[0x565169ab2673]
redis-aof-rewrite 127.0.0.1:9736(bgrewriteaofCommand+0x49)[0x565169ab28d9]
redis-aof-rewrite 127.0.0.1:9736(call+0x88)[0x565169a7ce18]
redis-aof-rewrite 127.0.0.1:9736(processCommand+0x3ce)[0x565169a8020e]
redis-aof-rewrite 127.0.0.1:9736(processInputBuffer+0x115)[0x565169a8d575]
redis-aof-rewrite 127.0.0.1:9736(aeProcessEvents+0xf8)[0x565169a76f68]
redis-aof-rewrite 127.0.0.1:9736(aeMain+0x2b)[0x565169a7732b]
redis-aof-rewrite 127.0.0.1:9736(main+0x259)[0x565169a74109]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb)[0x7fd40ef7809b]
redis-aof-rewrite 127.0.0.1:9736(_start+0x2a)[0x565169a7451a]
------ INFO OUTPUT ------
# Server
redis_version:3.2.12
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:bb18e4ca494949b6
redis_mode:standalone
os:Linux 4.19.0-6-amd64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:8.3.0
process_id:10357
run_id:e8c6956efba726a84eac8145a51f33d50bd86146
tcp_port:9736
uptime_in_seconds:8
uptime_in_days:0
hz:10
lru_clock:16362455
executable:/home/tiredboy/redis-3.2.12/src/redis-server
config_file:/home/tiredboy/conf/redis.conf
# Clients
connected_clients:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
# Memory
used_memory:565669536
used_memory_human:539.46M
used_memory_rss:588742656
used_memory_rss_human:561.47M
used_memory_peak:565669536
used_memory_peak_human:539.46M
total_system_memory:8365965312
total_system_memory_human:7.79G
used_memory_lua:37888
used_memory_lua_human:37.00K
maxmemory:4294967296
maxmemory_human:4.00G
maxmemory_policy:noeviction
mem_fragmentation_ratio:1.04
mem_allocator:jemalloc-4.0.3
# Persistence
loading:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1593420751
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
# Stats
total_connections_received:1
total_commands_processed:1
instantaneous_ops_per_sec:0
total_net_input_bytes:40
total_net_output_bytes:9928
instantaneous_input_kbps:0.00
instantaneous_output_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
migrate_cached_sockets:0
# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:67108864
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:0.01
used_cpu_user:0.02
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
# Commandstats
cmdstat_command:calls=1,usec=592,usec_per_call=592.00
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=8,expires=0,avg_ttl=0
db1:keys=488146,expires=0,avg_ttl=0
db11:keys=2778,expires=2,avg_ttl=136781350303
db12:keys=2643,expires=0,avg_ttl=0
db13:keys=294,expires=0,avg_ttl=0
hash_init_value: 1592826393
------ CLIENT LIST OUTPUT ------
id=2 addr=127.0.0.1:50260 fd=5 name= age=4 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=bgrewriteaof
------ CURRENT CLIENT INFO ------
id=2 addr=127.0.0.1:50260 fd=5 name= age=4 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=bgrewriteaof
argv[0]: 'bgrewriteaof'
10353:M 29 Jun 16:52:39.438 # Background AOF rewrite terminated by signal 11
We also ran into this issue. The values in our lists are strings (JSON dumps), such as: {"uid": 1234, "credit": 5678}
Redis versions: 3.2.1 to 3.2.13 and 4.0.14. Other notes:
1. bgsave can be finished successfully.
2. We have tried different machines (using the same dump.rdb to load data at first start).
Comment From: antirez
Hello @stutiredboy, we have not received a similar bug report for any newer version of Redis, and we can't investigate bugs related to Redis 3. Please upgrade to Redis 6, which is compatible with Redis 3, so that you can run new code that is hopefully a lot saner :) Cheers.
Comment From: stutiredboy
Thanks @antirez. After we upgraded to Redis 5.0, everything works fine.