Redis-server keeps crashing and restarting each 1-2 seconds

=== REDIS BUG REPORT START: Cut & paste starting from here ===
30924:M 13 Sep 15:09:53.635 # === ASSERTION FAILED OBJECT CONTEXT ===
30924:M 13 Sep 15:09:53.635 # Object type: 0
30924:M 13 Sep 15:09:53.635 # Object encoding: 0
30924:M 13 Sep 15:09:53.635 # Object refcount: 1
30924:M 13 Sep 15:09:53.635 # Object raw string len: 50
30924:M 13 Sep 15:09:53.635 # Object raw string content: "algo_perf_display_cc74e5df7c34305933999bdb45dd4369"
30924:M 13 Sep 15:09:53.635 # === ASSERTION FAILED ===
30924:M 13 Sep 15:09:53.635 # ==> db.c:161 'retval == DICT_OK' is not true
30924:M 13 Sep 15:09:53.635 # (forcing SIGSEGV to print the bug report.)
30924:M 13 Sep 15:09:53.635 # Redis 3.2.6 crashed by signal: 11
30924:M 13 Sep 15:09:53.635 # Crashed running the instuction at: 0x55f5c883c550
30924:M 13 Sep 15:09:53.635 # Accessing address: 0xffffffffffffffff
30924:M 13 Sep 15:09:53.635 # Failed assertion: retval == DICT_OK (db.c:161)

------ STACK TRACE ------
EIP:
/usr/bin/redis-server 127.0.0.1:6379(_serverAssert+0x70)[0x55f5c883c550]

Backtrace:
/usr/bin/redis-server 127.0.0.1:6379(logStackTrace+0x4a)[0x55f5c883e3da]
/usr/bin/redis-server 127.0.0.1:6379(sigsegvHandler+0xc3)[0x55f5c883eba3]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x110e0)[0x7f0082e2d0e0]
/usr/bin/redis-server 127.0.0.1:6379(_serverAssert+0x70)[0x55f5c883c550]
/usr/bin/redis-server 127.0.0.1:6379(dbAdd+0x83)[0x55f5c8817493]
/usr/bin/redis-server 127.0.0.1:6379(rdbLoad+0x216)[0x55f5c8821486]
/usr/bin/redis-server 127.0.0.1:6379(loadDataFromDisk+0x45)[0x55f5c8807025]
/usr/bin/redis-server 127.0.0.1:6379(main+0x3a9)[0x55f5c87f9709]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f0082a9d2e1]
/usr/bin/redis-server 127.0.0.1:6379(_start+0x2a)[0x55f5c87f995a]

------ INFO OUTPUT ------
# Server
redis_version:3.2.6
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:a2cab475abc2115e
redis_mode:standalone
os:Linux 4.9.0-9-amd64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:6.3.0
process_id:30924
run_id:f0ed88a28c25ff34a8d1e37bf7004bfc619b9bdf
tcp_port:6379
uptime_in_seconds:1
uptime_in_days:0
hz:10
lru_clock:6167072
executable:/usr/bin/redis-server
config_file:/etc/redis/redis.conf

# Clients
connected_clients:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0

# Memory
used_memory:411201800
used_memory_human:392.15M
used_memory_rss:0
used_memory_rss_human:0B
used_memory_peak:411201800
used_memory_peak_human:392.15M
total_system_memory:67338059776
total_system_memory_human:62.71G
used_memory_lua:37888
used_memory_lua_human:37.00K
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
mem_fragmentation_ratio:0.00
mem_allocator:jemalloc-3.6.0

# Persistence
loading:1
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1600002592
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
loading_start_time:1600002592
loading_total_bytes:767186458
loading_loaded_bytes:205520736
loading_loaded_perc:26.79
loading_eta_seconds:2

# Stats
total_connections_received:0
total_commands_processed:0
instantaneous_ops_per_sec:0
total_net_input_bytes:0
total_net_output_bytes:0
instantaneous_input_kbps:0.00
instantaneous_output_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
migrate_cached_sockets:0

# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

# CPU
used_cpu_sys:0.15
used_cpu_user:1.36
used_cpu_sys_children:0.00
used_cpu_user_children:0.00

# Commandstats

# Cluster
cluster_enabled:0

# Keyspace
db0:keys=1084207,expires=0,avg_ttl=0
hash_init_value: 1600117179

------ CLIENT LIST OUTPUT ------

------ REGISTERS ------
30924:M 13 Sep 15:09:53.636 # 
RAX:0000000000000000 RBX:00000000000000a1
RCX:00007f008326f3d0 RDX:0000000000000002
RDI:00007f0082006000 RSI:000000000000003d
RBP:000055f5c888e545 RSP:00007ffd8c675f00
R8 :00007f008398fec0 R9 :00007f008398fec0
R10:0000000000000073 R11:0000000000000206
R12:000055f5c888c4fc R13:00007f0082023000
R14:0000000000000000 R15:00007f00820e4f90
RIP:000055f5c883c550 EFL:0000000000010206
CSGSFS:002b000000000033
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f0f) -> 000055f5c8800200
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f0e) -> 00000000fffffffe
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f0d) -> 0000000000000007
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f0c) -> 000000008200e240
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f0b) -> 00007f0082043200
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f0a) -> 00007ffd8c675fa0
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f09) -> 0000017487960d73
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f08) -> 000000000000000a
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f07) -> 000055f5c8821486
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f06) -> ffffffffffffffff
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f05) -> 00007ffd8c675f78
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f04) -> 00007ffd8c675f80
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f03) -> 000055f5c8817493
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f02) -> 00007f0082023000
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f01) -> 00007f00820e4f90
30924:M 13 Sep 15:09:53.636 # (00007ffd8c675f00) -> 00007f00678f9620

------ FAST MEMORY TEST ------
30924:M 13 Sep 15:09:53.636 # Bio thread for job type #0 terminated
30924:M 13 Sep 15:09:53.636 # Bio thread for job type #1 terminated
*** Preparing to test memory region 55f5c8ac4000 (86016 bytes)
*** Preparing to test memory region 7f0067800000 (419430400 bytes)
*** Preparing to test memory region 7f0080bb3000 (8388608 bytes)
*** Preparing to test memory region 7f00813b4000 (8388608 bytes)
*** Preparing to test memory region 7f0082000000 (8388608 bytes)
*** Preparing to test memory region 7f0082e18000 (16384 bytes)
*** Preparing to test memory region 7f0083035000 (16384 bytes)
*** Preparing to test memory region 7f008326f000 (4096 bytes)
*** Preparing to test memory region 7f008398f000 (16384 bytes)
*** Preparing to test memory region 7f008399d000 (4096 bytes)
.O.O.O.O.O.O.O.O.O.O
Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.

------ DUMPING CODE AROUND EIP ------
Symbol: _serverAssert (base: 0x55f5c883c4e0)
Module: /usr/bin/redis-server 127.0.0.1:6379 (base 0x55f5c87da000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x55f5c883c4e0 -D -b binary -m i386:x86-64 /tmp/dump.bin
------
30924:M 13 Sep 15:09:54.586 # dump of function (hexdump of 240 bytes):
8b05e681280041544989fc554889f55389d385c07505e885fdffff488d3546610500bf0300000031c0e8023cfcff488d354c6105004d89e089d94889eabf0300000031c0e8e73bfcff488d3520520500bf0300000031c04c89257a81280048892d7b812800891d7d812800e8c03bfcffc60425ffffffff785b5d415cc30f1f00415541544989fc55534881ec980000008b3d5681280064488b042528000000488984248800000031c085ff0f842f010000488d35e851050031c0bf0300000031ed31db4989e5e8653bfcff418b942498000000488d35c260050031c0bf03000000e84a3bfcff418b542408488d35bd60

=== REDIS BUG REPORT END. Make sure to include from START to END. ===

       Please report the crash by opening an issue on github:

           http://github.com/antirez/redis/issues

  Suspect RAM error? Use redis-server --test-memory to verify it.

                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 3.2.6 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in standalone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 31007
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

31007:M 13 Sep 15:09:55.531 # Server started, Redis version 3.2.6
31007:M 13 Sep 15:09:55.531 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
31007:M 13 Sep 15:09:55.531 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.

To reproduce

A restart of the redis-server reproduces the crash.

Expected behavior

The redis-server continues its normal operations of saving key-values without crash.

Additional information

Happened all of a sudden. OS: Debian 9 Memory usage:

root@server /var/log # free -h
              total        used        free      shared  buff/cache   available
Mem:            62G         30G        2.7G        5.0M         29G         31G
Swap:           15G          0B         15G

Dmesg:

[Fri Sep 11 17:47:23 2020] php-fpm7.2[1086]: segfault at 60 ip 00007fccd7e19cda sp 00007ffd649253f0 error 4 in libcrypto.so.1.0.2[7fccd7cc3000+238000]
[Fri Sep 11 18:13:54 2020] php-fpm7.2[18294]: segfault at 19 ip 00007fccdddc6524 sp 00007ffd649252a8 error 4 in libc-2.24.so[7fccddd4b000+195000]
[Fri Sep 11 20:24:31 2020] php[21991]: segfault at 50 ip 00007f2486dba784 sp 00007ffdc32e0e90 error 4 in pdo_mysql.so[7f2486db6000+6000]
[Sun Sep 13 10:01:44 2020] php-fpm7.2[27992]: segfault at 60 ip 00007fccd7e19cda sp 00007ffd649253f0 error 4 in libcrypto.so.1.0.2[7fccd7cc3000+238000]

Redis packages installed:

root@server ~ # dpkg -l | grep redis
ii  php-redis                                  5.0.2+4.3.0-1+0~20190808.13+debian9~1.gbpcc7e06       amd64        PHP extension for interfacing with Redis
ii  redis-server                               3:3.2.6-3+deb9u3                                      amd64        Persistent key-value database with network interface
ii  redis-tools                                3:3.2.6-3+deb9u3                                      amd64        Persistent key-value database with network interface (client)

Is it possible that the bug is related to any external module?

Comment From: oranagra

@ssgoburdhun it crashes while loading an RDB file, probably because there are two keys with the same name. Rename the rdb file (for safe keeping) and redis would start. Do you know how that rdb file was produced? Or how to reproduce it from scratch?

Comment From: ssgoburdhun

Thank you @oranagra for your prompt reply.

Do you know how that rdb file was produced? Or how to reproduce it from scratch?

I am not aware how the rdb file was produced and why there are 2 similar keys. It just happened suddenly today. Is there a guide I can follow to reproduce the issue?

There are 2 commands used to reproduce the binary file:

$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x55f5c883c4e0 -D -b binary -m i386:x86-64 /tmp/dump.bin

Will any of the 2 output be helpful?

Comment From: oranagra

@ssgoburdhun these commands are used to see better understand the crash, but in this case the crash is clear. Maybe you can use redis-rdb-tools to analyze the rdb file and find out the key name that is a duplicate and its type. Then from that try to conclude which commands were used that may have generated it.

Also, maybe you can tell me the history behind that server and it's data. Was it an upgrade from an earlier version?

Comment From: erlandl4g

I have similar problem on Centos 7. And I have it just installed, configured and not even stored a single key in it. Sometimes it allows me correct to redis, but mostly not.

/usr/local/bin/redis-server /etc/redis/6379.conf
[root@email /var/run]# redis-cli -h 10.39.136.205 -p 6379
10.39.136.205:6379> client list
id=3 addr=10.39.136.205:40062 laddr=10.39.136.205:6379 fd=10 name= age=42 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=40928 argv-mem=10 obl=0 oll=0 omem=0 tot-mem=61466 events=r cmd=client user=default redir=-1
id=4 addr=10.39.138.89:36254 laddr=10.39.136.205:6379 fd=11 name= age=27 idle=27 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=command user=default redir=-1
10.39.136.205:6379> quit
[root@email /var/run]# systemctl restart redis
[root@email /var/run]# redis-cli -h 10.39.138.89 -p 6379
Could not connect to Redis at 10.39.138.89:6379: Connection refused
not connected> quit
[root@email /var/run]# systemctl stop redis
[root@email /var/run]# systemctl start redis
[root@email /var/run]# redis-cli -h 10.39.138.89 -p 6379
Could not connect to Redis at 10.39.138.89:6379: Connection refused
38110:C 17 Nov 2021 05:24:40.162 # Configuration loaded
38110:signal-handler (1637144680) Received SIGTERM scheduling shutdown...
38110:S 17 Nov 2021 05:24:40.163 * Increased maximum number of open files to 10032 (it was originally set to 1024).
38110:S 17 Nov 2021 05:24:40.163 * monotonic clock: POSIX clock_gettime
38110:S 17 Nov 2021 05:24:40.163 * Running mode=standalone, port=6379.
38110:S 17 Nov 2021 05:24:40.163 # Server initialized
38110:S 17 Nov 2021 05:24:40.164 * Ready to accept connections
38110:S 17 Nov 2021 05:24:40.164 # User requested shutdown...
38110:S 17 Nov 2021 05:24:40.164 * Calling fsync() on the AOF file.
38110:S 17 Nov 2021 05:24:40.165 * Saving the final RDB snapshot before exiting.
38110:S 17 Nov 2021 05:24:40.166 * DB saved on disk
38110:S 17 Nov 2021 05:24:40.166 * Removing the pid file.
38110:S 17 Nov 2021 05:24:40.166 # Redis is now ready to exit, bye bye...
38210:C 17 Nov 2021 05:24:45.384 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
38210:C 17 Nov 2021 05:24:45.384 # Redis version=6.2.5, bits=64, commit=00000000, modified=0, pid=38210, just started
38210:C 17 Nov 2021 05:24:45.384 # Configuration loaded
38303:C 17 Nov 2021 05:24:50.412 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
38303:C 17 Nov 2021 05:24:50.412 # Redis version=6.2.5, bits=64, commit=00000000, modified=0, pid=38303, just started
38303:C 17 Nov 2021 05:24:50.412 # Configuration loaded
38303:signal-handler (1637144690) Received SIGTERM scheduling shutdown...
38303:S 17 Nov 2021 05:24:50.413 * Increased maximum number of open files to 10032 (it was originally set to 1024).
38303:S 17 Nov 2021 05:24:50.413 * monotonic clock: POSIX clock_gettime
38303:S 17 Nov 2021 05:24:50.413 * Running mode=standalone, port=6379.
38303:S 17 Nov 2021 05:24:50.414 # Server initialized
38303:S 17 Nov 2021 05:24:50.415 * Ready to accept connections
38303:S 17 Nov 2021 05:24:50.415 # User requested shutdown...
38303:S 17 Nov 2021 05:24:50.415 * Calling fsync() on the AOF file.
38303:S 17 Nov 2021 05:24:50.415 * Saving the final RDB snapshot before exiting.
38303:S 17 Nov 2021 05:24:50.416 * DB saved on disk
38303:S 17 Nov 2021 05:24:50.416 * Removing the pid file.
38303:S 17 Nov 2021 05:24:50.416 # Redis is now ready to exit, bye bye...
38418:C 17 Nov 2021 05:24:55.652 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
38491:C 17 Nov 2021 05:25:00.912 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
38491:C 17 Nov 2021 05:25:00.912 # Redis version=6.2.5, bits=64, commit=00000000, modified=0, pid=38491, just started
38491:C 17 Nov 2021 05:25:00.912 # Configuration loaded
38491:signal-handler (1637144700) Received SIGTERM scheduling shutdown...
38491:S 17 Nov 2021 05:25:00.913 * Increased maximum number of open files to 10032 (it was originally set to 1024).
38491:S 17 Nov 2021 05:25:00.913 * monotonic clock: POSIX clock_gettime
38491:S 17 Nov 2021 05:25:00.913 * Running mode=standalone, port=6379.
38491:S 17 Nov 2021 05:25:00.913 # Server initialized
38491:S 17 Nov 2021 05:25:00.914 * Ready to accept connections
38491:S 17 Nov 2021 05:25:00.914 # User requested shutdown...
38491:S 17 Nov 2021 05:25:00.914 * Calling fsync() on the AOF file.
38491:S 17 Nov 2021 05:25:00.914 * Saving the final RDB snapshot before exiting.
38491:S 17 Nov 2021 05:25:00.916 * DB saved on disk
38491:S 17 Nov 2021 05:25:00.916 * Removing the pid file.
38491:S 17 Nov 2021 05:25:00.916 # Redis is now ready to exit, bye bye...

Comment From: oranagra

@erlandl4g it looks like someone / something is instructing redis to shutdown. maybe someone else is connecting to it? did you happen to disable protected-mode?

Comment From: erlandl4g

Found the answer on

https://stackoverflow.com/questions/42079847/determining-why-redis-is-receiving-a-sigterm-every-minute-or-two

Which I would say is a flaw in Redis and should be fixed

Comment From: yossigo

@erlandl4g This is not a flaw, the daemonization mechanism configured for Redis needs to match the one systemd is set to expect. When using systemd one should generally:

  • Make sure Redis is built with libsystemd
  • Use Type=Notify
  • Configure Redis with supervised auto.

Comment From: erlandl4g

Yes, I see it now, but it was not so clear when reading the manual for the first time and following everything line by line.

And I install the simple recommended way and I assume libsystemd is build that way:

cd /opt
wget https://download.redis.io/releases/redis-6.2.6.tar.gz
tar xzf redis-6.2.6.tar.gz
cd redis-6.2.6
make
make install

Yes supervised auto was suggested in the config file and I left it. daemonize was not so well explained. Type=Notify was not mentioned anywhere either. Unless I have missed it. It would be best if redis install would create systemd file as needed (for most cases).

Setup guide could be improved on redis site to show better guidance on simple setup, cluster setup and sentinel setup (so there would be no need to read this on other sites having much better explained setup steps and examples).

And last but not least - Yes, my periodic restarts have stopped after I corrected systemd unit file and config file set to daemonize no and Redis is working fine now.

Thank you.

Comment From: yossigo

@erlandl4g Thanks for your feedback, I'll take a look and see if we can improve that. Generally install from source is always more tricky and I think pulling a pre-built package from a credible repo is often easier.

Comment From: oranagra

please note this issue is not about supervised / daemonize, it is about a corruption / crash. i was about to re-open the issue.. but looking at the correspondence again, the problem was a bad RDB file, and we don't know how it got generated. so i guess i'll leave it closed, since it's unlikely that we'll ever get to the bottom of it.