I run a Redis cluster on Kubernetes with 3 masters and 3 slaves. When I kill a master pod in Kubernetes or perform a manual failover on a slave, the slave gets into an infinite loop trying to sync with the master. All pods run the redis:5.0.1-alpine image. Here is part of the logs; this keeps repeating.

1:S 01 Aug 2019 20:02:04.015 * Connecting to MASTER 10.1.1.93:6379
1:S 01 Aug 2019 20:02:04.015 * MASTER <-> REPLICA sync started
1:S 01 Aug 2019 20:02:04.015 * Non blocking connect for SYNC fired the event.
1:S 01 Aug 2019 20:02:04.015 * Master replied to PING, replication can continue...
1:S 01 Aug 2019 20:02:04.015 * Partial resynchronization not possible (no cached master)
1:S 01 Aug 2019 20:02:04.016 * Full resync from master: eb3d3c72a904cf0e9671f2425c8082999138ff15:0
1:S 01 Aug 2019 20:02:04.109 * MASTER <-> REPLICA sync: receiving 175 bytes from master
1:S 01 Aug 2019 20:02:04.114 * MASTER <-> REPLICA sync: Flushing old data
1:S 01 Aug 2019 20:02:04.114 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 01 Aug 2019 20:02:04.114 # Failed trying to load the MASTER synchronization DB from disk
1:S 01 Aug 2019 20:02:05.018 * Connecting to MASTER 10.1.1.93:6379
1:S 01 Aug 2019 20:02:05.018 * MASTER <-> REPLICA sync started
1:S 01 Aug 2019 20:02:05.018 * Non blocking connect for SYNC fired the event.
1:S 01 Aug 2019 20:02:05.018 * Master replied to PING, replication can continue...
1:S 01 Aug 2019 20:02:05.018 * Partial resynchronization not possible (no cached master)
1:S 01 Aug 2019 20:02:05.019 * Full resync from master: eb3d3c72a904cf0e9671f2425c8082999138ff15:0
1:S 01 Aug 2019 20:02:05.113 * MASTER <-> REPLICA sync: receiving 175 bytes from master
1:S 01 Aug 2019 20:02:05.118 * MASTER <-> REPLICA sync: Flushing old data
1:S 01 Aug 2019 20:02:05.118 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 01 Aug 2019 20:02:05.118 # Failed trying to load the MASTER synchronization DB from disk
1:S 01 Aug 2019 20:02:06.021 * Connecting to MASTER 10.1.1.93:6379
1:S 01 Aug 2019 20:02:06.021 * MASTER <-> REPLICA sync started
1:S 01 Aug 2019 20:02:06.021 * Non blocking connect for SYNC fired the event.
1:S 01 Aug 2019 20:02:06.022 * Master replied to PING, replication can continue...
1:S 01 Aug 2019 20:02:06.022 * Partial resynchronization not possible (no cached master)
1:S 01 Aug 2019 20:02:06.023 * Full resync from master: eb3d3c72a904cf0e9671f2425c8082999138ff15:0
1:S 01 Aug 2019 20:02:06.115 * MASTER <-> REPLICA sync: receiving 175 bytes from master
1:S 01 Aug 2019 20:02:06.120 * MASTER <-> REPLICA sync: Flushing old data
1:S 01 Aug 2019 20:02:06.120 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 01 Aug 2019 20:02:06.120 # Failed trying to load the MASTER synchronization DB from disk
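
For reference, the master kill and the manual failover were triggered roughly like this (the pod names are illustrative, not the exact ones from my setup):

# kill a master pod and let Kubernetes recreate it
kubectl delete pod redis-cluster-0

# or ask one of the slaves to take over its master
kubectl exec -it redis-cluster-3 -- redis-cli cluster failover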

Comment From: vattezhang

Do you have a Sentinel pod deployed in k8s and configured correctly? Redis's failover function is based on Sentinel.

Comment From: yilmazuksal

My setup is based on the following link: https://rancher.com/blog/2019/deploying-redis-cluster/. That article doesn't mention any Sentinel node, and I thought Sentinel and Cluster were two different things. In my setup I just create 6 nodes with cluster mode enabled, then create the cluster using the CLI tool. Even in the Redis Cluster 101 document I haven't seen any mention of Sentinel being required.
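
For reference, the cluster creation step looks roughly like this (the pod IPs are illustrative; the real ones come from kubectl get pods -o wide):

redis-cli --cluster create 10.1.1.90:6379 10.1.1.91:6379 10.1.1.92:6379 10.1.1.93:6379 10.1.1.94:6379 10.1.1.95:6379 --cluster-replicas 1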

Comment From: WiFeng

Have you checked the RDB file on the slave node? Does it exist?
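
Something like this should show whether the file is there and where Redis expects it (pod name is illustrative; /data is the data dir from the setup above):

kubectl exec -it redis-cluster-3 -- ls -l /data
kubectl exec -it redis-cluster-3 -- redis-cli config get dir
kubectl exec -it redis-cluster-3 -- redis-cli config get dbfilename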

Comment From: antirez

After inspecting the relevant code paths, as @WiFeng said, it looks like the RDB file does not exist on the target replica's disk. Something must be badly wrong in the Redis configuration, but I don't know what.

Comment From: antirez

For instance, a trivial explanation could be that some other process deletes the temporary files Redis creates. Or Redis is writing to some odd filesystem type that auto-wipes or renames files. Really no idea; I don't think I've ever seen anything like this before.
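
A couple of checks along those lines, as a sketch (pod name is illustrative):

# which filesystem backs the Redis data dir?
kubectl exec -it redis-cluster-3 -- sh -c 'grep /data /proc/mounts'

# watch the data dir during a resync to see whether the temp RDB file appears and then vanishes
kubectl exec -it redis-cluster-3 -- sh -c 'while true; do ls -l /data; sleep 1; done'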

Comment From: Jacksonary

Did you ever resolve this? I have the same problem as you. Thanks in advance.

Comment From: nmvk

I tried the same setup with EKS and EC2 and did not see any issues. I have posted the log below from the original replica, which became master a few times during my testing (manual failover + promotion upon master termination).

1:C 04 Jun 2021 23:33:26.683 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 04 Jun 2021 23:33:26.683 # Redis version=5.0.1, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 04 Jun 2021 23:33:26.683 # Configuration loaded
sed: /data/nodes.conf: No such file or directory
1:M 04 Jun 2021 23:33:26.685 * No cluster configuration found, I'm 0bf09dfc1b7c967ce22bcc8d2323d487560eb5fc
1:M 04 Jun 2021 23:33:26.690 * Running mode=cluster, port=6379.
1:M 04 Jun 2021 23:33:26.690 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 04 Jun 2021 23:33:26.690 # Server initialized
1:M 04 Jun 2021 23:33:26.690 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 04 Jun 2021 23:33:26.690 * Ready to accept connections
1:M 04 Jun 2021 23:40:36.439 # configEpoch set to 4 via CLUSTER SET-CONFIG-EPOCH
1:M 04 Jun 2021 23:40:36.507 # IP address for this node updated to 172.31.25.177
1:M 04 Jun 2021 23:40:41.483 # Cluster state changed: ok
1:S 04 Jun 2021 23:40:42.460 * Before turning into a replica, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1:S 04 Jun 2021 23:40:42.785 * Connecting to MASTER 172.31.27.24:6379
1:S 04 Jun 2021 23:40:42.785 * MASTER <-> REPLICA sync started
1:S 04 Jun 2021 23:40:42.785 * Non blocking connect for SYNC fired the event.
1:S 04 Jun 2021 23:40:42.785 * Master replied to PING, replication can continue...
1:S 04 Jun 2021 23:40:42.785 * Trying a partial resynchronization (request c6b7cd48725d1659bc535950f011af309d3c3cac:1).
1:S 04 Jun 2021 23:40:42.786 * Full resync from master: afa3a4205e03c3859f4b38bb9c302fb82dcc53d2:0
1:S 04 Jun 2021 23:40:42.786 * Discarding previously cached master state.
1:S 04 Jun 2021 23:40:42.818 * MASTER <-> REPLICA sync: receiving 175 bytes from master
1:S 04 Jun 2021 23:40:42.818 * MASTER <-> REPLICA sync: Flushing old data
1:S 04 Jun 2021 23:40:42.818 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 04 Jun 2021 23:40:42.818 * MASTER <-> REPLICA sync: Finished with success
1:S 04 Jun 2021 23:40:42.819 * Background append only file rewriting started by pid 10
1:S 04 Jun 2021 23:40:42.844 * AOF rewrite child asks to stop sending diffs.
10:C 04 Jun 2021 23:40:42.844 * Parent agreed to stop sending diffs. Finalizing AOF...
10:C 04 Jun 2021 23:40:42.844 * Concatenating 0.00 MB of AOF diff received from parent.
10:C 04 Jun 2021 23:40:42.844 * SYNC append only file rewrite performed
10:C 04 Jun 2021 23:40:42.844 * AOF rewrite: 0 MB of memory used by copy-on-write
1:S 04 Jun 2021 23:40:42.885 * Background AOF rewrite terminated with success
1:S 04 Jun 2021 23:40:42.885 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 04 Jun 2021 23:40:42.885 * Background AOF rewrite finished successfully
1:S 04 Jun 2021 23:47:00.965 # Taking over the master (user request).
1:S 04 Jun 2021 23:47:00.965 # New configEpoch set to 7
1:M 04 Jun 2021 23:47:00.965 # Setting secondary replication ID to afa3a4205e03c3859f4b38bb9c302fb82dcc53d2, valid up to offset: 533. New replication ID is 1d8349d71a733f7f32ac3f8ad2a2ffb986cbb952
1:M 04 Jun 2021 23:47:00.965 # Connection with master lost.
1:M 04 Jun 2021 23:47:00.965 * Caching the disconnected master state.
1:M 04 Jun 2021 23:47:00.965 * Discarding previously cached master state.
1:M 04 Jun 2021 23:47:01.261 * Replica 172.31.27.24:6379 asks for synchronization
1:M 04 Jun 2021 23:47:01.262 * Partial resynchronization request from 172.31.27.24:6379 accepted. Sending 0 bytes of backlog starting from offset 533.
1:M 04 Jun 2021 23:51:59.555 # Connection with replica 172.31.27.24:6379 lost.
1:M 04 Jun 2021 23:51:59.563 # Configuration change detected. Reconfiguring myself as a replica of 40e61be53b97eda863883dea9139bfd865c9b2bd
1:S 04 Jun 2021 23:51:59.563 * Before turning into a replica, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1:S 04 Jun 2021 23:52:00.509 * Connecting to MASTER 172.31.27.24:6379
1:S 04 Jun 2021 23:52:00.510 * MASTER <-> REPLICA sync started
1:S 04 Jun 2021 23:52:00.510 * Non blocking connect for SYNC fired the event.
1:S 04 Jun 2021 23:52:00.510 * Master replied to PING, replication can continue...
1:S 04 Jun 2021 23:52:00.510 * Trying a partial resynchronization (request 1d8349d71a733f7f32ac3f8ad2a2ffb986cbb952:1084).
1:S 04 Jun 2021 23:52:00.511 * Successful partial resynchronization with master.
1:S 04 Jun 2021 23:52:00.511 # Master replication ID changed to d48b708f68e3de12c0430c7b28971f29c19f7b0f
1:S 04 Jun 2021 23:52:00.511 * MASTER <-> REPLICA sync: Master accepted a Partial Resynchronization.
1:S 04 Jun 2021 23:58:49.523 # Connection with master lost.
1:S 04 Jun 2021 23:58:49.523 * Caching the disconnected master state.
1:S 04 Jun 2021 23:58:49.547 * Connecting to MASTER 172.31.27.24:6379
1:S 04 Jun 2021 23:58:49.547 * MASTER <-> REPLICA sync started
1:S 04 Jun 2021 23:58:49.547 # Error condition on socket for SYNC: Connection refused
1:S 04 Jun 2021 23:58:50.550 * Connecting to MASTER 172.31.27.24:6379
1:S 04 Jun 2021 23:58:50.551 * MASTER <-> REPLICA sync started
1:S 04 Jun 2021 23:59:04.940 * FAIL message received from 8aac2ee5f748b1663288bc6e0e3afca6cb102796 about 40e61be53b97eda863883dea9139bfd865c9b2bd
1:S 04 Jun 2021 23:59:04.984 # Start of election delayed for 943 milliseconds (rank #0, offset 1716).
1:S 04 Jun 2021 23:59:05.986 # Starting a failover election for epoch 9.
1:S 04 Jun 2021 23:59:05.993 # Failover election won: I'm the new master.
1:S 04 Jun 2021 23:59:05.993 # configEpoch set to 9 after successful failover
1:M 04 Jun 2021 23:59:05.993 # Setting secondary replication ID to d48b708f68e3de12c0430c7b28971f29c19f7b0f, valid up to offset: 1717. New replication ID is be0e11eee0909bfb4ac7e73062131adaf2e4e63d
1:M 04 Jun 2021 23:59:05.993 * Discarding previously cached master state.
1:M 04 Jun 2021 23:59:19.842 # Address updated for node 40e61be53b97eda863883dea9139bfd865c9b2bd, now 172.31.27.222:6379
1:M 04 Jun 2021 23:59:19.941 * Clear FAIL state for node 40e61be53b97eda863883dea9139bfd865c9b2bd: master without slots is reachable again.
1:M 04 Jun 2021 23:59:20.844 * Replica 172.31.27.222:6379 asks for synchronization
1:M 04 Jun 2021 23:59:20.844 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '7d7bb962fc4adf2248a4fc596eae4240ea53c2a1', my replication IDs are 'be0e11eee0909bfb4ac7e73062131adaf2e4e63d' and 'd48b708f68e3de12c0430c7b28971f29c19f7b0f')
1:M 04 Jun 2021 23:59:20.844 * Starting BGSAVE for SYNC with target: disk
1:M 04 Jun 2021 23:59:20.844 * Background saving started by pid 67
67:C 04 Jun 2021 23:59:20.847 * DB saved on disk
67:C 04 Jun 2021 23:59:20.848 * RDB: 0 MB of memory used by copy-on-write
1:M 04 Jun 2021 23:59:20.944 * Background saving terminated with success
1:M 04 Jun 2021 23:59:20.944 * Synchronization with replica 172.31.27.222:6379 succeeded
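
The "Taking over the master (user request)" entry above is what the manual failover looks like from the replica side; I triggered it with something like this (pod name is illustrative):

kubectl exec -it redis-cluster-3 -- redis-cli cluster failover takeover

The later FAIL message / election / "Failover election won" sequence is the automatic promotion after the master pod was terminated.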

Comment From: madolson

@Jacksonary Do you have any more information here? Otherwise I'm going to close this as unable to reproduce.

Comment From: oranagra

Maybe for some reason the filesystem permissions are set in a way that allows Redis to write to that file but not read from it? Please respond / re-open if you have more information on the configuration or details on what happened.

p.s. repl-diskless-load can bypass that issue.
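
As a sketch of the permission check and the workaround (pod name is illustrative; note that repl-diskless-load only exists in Redis 6.0 and later, so it would mean upgrading from 5.0.1):

# compare the uid Redis runs as with the ownership/permissions of the data dir
kubectl exec -it redis-cluster-3 -- sh -c 'id; ls -ln /data'

# on Redis >= 6.0: load the sync payload straight from the socket, skipping the on-disk RDB
kubectl exec -it redis-cluster-3 -- redis-cli config set repl-diskless-load on-empty-db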

Comment From: Jacksonary

@madolson My log is just the same as the one in this issue; I have no other info.