Describe the bug
If a sentinel restarts (e.g. the server reboots or the sentinel container restarts), the number of sentinels reported by info sentinel is wrong (it increases by 1). As a result, a sentinel-based setup of 3 sentinels survives only a single failover. After a failover we have to manually reset the sentinels (or sometimes even restart the entire redis cluster to fix this, which leads to additional downtime). If we do not do this, the sentinels eventually fail to form a quorum and subsequent failovers become impossible. Obviously we never know that a failover has already occurred and that we need to take manual action until we get a production downtime on the second failover. Because of this issue, which has persisted across many releases and has been discussed in many issues here, sentinel-based failover is not reliable enough for production.
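For reference, the manual reset mentioned above is done per sentinel with the SENTINEL RESET command (the pattern matches master names; the reply is the number of masters that were reset):

127.0.0.1:26379> SENTINEL RESET *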

To reproduce
Note: we connect multiple redis instances to the same sentinels, so there are many masters and a corresponding slave for each master. Let's call them master-slave pairs.
1. Start a sentinel-based setup with 3 sentinels on redis 6.2.6.
2. Stop sentinel number 1 and the redis masters (imitating a node failure). Wait for a failover to happen. Everything is OK at this step.
3. Restart sentinel number 1 and the redis masters. info sentinel on some of the other sentinels now reports 4 sentinels.
4. Wait some time so that the next steps are not affected by the sentinel failover timeout.
5. Stop sentinel number 2 and the redis slaves. Wait for a failover to happen. The failover will fail on some of the redis pairs with the error below. The probability of the issue is approximately 0.25 for us.
6. Repeat steps 2-5 if needed.
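The phantom fourth entry observed in step 3 can be inspected on an affected sentinel with SENTINEL SENTINELS, which lists the other sentinels known for a given master; in a healthy 3-sentinel setup it returns 2 entries (illustrative master name, matching the output further below):

127.0.0.1:26379> SENTINEL SENTINELS pair1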

Expected behavior
Failover succeeds and the redis cluster keeps serving requests.

Actual behaviour
The redis cluster is no longer serving requests and needs a restart. info sentinel reports something similar to this:

master0:name=pair1,status=odown,address=1.1.1.1:12345,slaves=1,sentinels=5

Additional information
The issue is reproducible on the latest docker image, 6.2.6.

Comment From: enjoy-binbin

Thanks for the report. Let me try to reproduce it tomorrow.

Comment From: RomanKisilenko

I updated the reproduction steps to match exactly what we are doing to reproduce the issue. I'm not sure whether the issue is related specifically to the wrong sentinel count reported by the sentinels, because the issue reproduces even if we reset the sentinels between steps 3 and 5 above. However, restarting the redis cluster resolves the failure.

Comment From: bashanyy

Thanks

Comment From: enjoy-binbin

I can't reproduce the sentinels++ issue (I am not using docker). Did you happen to change the sentinel ID (or the sentinel.conf), or is the sentinel conf on a docker volume?

# something like this in sentinel.conf
# Generated by CONFIG REWRITE
sentinel myid c283bea66ecbb8a0f6201c71cda17cb40df4e22b

If I remember correctly, sentinels++ only happens when a new sentinel joins, i.e. a new sentinel myid is seen. So I assume that when you restart the sentinel, its sentinel ID changes (a fresh sentinel.conf is used).
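One hedged way to check this is to compare the myid line in each sentinel's config file before and after a container restart (the path is illustrative); it should print the same ID every time:

grep "sentinel myid" /data/sentinel1/sentinel.conf
# expected output, stable across restarts:
# sentinel myid c283bea66ecbb8a0f6201c71cda17cb40df4e22b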

Comment From: RomanKisilenko

OK, the wrong count issue is fixed by making the sentinel configuration persistent. Thank you for the heads up.

However, I still get an odown status for a redis pair in sentinel after following the above steps:
master0:name=pair1,status=odown,address=1.1.1.1:12345,slaves=1,sentinels=5

I also noticed that when this happens, the restarted sentinel often disagrees with the other sentinels on which redis instance should be the master.

Restarting the sentinel seems to work around this issue.

Any ideas? We have learned how to reproduce it on staging, so we can collect logs for you or whatever else is required to get it fixed.

Comment From: enjoy-binbin

OK, the wrong count issue is fixed by making sentinel configuration persistent

after this part, will you still get unexpected results?

After a failover, a replica becomes the new master and the sentinel rewrites the sentinel conf.

sentinel monitor mymaster 127.0.0.1 6380 2

In this line, the master IP address and port will change, so different sentinels may see mymaster differently?

Comment From: RomanKisilenko

after this part, will you still get unexpected results?

Yes.

All sentinels are configured with the identical master. I mean that the sentinels often disagree on whether there should be a failover to the slave or not: sentinel number 1 may elect the original master, while sentinel number 2 elects the original slave to become the master. This disagreement usually precedes the subsequent odown issue, which in turn crashes the application.

If you suggest what data I should collect, I can reproduce the issue and provide you with the required information.

Comment From: enjoy-binbin

Can you post some logs, e.g. each sentinel's server log (stdout / log output)? Something like this may help:

16085:X 26 Dec 2021 12:25:16.521 # +sdown master mymaster 127.0.0.1 6379
16085:X 26 Dec 2021 12:25:16.579 # +odown master mymaster 127.0.0.1 6379 #quorum 2/2
16085:X 26 Dec 2021 12:25:16.580 # +new-epoch 1
16085:X 26 Dec 2021 12:25:16.580 # +try-failover master mymaster 127.0.0.1 6379
16085:X 26 Dec 2021 12:25:16.608 # +vote-for-leader 554e08222894e7c56f3f7c9c61394f4f86dde209 1
16085:X 26 Dec 2021 12:25:16.625 # e2af173ed82280bdc6688bab94a26d517379245a voted for 554e08222894e7c56f3f7c9c61394f4f86dde209 1
16085:X 26 Dec 2021 12:25:16.688 # +elected-leader master mymaster 127.0.0.1 6379
16085:X 26 Dec 2021 12:25:16.688 # +failover-state-select-slave master mymaster 127.0.0.1 6379
16085:X 26 Dec 2021 12:25:16.743 # +selected-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
16085:X 26 Dec 2021 12:25:16.743 * +failover-state-send-slaveof-noone slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
16085:X 26 Dec 2021 12:25:16.818 * +failover-state-wait-promotion slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
16085:X 26 Dec 2021 12:25:17.622 # +promoted-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
16085:X 26 Dec 2021 12:25:17.623 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379
16085:X 26 Dec 2021 12:25:17.688 # +failover-end master mymaster 127.0.0.1 6379
16085:X 26 Dec 2021 12:25:17.689 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380

Comment From: RomanKisilenko

OK, so here is what happened on my last attempt to reproduce this issue.

First, I checked that the sentinels are OK in the initial state and agree on the failover state.

sentinel #1
master0:name=shard3,status=ok,address=172.20.0.2:30662,slaves=1,sentinels=3
master1:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3
master2:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
master3:name=shard1,status=ok,address=172.20.0.2:30658,slaves=1,sentinels=3
sentinel #2
master0:name=shard3,status=ok,address=172.20.0.2:30662,slaves=1,sentinels=3
master1:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
master2:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard1,status=ok,address=172.20.0.2:30658,slaves=1,sentinels=3
sentinel #3
master0:name=shard3,status=ok,address=172.20.0.2:30662,slaves=1,sentinels=3
master1:name=shard1,status=ok,address=172.20.0.2:30658,slaves=1,sentinels=3
master2:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
master3:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3

Next, I imitated a node reboot: shut down all services (redis masters and sentinel number 1) that are supposed to be running on the same node.

sentinel #2
master0:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
master2:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard1,status=ok,address=172.20.0.2:30659,slaves=1,sentinels=3
sentinel #3
master0:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=shard1,status=ok,address=172.20.0.2:30659,slaves=1,sentinels=3
master2:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
master3:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3

Note that a failover did happen and the remaining sentinels still agree on the failover state, so the application is able to connect to the redis instances.

Now I imitated the node coming back up by concurrently starting the redis masters and sentinel number 1, and waited some time for discovery to happen. Note that I believe there is some kind of race condition involved, so it is important that the redises and sentinels are started concurrently.

sentinel #1
master0:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=shard1,status=ok,address=172.20.0.2:30659,slaves=1,sentinels=3
master2:name=primary,status=odown,address=172.20.0.2:30692,slaves=1,sentinels=3
master3:name=shard2,status=odown,address=172.20.0.2:30660,slaves=1,sentinels=3
sentinel #2
master0:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
master2:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard1,status=ok,address=172.20.0.2:30659,slaves=1,sentinels=3
sentinel #3
master0:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=shard1,status=ok,address=172.20.0.2:30659,slaves=1,sentinels=3
master2:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
master3:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3

Note that sentinel 1 now disagrees with the other sentinels on the failover state. It also reports the status as odown. I guess that if I started the sentinel first, allowed discovery to happen and only then started the masters, there would be no disagreement.

At this point the cluster is one step from failure. If sentinel #2 or sentinel #3 is stopped now, the application is no longer able to connect to the redises. The redises are healthy, but (it seems to me) since the sentinels disagree on the failover state, they fail to report the current master to the application properly (I guess the application needs at least 2 sentinels to agree on which redis instance is the master).
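For context, a client library typically asks a sentinel for the current master with a command like the one below, so when sentinels disagree they can hand out different answers (master name primary is from this setup; the reply is the IP and port of whatever that particular sentinel currently believes is the master):

127.0.0.1:26379> SENTINEL get-master-addr-by-name primary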

Here are the logs from sentinel 1:

26:X 27 Dec 2021 20:49:52.866 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
26:X 27 Dec 2021 20:49:52.866 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=26, just started
26:X 27 Dec 2021 20:49:52.866 # Configuration loaded
26:X 27 Dec 2021 20:49:52.867 * monotonic clock: POSIX clock_gettime
26:X 27 Dec 2021 20:49:52.870 * Running mode=sentinel, port=26379.
26:X 27 Dec 2021 20:49:52.871 # Sentinel ID is 1d9837be48310d84428b1fd268ea740b54726c3b
26:X 27 Dec 2021 20:49:52.871 # +monitor master shard3 172.20.0.2 30662 quorum 1
26:X 27 Dec 2021 20:49:52.871 # +monitor master shard1 172.20.0.2 30658 quorum 1
26:X 27 Dec 2021 20:49:52.871 # +monitor master primary 172.20.0.2 30692 quorum 1
26:X 27 Dec 2021 20:49:52.871 # +monitor master shard2 172.20.0.2 30660 quorum 1
26:X 27 Dec 2021 20:49:54.649 # +new-epoch 108
26:X 27 Dec 2021 20:49:54.649 # +config-update-from sentinel 5febcad9eba3491f8ce04365bd544d7c4a770c2a 172.20.0.2 30665 @ shard1 172.20.0.2 30658
26:X 27 Dec 2021 20:49:54.649 # +switch-master shard1 172.20.0.2 30658 172.20.0.2 30659
26:X 27 Dec 2021 20:49:54.649 * +slave slave 172.20.0.2:30658 172.20.0.2 30658 @ shard1 172.20.0.2 30659
26:X 27 Dec 2021 20:49:54.787 # +config-update-from sentinel 5febcad9eba3491f8ce04365bd544d7c4a770c2a 172.20.0.2 30665 @ shard3 172.20.0.2 30662
26:X 27 Dec 2021 20:49:54.787 # +switch-master shard3 172.20.0.2 30662 172.20.0.2 30663
26:X 27 Dec 2021 20:49:54.788 * +slave slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
26:X 27 Dec 2021 20:49:57.895 # +sdown master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:49:57.895 # +odown master primary 172.20.0.2 30692 #quorum 1/1
26:X 27 Dec 2021 20:49:57.895 # +new-epoch 109
26:X 27 Dec 2021 20:49:57.895 # +try-failover master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:49:57.899 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 109
26:X 27 Dec 2021 20:49:57.899 # +sdown slave 172.20.0.2:30692 172.20.0.2 30692 @ primary 172.20.0.2 30692
26:X 27 Dec 2021 20:49:58.169 # -sdown master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:49:58.169 # -odown master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:49:58.170 # -sdown slave 172.20.0.2:30692 172.20.0.2 30692 @ primary 172.20.0.2 30692
26:X 27 Dec 2021 20:50:05.956 * +convert-to-slave slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
26:X 27 Dec 2021 20:50:07.695 * +convert-to-slave slave 172.20.0.2:30658 172.20.0.2 30658 @ shard1 172.20.0.2 30659
26:X 27 Dec 2021 20:50:08.901 # -failover-abort-not-elected master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:50:21.051 # +sdown master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:50:21.051 # +odown master shard2 172.20.0.2 30660 #quorum 1/1
26:X 27 Dec 2021 20:50:21.051 # +new-epoch 110
26:X 27 Dec 2021 20:50:21.051 # +try-failover master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:50:21.055 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 110
26:X 27 Dec 2021 20:50:23.106 # +sdown master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:50:23.106 # +odown master primary 172.20.0.2 30692 #quorum 1/1
26:X 27 Dec 2021 20:50:23.106 # Next failover delay: I will not start a failover before Mon Dec 27 20:51:58 2021
26:X 27 Dec 2021 20:50:31.140 # -failover-abort-not-elected master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:50:31.216 # Next failover delay: I will not start a failover before Mon Dec 27 20:52:21 2021
26:X 27 Dec 2021 20:51:58.885 # +new-epoch 111
26:X 27 Dec 2021 20:51:58.885 # +try-failover master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:51:58.903 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 111
26:X 27 Dec 2021 20:52:09.480 # -failover-abort-not-elected master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:52:09.552 # Next failover delay: I will not start a failover before Mon Dec 27 20:53:59 2021
26:X 27 Dec 2021 20:52:21.203 # +new-epoch 112
26:X 27 Dec 2021 20:52:21.203 # +try-failover master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:52:21.207 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 112
26:X 27 Dec 2021 20:52:31.830 # -failover-abort-not-elected master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:52:31.882 # Next failover delay: I will not start a failover before Mon Dec 27 20:54:21 2021
26:X 27 Dec 2021 20:53:59.498 # +new-epoch 113
26:X 27 Dec 2021 20:53:59.498 # +try-failover master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:53:59.501 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 113
26:X 27 Dec 2021 20:54:09.674 # -failover-abort-not-elected master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:54:09.740 # Next failover delay: I will not start a failover before Mon Dec 27 20:55:59 2021
26:X 27 Dec 2021 20:54:21.826 # +new-epoch 114
26:X 27 Dec 2021 20:54:21.827 # +try-failover master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:54:21.830 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 114
26:X 27 Dec 2021 20:54:32.460 # -failover-abort-not-elected master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:54:32.519 # Next failover delay: I will not start a failover before Mon Dec 27 20:56:22 2021
26:X 27 Dec 2021 20:55:59.703 # +new-epoch 115
26:X 27 Dec 2021 20:55:59.703 # +try-failover master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:55:59.707 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 115
26:X 27 Dec 2021 20:56:09.921 # -failover-abort-not-elected master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:56:09.979 # Next failover delay: I will not start a failover before Mon Dec 27 20:57:59 2021
26:X 27 Dec 2021 20:56:22.451 # +new-epoch 116
26:X 27 Dec 2021 20:56:22.451 # +try-failover master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:56:22.455 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 116
26:X 27 Dec 2021 20:56:32.634 # -failover-abort-not-elected master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:56:32.734 # Next failover delay: I will not start a failover before Mon Dec 27 20:58:22 2021
26:X 27 Dec 2021 20:58:00.001 # +new-epoch 117
26:X 27 Dec 2021 20:58:00.001 # +try-failover master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:58:00.005 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 117
26:X 27 Dec 2021 20:58:11.041 # -failover-abort-not-elected master primary 172.20.0.2 30692
26:X 27 Dec 2021 20:58:11.119 # Next failover delay: I will not start a failover before Mon Dec 27 21:00:00 2021
26:X 27 Dec 2021 20:58:22.613 # +new-epoch 118
26:X 27 Dec 2021 20:58:22.613 # +try-failover master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:58:22.617 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 118
26:X 27 Dec 2021 20:58:33.057 # -failover-abort-not-elected master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:58:33.141 # Next failover delay: I will not start a failover before Mon Dec 27 21:00:23 2021
26:X 27 Dec 2021 21:00:00.992 # +new-epoch 119
26:X 27 Dec 2021 21:00:00.992 # +try-failover master primary 172.20.0.2 30692
26:X 27 Dec 2021 21:00:00.995 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 119
26:X 27 Dec 2021 21:00:11.501 # -failover-abort-not-elected master primary 172.20.0.2 30692
26:X 27 Dec 2021 21:00:11.585 # Next failover delay: I will not start a failover before Mon Dec 27 21:02:01 2021
26:X 27 Dec 2021 21:00:23.094 # +new-epoch 120
26:X 27 Dec 2021 21:00:23.094 # +try-failover master shard2 172.20.0.2 30660
26:X 27 Dec 2021 21:00:23.133 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 120
26:X 27 Dec 2021 21:00:33.984 # -failover-abort-not-elected master shard2 172.20.0.2 30660
26:X 27 Dec 2021 21:00:34.056 # Next failover delay: I will not start a failover before Mon Dec 27 21:02:23 2021
26:X 27 Dec 2021 21:02:01.521 # +new-epoch 121
26:X 27 Dec 2021 21:02:01.521 # +try-failover master primary 172.20.0.2 30692
26:X 27 Dec 2021 21:02:01.524 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 121
26:X 27 Dec 2021 21:02:12.223 # -failover-abort-not-elected master primary 172.20.0.2 30692
26:X 27 Dec 2021 21:02:12.323 # Next failover delay: I will not start a failover before Mon Dec 27 21:04:02 2021
26:X 27 Dec 2021 21:02:24.018 # +new-epoch 122
26:X 27 Dec 2021 21:02:24.018 # +try-failover master shard2 172.20.0.2 30660
26:X 27 Dec 2021 21:02:24.022 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 122
26:X 27 Dec 2021 21:02:34.154 # -failover-abort-not-elected master shard2 172.20.0.2 30660
26:X 27 Dec 2021 21:02:34.212 # Next failover delay: I will not start a failover before Mon Dec 27 21:04:24 2021

Please let me know if any further information is required.

Comment From: RomanKisilenko

Hello, are there any insights available on this issue? Let me know if I can help any further.

Comment From: enjoy-binbin

Sorry, I have been busy with other things; I will take a look again this weekend. Also pinging @hwware: if you have some time, you can also share some thoughts :)

Comment From: enjoy-binbin

This part of the log looks a bit strange, and there are many such blocks in the log.
- +new-epoch means that sentinel1 received another sentinel's HELLO message with a newer epoch
- +vote-for-leader: note that sentinel 1 voted for itself, but the following logs look a bit strange, so I suppose we didn't reach the quorum (sentinels_count / 2 + 1)
- -failover-abort-not-elected: after 10s the failover was aborted; sentinel1 is not the leader and is not able to continue the failover

26:X 27 Dec 2021 20:52:21.203 # +new-epoch 112
26:X 27 Dec 2021 20:52:21.203 # +try-failover master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:52:21.207 # +vote-for-leader 1d9837be48310d84428b1fd268ea740b54726c3b 112
26:X 27 Dec 2021 20:52:31.830 # -failover-abort-not-elected master shard2 172.20.0.2 30660
26:X 27 Dec 2021 20:52:31.882 # Next failover delay: I will not start a failover before Mon Dec 27 20:54:21 2021

I also noticed that the quorum is 1. I see that the number of sentinels you configured is three, so this should be 2?

26:X 27 Dec 2021 20:49:52.871 # +monitor master shard3 172.20.0.2 30662 quorum 1
26:X 27 Dec 2021 20:49:52.871 # +monitor master shard1 172.20.0.2 30658 quorum 1
26:X 27 Dec 2021 20:49:52.871 # +monitor master primary 172.20.0.2 30692 quorum 1
26:X 27 Dec 2021 20:49:52.871 # +monitor master shard2 172.20.0.2 30660 quorum 1

I will send out what I know first: I suggest you set the quorum to 2 (since you have 3 sentinels) and then take another look. I also suppose we need a log line for this case: sentinel1 voted for itself but apparently did not get enough votes (>=2), so sentinel 1 cannot become the leader, and the code sets leader_winner = NULL without a log.

# sentinel.conf
sentinel monitor mymaster 127.0.0.1 6379 2

# Note that whatever is the ODOWN quorum, a Sentinel will require to
# be elected by the majority of the known Sentinels in order to
# start a failover, so no failover can be performed in minority.

Comment From: RomanKisilenko

I already tried with a quorum value of 2; the situation is pretty much the same - on the second failover the sentinels report sdown/odown status and the app fails to connect to redis. I can collect updated logs (with quorum=2) if that helps.

Comment From: enjoy-binbin

Sure, please collect the logs. And this time maybe we can also get the other sentinels' logs; I am also curious about the output of the other sentinels...

I am no expert on sentinels; I will try to respond with what I know, and in the meantime we can wait for @hwware, he knows sentinels better than I do.

Comment From: RomanKisilenko

OK, so I took another attempt to reproduce this and collected the logs along the way.

step 1: started a clean cluster
step 2: stopped server nr. 1 with the first sentinel and masters and then restarted it
step 3: stopped server nr. 2
step 4: started server nr. 2. Strange thing: sentinel nr. 2 now reporting 2 slaves (should be 1):

sentinel #1
master0:name=shard2,status=ok,address=172.20.0.2:30660,slaves=1,sentinels=3
master1:name=shard1,status=ok,address=172.20.0.2:30658,slaves=1,sentinels=3
master2:name=shard3,status=ok,address=172.20.0.2:30662,slaves=1,sentinels=3
master3:name=primary,status=ok,address=172.20.0.2:30692,slaves=1,sentinels=3
sentinel #2
master0:name=shard1,status=ok,address=172.20.0.2:30658,slaves=2,sentinels=3
master1:name=shard3,status=ok,address=172.20.0.2:30662,slaves=2,sentinels=3
master2:name=primary,status=ok,address=172.20.0.2:30692,slaves=2,sentinels=3
master3:name=shard2,status=ok,address=172.20.0.2:30660,slaves=2,sentinels=3
sentinel #3
master0:name=shard3,status=ok,address=172.20.0.2:30662,slaves=1,sentinels=3
master1:name=shard1,status=ok,address=172.20.0.2:30658,slaves=1,sentinels=3
master2:name=primary,status=ok,address=172.20.0.2:30692,slaves=1,sentinels=3
master3:name=shard2,status=ok,address=172.20.0.2:30660,slaves=1,sentinels=3

step 5: stopped server nr. 1
step 6: started server nr. 1. Strange behaviour: now sentinels disagree on failover state.

sentinel #1
master0:name=shard2,status=ok,address=172.20.0.2:30660,slaves=1,sentinels=3
master1:name=primary,status=ok,address=172.20.0.2:30692,slaves=1,sentinels=3
master2:name=shard1,status=ok,address=172.20.0.2:30658,slaves=1,sentinels=3
master3:name=shard3,status=ok,address=172.20.0.2:30662,slaves=1,sentinels=3
sentinel #2
master0:name=shard1,status=ok,address=172.20.0.2:30659,slaves=1,sentinels=3
master1:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3
master2:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
sentinel #3
master0:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=shard1,status=ok,address=172.20.0.2:30659,slaves=1,sentinels=3
master2:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3

step 7: stopped server nr. 2. The cluster is broken now

sentinel #1
master0:name=shard2,status=sdown,address=172.20.0.2:30660,slaves=1,sentinels=3
master1:name=primary,status=sdown,address=172.20.0.2:30692,slaves=1,sentinels=3
master2:name=shard1,status=sdown,address=172.20.0.2:30658,slaves=1,sentinels=3
master3:name=shard3,status=sdown,address=172.20.0.2:30662,slaves=1,sentinels=3
sentinel #2
Error from server (NotFound): pods "ps-rs2-0" not found
sentinel #3
master0:name=shard3,status=sdown,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=shard1,status=sdown,address=172.20.0.2:30659,slaves=1,sentinels=3
master2:name=primary,status=sdown,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard2,status=sdown,address=172.20.0.2:30661,slaves=1,sentinels=3

step 8: restarted server nr. 2. The cluster is back to healthy but there is still a sentinel problem. Note that the cluster sometimes does not recover at all; this time I was "lucky" that it recovered.

sentinel #1
master0:name=shard2,status=sdown,address=172.20.0.2:30660,slaves=1,sentinels=3
master1:name=primary,status=sdown,address=172.20.0.2:30692,slaves=1,sentinels=3
master2:name=shard1,status=sdown,address=172.20.0.2:30658,slaves=1,sentinels=3
master3:name=shard3,status=sdown,address=172.20.0.2:30662,slaves=1,sentinels=3
sentinel #2
master0:name=shard3,status=ok,address=172.20.0.2:30662,slaves=1,sentinels=3
master1:name=shard2,status=ok,address=172.20.0.2:30660,slaves=1,sentinels=3
master2:name=primary,status=ok,address=172.20.0.2:30692,slaves=1,sentinels=3
master3:name=shard1,status=ok,address=172.20.0.2:30658,slaves=1,sentinels=3
sentinel #3
master0:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=shard1,status=ok,address=172.20.0.2:30659,slaves=1,sentinels=3
master2:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3

Logs are attached: https://we.tl/t-NVJ2kgoeLb (I collected the logs for both the redis instances and the sentinels).

Comment From: hwware

@enjoy-binbin Thanks, I will take a look soon. @RomanKisilenko

Comment From: hwware

@RomanKisilenko Sorry for replying to you late. I checked most of your logs and found that in your cluster there are only 2 redis instances for each monitored master, right? One master and one replica, because the sentinel logs mostly show: master0:name=shard2,status=sdown,address=172.20.0.2:30660,slaves=1,sentinels=3

Please confirm this. Thanks

Comment From: RomanKisilenko

@hwware Yes, for each of the redises added to sentinels there is just a single master and single slave.

Comment From: hwware

This is my reproduction result (only up to your step 4).

First, create a clean cluster with the following instances: 1 master, 1 replica, 3 sentinels. Following is the config:

master.conf

port 6381
bind 127.0.0.1
protected-mode no

replica1.conf

port 6382
bind 127.0.0.1
protected-mode no
replicaof 127.0.0.1 6381

sentinel1.conf

port 26381
bind 127.0.0.1
sentinel monitor mymaster 127.0.0.1 6381 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000

sentinel2.conf

port 26382
bind 127.0.0.1
sentinel monitor mymaster 127.0.0.1 6381 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000

sentinel3.conf

port 26383
bind 127.0.0.1
sentinel monitor mymaster 127.0.0.1 6381 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000

Following are my reproduction steps:
step 1: started a clean cluster: 1 master, 1 replica, 3 sentinels
step 2: stop the master (127.0.0.1:6381) and the first sentinel. After the failover, restart the old master instance and sentinel 1
step 3: wait some time, stop the new master instance (127.0.0.1:6382) and let the failover happen
step 4: restart the 127.0.0.1:6382 instance.

Following is the detailed log from sentinel 3, because sentinel 3 never stops:

21356:X 10 Jan 2022 11:11:44.977 * +sentinel sentinel db86adec1416aeb6ea2e8cc2dbcc997a44f99d06 127.0.0.1 26381 @ mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:11:51.059 * +sentinel sentinel 04e6daf544338c376fb40ba1c28604471c662c36 127.0.0.1 26383 @ mymaster 127.0.0.1 6381
// Starting to stop sentinel and master 127.0.0.1:6381 (this is not a log line, I added it as a comment)
21356:X 10 Jan 2022 11:16:37.390 # +sdown sentinel db86adec1416aeb6ea2e8cc2dbcc997a44f99d06 127.0.0.1 26381 @ mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:40.603 # +sdown master mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:40.684 # +odown master mymaster 127.0.0.1 6381 #quorum 2/2
21356:X 10 Jan 2022 11:16:40.684 # +new-epoch 1
21356:X 10 Jan 2022 11:16:40.684 # +try-failover master mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:40.687 # +vote-for-leader 0fb07518d8779a5b3ce02d500fa930b5aa7fd2f1 1
21356:X 10 Jan 2022 11:16:40.690 # 04e6daf544338c376fb40ba1c28604471c662c36 voted for 0fb07518d8779a5b3ce02d500fa930b5aa7fd2f1 1
21356:X 10 Jan 2022 11:16:40.745 # +elected-leader master mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:40.745 # +failover-state-select-slave master mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:40.815 # +selected-slave slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:40.815 * +failover-state-send-slaveof-noone slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:40.917 * +failover-state-wait-promotion slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:41.777 # +promoted-slave slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:41.777 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:41.843 # +failover-end master mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:16:41.843 # +switch-master mymaster 127.0.0.1 6381 127.0.0.1 6382
21356:X 10 Jan 2022 11:16:41.843 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6382
21356:X 10 Jan 2022 11:16:46.855 # +sdown slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6382
// Starting to restart sentinel and master 127.0.0.1:6381 (this is not a log line, I added it as a comment)
21356:X 10 Jan 2022 11:18:18.577 # -sdown slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6382
21356:X 10 Jan 2022 11:18:28.540 * +convert-to-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6382
21356:X 10 Jan 2022 11:18:39.322 # -sdown sentinel db86adec1416aeb6ea2e8cc2dbcc997a44f99d06 127.0.0.1 26381 @ mymaster 127.0.0.1 6382

// start restart 127.0.0.1:6382
21356:X 10 Jan 2022 11:20:36.620 # +sdown master mymaster 127.0.0.1 6382
21356:X 10 Jan 2022 11:20:36.680 # +new-epoch 2
21356:X 10 Jan 2022 11:20:36.681 # +vote-for-leader 04e6daf544338c376fb40ba1c28604471c662c36 2
21356:X 10 Jan 2022 11:20:36.683 # +odown master mymaster 127.0.0.1 6382 #quorum 3/2
21356:X 10 Jan 2022 11:20:36.683 # Next failover delay: I will not start a failover before Mon Jan 10 11:22:37 2022
21356:X 10 Jan 2022 11:20:37.787 # +config-update-from sentinel 04e6daf544338c376fb40ba1c28604471c662c36 127.0.0.1 26383 @ mymaster 127.0.0.1 6382
21356:X 10 Jan 2022 11:20:37.787 # +switch-master mymaster 127.0.0.1 6382 127.0.0.1 6381
21356:X 10 Jan 2022 11:20:37.787 * +slave slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:20:42.831 # +sdown slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:21:39.482 # -sdown slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381
21356:X 10 Jan 2022 11:21:49.427 * +convert-to-slave slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381

Below is the detailed information from the 3 sentinel clients after all these steps:

127.0.0.1:26381> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=127.0.0.1:6381,slaves=1,sentinels=3
127.0.0.1:26381>

127.0.0.1:26382> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=127.0.0.1:6381,slaves=1,sentinels=3
127.0.0.1:26382>

127.0.0.1:26383> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=127.0.0.1:6381,slaves=1,sentinels=3
127.0.0.1:26383>

I ran them on version 6.2.6. I can not find a sentinel that displays 2 slaves.
Please check my reproduction instance configs and reproduction steps, and let me know of any concern. Thanks a lot.

Comment From: RomanKisilenko

The difference might be related to the fact that in my environment redis and sentinel are started concurrently, and also to the fact that I'm running them inside docker containers.

Comment From: hwware

I am not sure what effect the docker containers have, but I suggest you start up the redis and sentinel instances one by one and give them some time to exchange information. Thanks. If the issue can still be reproduced on your side, just let me know. Do not worry.

Comment From: moticless

Hi @RomanKisilenko, I didn't read the thread entirely (I hope I will have more time tomorrow), but I suspect that your configuration files in the containers are not persisted across restarts. That is, an instance fails to read the identity that was written into its configuration file prior to the restart.

Here are a few reasons why it might happen with containers:
1. The configuration file is written to a non-persistent volume. On container restart, the instance will allocate a new ID.
2. The configuration file is written into a persistent volume that is shared among containers, such that on restart multiple containers rely on the same configuration file. (For example, scale=3 sentinels that all unfortunately read and write the very same file.)

In other words, to the best of my understanding (I will verify it tomorrow...), once an ID is allocated to an instance and written to its configuration file, it must persist across restarts.
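For example (a hedged sketch; paths, names and image tag are illustrative), each sentinel container could mount its own writable volume that holds its config file:

docker run -v /data/sentinel1:/etc/sentinel redis:6.2.6 \
    redis-sentinel /etc/sentinel/sentinel.conf
# /data/sentinel1 must be unique per sentinel and must survive container
# restarts, so the "sentinel myid ..." line written by CONFIG REWRITE is kept.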

Comment From: RomanKisilenko

@moticless The sentinels store their configuration files on persistent storage which is independent for each of the sentinels, and the sentinels have permission to write to their config files. The same is true for the redis instances, but they can not write to their config files, since I did not expect that redis instances would need to modify the config. Should I let the redis instances write to their configs as well (I do not see any errors in the redis logs)? I can try to let them write to the config a bit later today, to see if it makes any difference.

Comment From: moticless

Please do. And verify that once the instances are running, the configuration file is appended with details of the form:

 # Generated by CONFIG REWRITE
 dir "/root"
 user default on nopass ~* &* +@all
 sentinel myid 0f67d4e5b64711e977ba64942bd3cfefc90e2ef3
 sentinel config-epoch mymaster 0
 sentinel leader-epoch mymaster 0
 sentinel current-epoch 0
 sentinel known-replica mymaster 172.20.0.12 6379
 sentinel known-replica mymaster 172.20.0.11 6379

Please verify also that each instance has its own copy to rewrite. For example sentinel number 3 will always uniquely rewrite sentinel3.conf, etc.

Comment From: RomanKisilenko

OK, thank you for your input.

So I made sure the configs persist across sentinel and redis restarts, that they are stored on different disks, and that both redis and sentinel can now write to their configs. Then I executed 3 tests as described below.

Test 1. Stopping and restarting sentinels one by one. The cluster seems to be working in a stable manner.
Test 2. Stopping and restarting redis replicas one by one, first masters, then slaves. The cluster seems to be working in a stable manner, failover works as expected.
Test 3. Imitating server failures (stopping and restarting redises together with sentinels). A few iterations went just fine and I even thought that the issue was resolved. But after 4 complete iterations, suddenly:

sentinel #1
master0:name=shard2,status=sdown,address=172.20.0.2:30660,slaves=1,sentinels=3
master1:name=shard3,status=sdown,address=172.20.0.2:30662,slaves=1,sentinels=3
master2:name=primary,status=sdown,address=172.20.0.2:30692,slaves=1,sentinels=3
master3:name=shard1,status=sdown,address=172.20.0.2:30658,slaves=1,sentinels=3
sentinel #2
master0:name=shard2,status=ok,address=172.20.0.2:30660,slaves=1,sentinels=3
master1:name=shard1,status=ok,address=172.20.0.2:30658,slaves=1,sentinels=3
master2:name=shard3,status=ok,address=172.20.0.2:30662,slaves=1,sentinels=3
master3:name=primary,status=ok,address=172.20.0.2:30692,slaves=1,sentinels=3
sentinel #3
master0:name=shard2,status=sdown,address=172.20.0.2:30661,slaves=1,sentinels=3
master1:name=shard1,status=sdown,address=172.20.0.2:30659,slaves=1,sentinels=3
master2:name=primary,status=sdown,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard3,status=sdown,address=172.20.0.2:30663,slaves=1,sentinels=3

And the cluster did not return to a working state, even after all services were up.

I think making the redis config writable has fixed the issue where the number of slaves was reported as 2. But the cluster failure issue is still there.

I'm not sure whether Test 2 indeed passes or I just did not try enough times for the issue to happen.

Comment From: RomanKisilenko

Update: while I was writing the above message, the cluster did recover into a stable state. It just took longer, probably a few minutes on this iteration.

I then imitated a few more failures and hit this issue again:

sentinel #1
master0:name=shard3,status=sdown,address=172.20.0.2:30662,slaves=1,sentinels=3
master1:name=primary,status=sdown,address=172.20.0.2:30692,slaves=1,sentinels=3
master2:name=shard2,status=sdown,address=172.20.0.2:30660,slaves=1,sentinels=3
master3:name=shard1,status=sdown,address=172.20.0.2:30658,slaves=1,sentinels=3
sentinel #2
Error from server (NotFound): pods "ps-rs2-0" not found
sentinel #3
master0:name=shard2,status=sdown,address=172.20.0.2:30661,slaves=1,sentinels=3
master1:name=shard1,status=sdown,address=172.20.0.2:30659,slaves=1,sentinels=3
master2:name=primary,status=sdown,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard3,status=sdown,address=172.20.0.2:30663,slaves=1,sentinels=3

Then I waited 5 minutes and the cluster did not recover. (I checked later that on this failure, even after waiting 15 minutes, the cluster had still not recovered.)

Then I restarted all the stopped redises and sentinels and waited 10 minutes. The cluster recovered partially:

sentinel #1
master0:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3
master2:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
master3:name=shard1,status=odown,address=172.20.0.2:30658,slaves=1,sentinels=3
sentinel #2
master0:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
master2:name=shard1,status=odown,address=172.20.0.2:30658,slaves=1,sentinels=3
master3:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3
sentinel #3
master0:name=shard2,status=ok,address=172.20.0.2:30661,slaves=1,sentinels=3
master1:name=shard1,status=sdown,address=172.20.0.2:30659,slaves=1,sentinels=3
master2:name=primary,status=ok,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard3,status=ok,address=172.20.0.2:30663,slaves=1,sentinels=3

After waiting a few more minutes, the cluster recovered fully.

Comment From: moticless

@RomanKisilenko Please supply the configuration of all your components. I would like to review it and maybe try to reproduce it.

Comment From: RomanKisilenko

These are the starting redis/sentinel configurations:

Master redis:

requirepass pass
masterauth pass

slave-announce-ip 172.20.0.2
slave-announce-port 30692

maxclients 100000
timeout 300

Slave redis:

requirepass pass
masterauth pass

slaveof 172.20.0.2 30692

slave-announce-ip 172.20.0.2
slave-announce-port 30693

maxclients 100000
timeout 300

Sentinel 1:

requirepass pass
sentinel monitor primary 172.20.0.2 30692 2

sentinel down-after-milliseconds primary 5000
sentinel failover-timeout primary 60000
sentinel parallel-syncs primary 1

sentinel auth-pass primary pass
sentinel announce-ip 172.20.0.2
sentinel announce-port 30664

maxclients 100000
timeout 300

Sentinel 2:

requirepass pass
sentinel monitor primary 172.20.0.2 30692 2

sentinel down-after-milliseconds primary 5000
sentinel failover-timeout primary 60000
sentinel parallel-syncs primary 1

sentinel auth-pass primary pass
sentinel announce-ip 172.20.0.2
sentinel announce-port 30665

maxclients 100000
timeout 300

Sentinel 3:

requirepass pass
sentinel monitor primary 172.20.0.2 30692 2

sentinel down-after-milliseconds primary 5000
sentinel failover-timeout primary 60000
sentinel parallel-syncs primary 1

sentinel auth-pass primary pass
sentinel announce-ip 172.20.0.2
sentinel announce-port 30666

maxclients 100000
timeout 300

Comment From: moticless

It looks like you are working in a NAT environment, which explains why you are declaring sentinel announce-ip/port. But we also need to configure the replicas similarly:

replica-announce-ip 5.5.5.5
replica-announce-port 1234

Please read here and try it out.

Comment From: RomanKisilenko

I believe replica in the redis config is an alias of slave; however, I went ahead and updated my configs with replica instead of slave and re-ran the tests.

Just as before:
1. Stopping and then restarting (after some time) just the sentinels or just the redis replicas produces a correct failover and does not lead to an issue. I initiated 10 failovers, all went just fine.
2. Stopping and then restarting both redis replicas and a sentinel (the scenario with 3 failure domains for a redis cluster) again leads to an issue just like described above. This time, after 3 successful failovers, it produced the following cluster state:

sentinel #1
master0:name=shard2,status=sdown,address=172.20.0.2:30660,slaves=1,sentinels=3
master1:name=primary,status=sdown,address=172.20.0.2:30692,slaves=1,sentinels=3
master2:name=shard3,status=sdown,address=172.20.0.2:30662,slaves=1,sentinels=3
master3:name=shard1,status=ok,address=172.20.0.2:30658,slaves=1,sentinels=3
sentinel #2
Error from server (NotFound): pods "ps-rs2-0" not found
sentinel #3
master0:name=shard3,status=sdown,address=172.20.0.2:30663,slaves=1,sentinels=3
master1:name=shard1,status=ok,address=172.20.0.2:30658,slaves=1,sentinels=3
master2:name=primary,status=sdown,address=172.20.0.2:30693,slaves=1,sentinels=3
master3:name=shard2,status=sdown,address=172.20.0.2:30661,slaves=1,sentinels=3

And it did not start to recover within 10 minutes.

Taking test 1 above into account, I think that using at least 5 failure domains for deploying the redis cluster (3 for sentinels and 2 for redis replicas) might be an appropriate workaround.

Comment From: moticless

Can it be that your instances change IP on restart? (Sentinel-based hostnames don't support dynamic IPs, unfortunately. Please see pending Issue #9103.)

Comment From: RomanKisilenko

Yes, since this setup is running inside a kubernetes cluster, each instance's (internal, from the 10.x.x.x subnet) IP changes frequently when the service is restarted. However, the announced IP-port pair never changes; the kubernetes cluster routing/services are configured properly for this. Note that I do not use announce-hostnames, I use announce-ip everywhere instead.

Comment From: hwware

@RomanKisilenko How about you try to reproduce your issue in a non-docker environment with the same redis and sentinel configurations? If it still happens in the non-docker environment, it is very likely a redis issue. Otherwise, we can hardly tell whether the issue comes from the redis or the docker side.

Thanks.

Comment From: moticless

@RomanKisilenko, I failed to reproduce this issue with the configuration you supplied. You can find more details here.

Comment From: RomanKisilenko

We can not seem to reproduce this using docker-compose either.

Comment From: hwware

@RomanKisilenko Can we close this issue now? Thanks

Comment From: RomanKisilenko

Please do whatever you are supposed to do. The issue is still actual for me; however, for now I have worked around it by splitting the sentinels to run on separate hosts.

Comment From: yossigo

@hwware I don't think we should close this until we understand the root cause; it could be a configuration issue, but it could also be a bug.

@RomanKisilenko I noticed that when you experience this issue, there's also a discrepancy between sentinels around the port assigned to the master. I suppose the addr/port we see is the one loaded from configuration and represents the correct state for this sentinel before shutting down (and before that master was failed over), but it is not clear what prevents this from being refreshed - a bug or a network issue.

Is this something that can be easily reproduced on a k3s/microk8s/other small scale Kubernetes environment? If it is and you can provide the necessary yaml files I think it can greatly help to make progress with this.

Comment From: RomanKisilenko

Thank you, I'll try to set up a minimalistic example kubernetes cluster which can be used to reproduce the issue. It can take some time before I arrange the time to do it, but I hope I will be able to do it soon.

Comment From: moticless

@RomanKisilenko, I checked today what happens to a Sentinel cluster once latency is added to the network (200 msec with tc). After a few failover attempts I managed to reach 3 consecutive failures of leader election (-failover-abort-not-elected), which resembles the sentinel1 log that you showed above (followed by the repetitive logs Next failover delay: I will not start a failover before...).
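For reference, latency of this kind can be injected with tc netem using a command of roughly this form (the interface name is illustrative):

tc qdisc add dev eth0 root netem delay 200ms
# and removed again with:
tc qdisc del dev eth0 root netem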

Each sentinel waits for a random period of time and attempts the same procedure again in case of a collision, but the randomness is limited to roughly 1 second (after waiting failover-timeout), so there is still a high probability that the collision reoccurs, especially if there is network latency.

Since you are working with heavyweight k8s, whereas the redis ci/cd (or my docker-compose env) runs on a single machine, this might explain why we have failed to reproduce it.

One thing you can try is to have shorter retries:

sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000

Can you please also try to measure the network latency of your system?

Comment From: moticless

@RomanKisilenko, I think I managed to reproduce your issue.

Just like in your scenario, based on the consensus algorithm, all sentinel instances randomize a time at which to initiate their next attempt to become the leader of the group. But time after time, all of them picked the same value.

The problem is with this line: srandom(time(NULL)^getpid()); at startup each instance sets its own seed for randomization - but they all end up with the same seed! It is rather easy for containers that spin up at the same time to get the same value of time() in seconds, and the PID is always 1 in containers.
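A minimal standalone illustration of the collision (a hypothetical test program, not Redis source): containers started within the same second, each with PID 1, compute identical seeds and therefore identical random sequences:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    /* Same expression as the seed in question: in a container getpid() is
     * typically 1 and time(NULL) has one-second resolution, so instances
     * started within the same second all derive the same seed. */
    unsigned int seed = (unsigned int)(time(NULL) ^ getpid());
    srandom(seed);
    printf("seed=%u first random value=%ld\n", seed, random());
    return 0;
}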

Comment From: RomanKisilenko

Thank you, I guess this might be excellent news. And sorry for the delay in responding, I missed your previous message somehow. If it still matters, I think the latency on my system should be extremely low, as the staging kubernetes cluster I used to reproduce the issue runs on a single server and the CPU load is around 10%. If you still want me to set up a test kubernetes cluster to reproduce the issue on, please let me know.

Comment From: moticless

There is no need. I will create a PR on Sunday.

Comment From: moticless

The PR: https://github.com/redis/redis/pull/10197

Comment From: moticless

@RomanKisilenko, on second thought, it would be helpful if you could also try to reproduce with PR https://github.com/redis/redis/pull/10197 to verify you are experiencing the same issue. Thanks.

Comment From: RomanKisilenko

@moticless I guess I'll have to build a docker image with the updated sentinel, because there is none available. Are there any instructions on how I can do this? I can see that 7.0-rc1 was tagged yesterday, so probably I can use it somehow.

Comment From: yossigo

@RomanKisilenko I see that the docker-library repository was updated to support 7.0-rc, I suppose the official Docker image should be up soon.

Comment From: RomanKisilenko

@yossigo I tested on redis 7.0-rc and was able to reproduce the issue again. After several failovers, the sentinels disagree on the failover state and fail to reach agreement even when a majority of sentinels is available (2 of 3).

26:X 07 Feb 2022 20:22:20.077 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
26:X 07 Feb 2022 20:22:20.078 # Redis version=6.9.240, bits=64, commit=00000000, modified=0, pid=26, just started
26:X 07 Feb 2022 20:22:20.078 # Configuration loaded
26:X 07 Feb 2022 20:22:20.078 * monotonic clock: POSIX clock_gettime
26:X 07 Feb 2022 20:22:20.082 * Running mode=sentinel, port=26379.
26:X 07 Feb 2022 20:22:20.083 # Sentinel ID is 88b5d5327f31e14e0d52eafc8527cb18fae7589f
26:X 07 Feb 2022 20:22:20.083 # +monitor master shard1 172.20.0.2 30658 quorum 2
26:X 07 Feb 2022 20:22:20.083 # +monitor master shard3 172.20.0.2 30662 quorum 2
26:X 07 Feb 2022 20:22:20.083 # +monitor master shard2 172.20.0.2 30660 quorum 2
26:X 07 Feb 2022 20:22:20.083 # +monitor master primary 172.20.0.2 30692 quorum 2
26:X 07 Feb 2022 20:22:45.104 # +sdown master shard1 172.20.0.2 30658
26:X 07 Feb 2022 20:22:45.104 # +sdown master shard3 172.20.0.2 30662
26:X 07 Feb 2022 20:22:45.104 # +sdown master shard2 172.20.0.2 30660
26:X 07 Feb 2022 20:22:45.104 # +sdown master primary 172.20.0.2 30692
26:X 07 Feb 2022 20:23:44.021 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard1 172.20.0.2 30658
26:X 07 Feb 2022 20:23:44.021 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard3 172.20.0.2 30662
26:X 07 Feb 2022 20:23:44.021 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard2 172.20.0.2 30660
26:X 07 Feb 2022 20:23:44.021 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:11:28.931 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
22:X 07 Feb 2022 20:11:28.931 # Redis version=6.9.240, bits=64, commit=00000000, modified=0, pid=22, just started
22:X 07 Feb 2022 20:11:28.931 # Configuration loaded
22:X 07 Feb 2022 20:11:28.960 * monotonic clock: POSIX clock_gettime
22:X 07 Feb 2022 20:11:28.965 * Running mode=sentinel, port=26379.
22:X 07 Feb 2022 20:11:28.969 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:28.969 # Sentinel ID is 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74
22:X 07 Feb 2022 20:11:28.969 # +monitor master shard3 172.20.0.2 30662 quorum 2
22:X 07 Feb 2022 20:11:28.969 # +monitor master shard2 172.20.0.2 30660 quorum 2
22:X 07 Feb 2022 20:11:28.969 # +monitor master shard1 172.20.0.2 30658 quorum 2
22:X 07 Feb 2022 20:11:28.969 # +monitor master primary 172.20.0.2 30692 quorum 2
22:X 07 Feb 2022 20:11:28.973 * +slave slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:11:28.975 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:29.025 * +slave slave 172.20.0.2:30663 172.20.0.2 30663 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:11:29.027 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:29.406 * +sentinel sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:11:29.409 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:29.409 * +sentinel sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:11:29.410 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:29.410 * +sentinel sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:11:29.412 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:29.412 * +sentinel sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:11:29.414 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:29.505 * +sentinel sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:11:29.507 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:29.507 * +sentinel sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:11:29.509 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:29.640 * +sentinel sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:11:29.642 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:29.642 * +sentinel sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:11:29.644 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:39.013 * +slave slave 172.20.0.2:30659 172.20.0.2 30659 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:11:39.016 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:11:39.016 * +slave slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:11:39.019 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:11.269 # +sdown master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:12:11.325 # +odown master primary 172.20.0.2 30692 #quorum 2/2
22:X 07 Feb 2022 20:12:11.325 # +new-epoch 1
22:X 07 Feb 2022 20:12:11.325 # +try-failover master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:12:11.330 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:11.330 # +vote-for-leader 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 1
22:X 07 Feb 2022 20:12:11.335 # 33b6962d39b2aba5c849f76cdc83dcda59d06d94 voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 1
22:X 07 Feb 2022 20:12:11.431 # +elected-leader master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:12:11.431 # +failover-state-select-slave master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:12:11.498 # +selected-slave slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:12:11.498 * +failover-state-send-slaveof-noone slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:12:11.561 * +failover-state-wait-promotion slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:12:11.633 # +sdown master shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:12:11.696 # +odown master shard1 172.20.0.2 30658 #quorum 2/2
22:X 07 Feb 2022 20:12:11.696 # +new-epoch 2
22:X 07 Feb 2022 20:12:11.696 # +try-failover master shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:12:11.699 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:11.699 # +vote-for-leader 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 2
22:X 07 Feb 2022 20:12:11.704 # 33b6962d39b2aba5c849f76cdc83dcda59d06d94 voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 2
22:X 07 Feb 2022 20:12:11.790 # +elected-leader master shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:12:11.790 # +failover-state-select-slave master shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:12:11.857 # +selected-slave slave 172.20.0.2:30659 172.20.0.2 30659 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:12:11.857 * +failover-state-send-slaveof-noone slave 172.20.0.2:30659 172.20.0.2 30659 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:12:11.929 * +failover-state-wait-promotion slave 172.20.0.2:30659 172.20.0.2 30659 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:12:12.361 # +sdown master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:12:12.365 # 33b6962d39b2aba5c849f76cdc83dcda59d06d94 voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 2
22:X 07 Feb 2022 20:12:12.427 # +odown master shard2 172.20.0.2 30660 #quorum 2/2
22:X 07 Feb 2022 20:12:12.427 # +new-epoch 3
22:X 07 Feb 2022 20:12:12.427 # +try-failover master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:12:12.430 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:12.430 # +vote-for-leader 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 3
22:X 07 Feb 2022 20:12:12.433 # 33b6962d39b2aba5c849f76cdc83dcda59d06d94 voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 3
22:X 07 Feb 2022 20:12:12.488 # +elected-leader master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:12:12.488 # +failover-state-select-slave master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:12:12.492 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:12.492 # +promoted-slave slave 172.20.0.2:30659 172.20.0.2 30659 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:12:12.492 # +failover-state-reconf-slaves master shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:12:12.495 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:12.495 # +promoted-slave slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:12:12.495 # +failover-state-reconf-slaves master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:12:12.565 # +selected-slave slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:12:12.565 * +failover-state-send-slaveof-noone slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:12:12.565 # +failover-end master shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:12:12.565 # +failover-end master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:12:12.565 # +switch-master primary 172.20.0.2 30692 172.20.0.2 30693
22:X 07 Feb 2022 20:12:12.566 * +slave slave 172.20.0.2:30692 172.20.0.2 30692 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:12:12.568 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:12.645 * +failover-state-wait-promotion slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:12:12.646 # +switch-master shard1 172.20.0.2 30658 172.20.0.2 30659
22:X 07 Feb 2022 20:12:12.646 * +slave slave 172.20.0.2:30658 172.20.0.2 30658 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:12:12.648 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:13.223 # +sdown master shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:12:13.278 # +odown master shard3 172.20.0.2 30662 #quorum 2/2
22:X 07 Feb 2022 20:12:13.278 # +new-epoch 4
22:X 07 Feb 2022 20:12:13.278 # +try-failover master shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:12:13.281 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:13.281 # +vote-for-leader 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 4
22:X 07 Feb 2022 20:12:13.285 # 33b6962d39b2aba5c849f76cdc83dcda59d06d94 voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 4
22:X 07 Feb 2022 20:12:13.353 # +elected-leader master shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:12:13.353 # +failover-state-select-slave master shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:12:13.408 # +selected-slave slave 172.20.0.2:30663 172.20.0.2 30663 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:12:13.408 * +failover-state-send-slaveof-noone slave 172.20.0.2:30663 172.20.0.2 30663 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:12:13.461 * +failover-state-wait-promotion slave 172.20.0.2:30663 172.20.0.2 30663 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:12:13.465 # 33b6962d39b2aba5c849f76cdc83dcda59d06d94 voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 4
22:X 07 Feb 2022 20:12:13.542 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:13.542 # +promoted-slave slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:12:13.542 # +failover-state-reconf-slaves master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:12:13.616 # +failover-end master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:12:13.616 # +switch-master shard2 172.20.0.2 30660 172.20.0.2 30661
22:X 07 Feb 2022 20:12:13.616 * +slave slave 172.20.0.2:30660 172.20.0.2 30660 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:12:13.619 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:14.297 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:14.297 # +promoted-slave slave 172.20.0.2:30663 172.20.0.2 30663 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:12:14.297 # +failover-state-reconf-slaves master shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:12:14.352 # +failover-end master shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:12:14.352 # +switch-master shard3 172.20.0.2 30662 172.20.0.2 30663
22:X 07 Feb 2022 20:12:14.353 * +slave slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:12:14.354 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:12:14.445 # +sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:12:14.446 # +sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:12:14.446 # +sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:12:14.446 # +sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:12:17.573 # +sdown slave 172.20.0.2:30692 172.20.0.2 30692 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:12:17.693 # +sdown slave 172.20.0.2:30658 172.20.0.2 30658 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:12:18.663 # +sdown slave 172.20.0.2:30660 172.20.0.2 30660 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:12:19.357 # +sdown slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:13:12.093 # -sdown slave 172.20.0.2:30658 172.20.0.2 30658 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:13:13.154 # -sdown slave 172.20.0.2:30660 172.20.0.2 30660 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:13:14.103 # -sdown slave 172.20.0.2:30692 172.20.0.2 30692 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:13:17.170 # -sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:13:17.170 # -sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:13:17.170 # -sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:13:17.170 # -sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:13:17.899 # -sdown slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:04.575 # +sdown master primary 172.20.0.2 30693
22:X 07 Feb 2022 20:14:05.181 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:05.181 # +new-epoch 5
22:X 07 Feb 2022 20:14:05.183 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:05.183 # +vote-for-leader 88b5d5327f31e14e0d52eafc8527cb18fae7589f 5
22:X 07 Feb 2022 20:14:05.449 # +sdown master shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:14:05.601 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:05.601 # +new-epoch 6
22:X 07 Feb 2022 20:14:05.603 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:05.603 # +vote-for-leader 88b5d5327f31e14e0d52eafc8527cb18fae7589f 6
22:X 07 Feb 2022 20:14:05.645 # +odown master primary 172.20.0.2 30693 #quorum 2/2
22:X 07 Feb 2022 20:14:05.645 # Next failover delay: I will not start a failover before Mon Feb  7 20:16:05 2022
22:X 07 Feb 2022 20:14:06.115 # +sdown master shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:14:06.251 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:06.251 # +new-epoch 7
22:X 07 Feb 2022 20:14:06.253 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:06.253 # +vote-for-leader 88b5d5327f31e14e0d52eafc8527cb18fae7589f 7
22:X 07 Feb 2022 20:14:06.256 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:06.256 # +vote-for-leader 88b5d5327f31e14e0d52eafc8527cb18fae7589f 7
22:X 07 Feb 2022 20:14:06.304 # +config-update-from sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:14:06.304 # +switch-master primary 172.20.0.2 30693 172.20.0.2 30692
22:X 07 Feb 2022 20:14:06.305 * +slave slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:14:06.307 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:06.561 # +odown master shard1 172.20.0.2 30659 #quorum 2/2
22:X 07 Feb 2022 20:14:06.561 # Next failover delay: I will not start a failover before Mon Feb  7 20:16:06 2022
22:X 07 Feb 2022 20:14:06.618 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:06.618 # +vote-for-leader 88b5d5327f31e14e0d52eafc8527cb18fae7589f 7
22:X 07 Feb 2022 20:14:06.618 # Next failover delay: I will not start a failover before Mon Feb  7 20:16:07 2022
22:X 07 Feb 2022 20:14:06.708 # +config-update-from sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:14:06.708 # +switch-master shard1 172.20.0.2 30659 172.20.0.2 30658
22:X 07 Feb 2022 20:14:06.708 * +slave slave 172.20.0.2:30659 172.20.0.2 30659 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:14:06.710 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:06.893 # +sdown master shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:06.955 # +odown master shard3 172.20.0.2 30663 #quorum 2/2
22:X 07 Feb 2022 20:14:06.955 # +new-epoch 8
22:X 07 Feb 2022 20:14:06.955 # +try-failover master shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:06.958 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:06.958 # +vote-for-leader 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 8
22:X 07 Feb 2022 20:14:06.961 # 88b5d5327f31e14e0d52eafc8527cb18fae7589f voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 8
22:X 07 Feb 2022 20:14:07.021 # +elected-leader master shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:07.021 # +failover-state-select-slave master shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:07.092 # +selected-slave slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:07.092 * +failover-state-send-slaveof-noone slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:07.148 * +failover-state-wait-promotion slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:07.211 # +odown master shard2 172.20.0.2 30661 #quorum 2/2
22:X 07 Feb 2022 20:14:07.211 # Next failover delay: I will not start a failover before Mon Feb  7 20:16:06 2022
22:X 07 Feb 2022 20:14:07.214 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:07.214 # +promoted-slave slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:07.214 # +failover-state-reconf-slaves master shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:07.283 # +failover-end master shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:14:07.283 # +switch-master shard3 172.20.0.2 30663 172.20.0.2 30662
22:X 07 Feb 2022 20:14:07.283 * +slave slave 172.20.0.2:30663 172.20.0.2 30663 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:14:07.286 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:07.324 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:07.324 # +vote-for-leader 88b5d5327f31e14e0d52eafc8527cb18fae7589f 8
22:X 07 Feb 2022 20:14:07.363 # Next failover delay: I will not start a failover before Mon Feb  7 20:16:07 2022
22:X 07 Feb 2022 20:14:07.455 # +config-update-from sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:14:07.455 # +switch-master shard2 172.20.0.2 30661 172.20.0.2 30660
22:X 07 Feb 2022 20:14:07.455 * +slave slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:14:07.457 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:14:07.629 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:14:07.629 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:14:07.629 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:14:07.629 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:14:11.344 # +sdown slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:14:11.729 # +sdown slave 172.20.0.2:30659 172.20.0.2 30659 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:14:12.307 # +sdown slave 172.20.0.2:30663 172.20.0.2 30663 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:14:12.462 # +sdown slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:03.364 # -sdown slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:04.835 # -sdown slave 172.20.0.2:30659 172.20.0.2 30659 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:15:07.077 # -sdown slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:07.379 # -sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:15:07.379 # -sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:07.379 # -sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:15:07.379 # -sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:14.473 # -sdown slave 172.20.0.2:30663 172.20.0.2 30663 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:15:42.139 # +sdown master shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:15:42.313 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:42.313 # +new-epoch 9
22:X 07 Feb 2022 20:15:42.315 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:42.315 # +vote-for-leader 88b5d5327f31e14e0d52eafc8527cb18fae7589f 9
22:X 07 Feb 2022 20:15:42.847 # +sdown master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:42.938 # +sdown master shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:15:42.938 # +odown master shard2 172.20.0.2 30660 #quorum 2/2
22:X 07 Feb 2022 20:15:42.938 # +new-epoch 10
22:X 07 Feb 2022 20:15:42.938 # +try-failover master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:42.941 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:42.941 # +vote-for-leader 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 10
22:X 07 Feb 2022 20:15:42.944 # 88b5d5327f31e14e0d52eafc8527cb18fae7589f voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 10
22:X 07 Feb 2022 20:15:42.945 # 33b6962d39b2aba5c849f76cdc83dcda59d06d94 voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 10
22:X 07 Feb 2022 20:15:43.000 # +elected-leader master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:43.000 # +failover-state-select-slave master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:43.059 # +selected-slave slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:43.059 * +failover-state-send-slaveof-noone slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:43.149 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:43.149 # +new-epoch 11
22:X 07 Feb 2022 20:15:43.150 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:43.150 # +vote-for-leader 88b5d5327f31e14e0d52eafc8527cb18fae7589f 11
22:X 07 Feb 2022 20:15:43.150 * +failover-state-wait-promotion slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:43.251 # +odown master shard1 172.20.0.2 30658 #quorum 2/2
22:X 07 Feb 2022 20:15:43.251 # Next failover delay: I will not start a failover before Mon Feb  7 20:17:42 2022
22:X 07 Feb 2022 20:15:43.355 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:43.355 # +promoted-slave slave 172.20.0.2:30661 172.20.0.2 30661 @ shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:43.355 # +failover-state-reconf-slaves master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:43.368 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:43.368 # +vote-for-leader 88b5d5327f31e14e0d52eafc8527cb18fae7589f 11
22:X 07 Feb 2022 20:15:43.445 # +config-update-from sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard1 172.20.0.2 30658
22:X 07 Feb 2022 20:15:43.445 # +switch-master shard1 172.20.0.2 30658 172.20.0.2 30659
22:X 07 Feb 2022 20:15:43.446 * +slave slave 172.20.0.2:30658 172.20.0.2 30658 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:15:43.448 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:43.453 # +failover-end master shard2 172.20.0.2 30660
22:X 07 Feb 2022 20:15:43.454 # +switch-master shard2 172.20.0.2 30660 172.20.0.2 30661
22:X 07 Feb 2022 20:15:43.454 * +slave slave 172.20.0.2:30660 172.20.0.2 30660 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:15:43.456 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:44.072 # +odown master shard3 172.20.0.2 30662 #quorum 2/2
22:X 07 Feb 2022 20:15:44.072 # Next failover delay: I will not start a failover before Mon Feb  7 20:17:43 2022
22:X 07 Feb 2022 20:15:44.250 # +config-update-from sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard3 172.20.0.2 30662
22:X 07 Feb 2022 20:15:44.250 # +switch-master shard3 172.20.0.2 30662 172.20.0.2 30663
22:X 07 Feb 2022 20:15:44.251 * +slave slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:15:44.254 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:51.703 # +sdown master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:51.794 # +odown master primary 172.20.0.2 30692 #quorum 2/2
22:X 07 Feb 2022 20:15:51.794 # +new-epoch 12
22:X 07 Feb 2022 20:15:51.794 # +try-failover master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:51.814 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:51.814 # +vote-for-leader 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 12
22:X 07 Feb 2022 20:15:51.820 # 88b5d5327f31e14e0d52eafc8527cb18fae7589f voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 12
22:X 07 Feb 2022 20:15:51.820 # 33b6962d39b2aba5c849f76cdc83dcda59d06d94 voted for 3427e74d08e85a3cb5861d0e160d7bc85a0ecd74 12
22:X 07 Feb 2022 20:15:51.866 # +elected-leader master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:51.866 # +failover-state-select-slave master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:51.938 # +selected-slave slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:51.938 * +failover-state-send-slaveof-noone slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:52.000 * +failover-state-wait-promotion slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:52.831 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:15:52.831 # +promoted-slave slave 172.20.0.2:30693 172.20.0.2 30693 @ primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:52.831 # +failover-state-reconf-slaves master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:52.905 # +failover-end master primary 172.20.0.2 30692
22:X 07 Feb 2022 20:15:52.905 # +switch-master primary 172.20.0.2 30692 172.20.0.2 30693
22:X 07 Feb 2022 20:15:52.906 * +slave slave 172.20.0.2:30692 172.20.0.2 30692 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:15:52.909 * Sentinel new configuration saved on disk
22:X 07 Feb 2022 20:16:43.692 * +fix-slave-config slave 172.20.0.2:30660 172.20.0.2 30660 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:21:15.670 # +sdown slave 172.20.0.2:30692 172.20.0.2 30692 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:21:16.549 # +sdown slave 172.20.0.2:30658 172.20.0.2 30658 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:21:17.236 # +sdown slave 172.20.0.2:30660 172.20.0.2 30660 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:21:17.988 # +sdown slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:21:18.792 # +sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:21:18.792 # +sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:21:18.792 # +sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:21:18.792 # +sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:22:09.655 * +reboot slave 172.20.0.2:30692 172.20.0.2 30692 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:22:09.721 # -sdown slave 172.20.0.2:30692 172.20.0.2 30692 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:22:11.342 * +reboot slave 172.20.0.2:30658 172.20.0.2 30658 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:22:11.395 # -sdown slave 172.20.0.2:30658 172.20.0.2 30658 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:22:18.225 * +reboot slave 172.20.0.2:30660 172.20.0.2 30660 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:22:18.302 # -sdown slave 172.20.0.2:30660 172.20.0.2 30660 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:22:18.302 * +reboot slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:22:18.355 # -sdown slave 172.20.0.2:30662 172.20.0.2 30662 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:22:21.161 # -sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:22:21.161 # -sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:22:21.161 # -sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:22:21.161 # -sdown sentinel 88b5d5327f31e14e0d52eafc8527cb18fae7589f 172.20.0.2 30664 @ primary 172.20.0.2 30693
22:X 07 Feb 2022 20:23:41.167 # +sdown master primary 172.20.0.2 30693
22:X 07 Feb 2022 20:23:42.027 # +sdown master shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:23:42.532 # +sdown master shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:23:43.301 # +sdown master shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:23:44.003 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard3 172.20.0.2 30663
22:X 07 Feb 2022 20:23:44.003 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard2 172.20.0.2 30661
22:X 07 Feb 2022 20:23:44.003 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ shard1 172.20.0.2 30659
22:X 07 Feb 2022 20:23:44.003 # +sdown sentinel 33b6962d39b2aba5c849f76cdc83dcda59d06d94 172.20.0.2 30665 @ primary 172.20.0.2 30693

Please let me know what the suggested next steps are. Should I collect additional information, or should I try to set up a Kubernetes cluster that can be used to reproduce the issue and that I can share with you?

Comment From: moticless

@RomanKisilenko It looks to me like we are encountering a different issue now. Previously the sentinels failed to elect the leader (the one that selects the promoted slave), and we got multiple -failover-abort-not-elected printouts. Based on the new logs we do get +elected-leader as expected and never encounter -failover-abort-not-elected.

Some highlights from your logs:

Sentinel 1, ID 88b5(...)
  • 20:22:20.077 Restarts and loads the master from configuration
  • 20:22:45.104 Subjectively identifies that , is down
  • 20:23:44.021 Subjectively identifies that sentinel 3 is down

Sentinel 2, ID 3427(...)
  • 20:15:43.445 +switch-master to (Sentinel 3 is the one to update)
  • 20:21:18.792 Subjectively identifies that sentinel 1 is down
  • 20:22:21.161 Subjectively identifies that sentinel 1 is up and running
  • 20:23:44.003 Subjectively identifies that sentinel 3 is down

Sentinel 3, ID 33b6(...)
  • ??

  • I suspect that sentinel 1 loads an obsolete configuration file (shard1, 30658), whereas earlier (20:15:43.445) everybody agreed on port 30659... which should have led to an update of the configuration file. Please verify that the configuration file has write permission and is persisted over restarts.
  • Can it be that sentinel 3 is down while sentinel 1 restarts?

Can you please verify these two bullets and collect all the logs once again before trying to reproduce it on k8s? Please collect all the logs, from the beginning. Thank you

Comment From: RomanKisilenko

@moticless Hello, sentinel 3 was up while sentinel 1 restarted. Sentinel 1 was down while sentinels 2 and 3 agreed on the failover to 30659. Then sentinel 1 restarted and loaded its last known configuration, which had been saved earlier when there was an agreement between the sentinels to use 30658. Then sentinel 3 was shut down. Does that explain what is going on in the test? Do you think I should adjust the way I'm testing this issue, or should I proceed with reproducing it again and collecting the logs?

Comment From: moticless

Hi @RomanKisilenko, your test sounds reasonable (though I am still not sure at what point in the flow the replicas go down and come back up). Obviously timing is playing a role here, as some of the tries do succeed.

It is better to supply the entire logs in order to get a better understanding. It would be best if you could somehow supply the steps to reproduce it. Thank you

Comment From: FlorianVielhauer

Hi @moticless,

I am currently struggling with the same/similar issue as @RomanKisilenko:

  • Redis version: 6.2.7 (no difference with 7.0.5)
  • Docker Swarm Cluster
  • Node 1: Sentinel-1 + Master
  • Node 2: Sentinel-2 + Replica-1
  • Node 3: Sentinel-3 + Replica-2
  • Conf files of Redis + Sentinel are persisted between container restarts (because Docker Swarm reuses the containers)
  • Tested with both IPs + Hostnames (no difference)

Steps to reproduce:

  1. Deploy stack and wait that everything is up and running
  2. Drain Node 1 (this will stop Sentinel-1 + Master)
  3. Wait until failover to Replica-1 happened (this works fine and stable!)
  4. Activate Node 1 and wait until Sentinel-1 and Master are up
  5. Drain Node 2 (this will stop Sentinel-2 + Replica-1)

Now this happens:

  1. Failover to master is performed
  2. Everything is stable for approx. 8 seconds - then suddenly the master is demoted to replica
  3. The master tries to sync with itself all the time and the whole system is unusable because there are only slaves and no master

This happens every time, always after approx. 8 s. When looking at the code, I found the following lines in sentinel.c at lines 2710ff:

            /* A slave turned into a master. We want to force our view and
             * reconfigure as slave. Wait some time after the change before
             * going forward, to receive new configs if any. */
            mstime_t wait_time = sentinel_publish_period*4;

            if (!(ri->flags & SRI_PROMOTED) &&
                 sentinelMasterLooksSane(ri->master) &&
                 sentinelRedisInstanceNoDownFor(ri,wait_time) &&
                 mstime() - ri->role_reported_time > wait_time)
            {
                int retval = sentinelSendSlaveOf(ri,ri->master->addr);
                if (retval == C_OK)
                    sentinelEvent(LL_NOTICE,"+convert-to-slave",ri,"%@");
            }
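For reference, a minimal sketch of where the roughly 8 seconds comes from, assuming the default publish period of 2000 ms (the default value itself is not shown in the excerpt above):

    /* Back-of-the-envelope arithmetic only; values assume the defaults. */
    mstime_t sentinel_publish_period = 2000;           /* 2 s publish period (default) */
    mstime_t wait_time = sentinel_publish_period * 4;  /* 8000 ms, i.e. the ~8 s observed */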

The wait_time is 8s as far as I can see. I also see in the log that +convert-to-slave is logged after master promotion:

redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:15.894 # +new-epoch 3
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:15.894 # +try-failover master mymaster 10.0.1.26 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:15.899 # +vote-for-leader 7cd73a1b60b73995f3316bb1b578da1d2d8cd9f8 3
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:15.907 # 513e0245bc169c1ac1b3b9a7f733b72478771302 voted for 7cd73a1b60b73995f3316bb1b578da1d2d8cd9f8 3
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:15.976 # +elected-leader master mymaster 10.0.1.26 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:15.976 # +failover-state-select-slave master mymaster 10.0.1.26 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:16.034 # +selected-slave slave 10.0.1.17:6379 10.0.1.28 6379 @ mymaster 10.0.1.26 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:16.034 * +failover-state-send-slaveof-noone slave 10.0.1.17:6379 10.0.1.28 6379 @ mymaster 10.0.1.26 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:16.098 * +failover-state-wait-promotion slave 10.0.1.17:6379 10.0.1.28 6379 @ mymaster 10.0.1.26 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:16.292 # +failover-state-reconf-slaves master mymaster 10.0.1.26 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:16.292 # +promoted-slave slave 10.0.1.17:6379 10.0.1.28 6379 @ mymaster 10.0.1.26 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:16.345 # +failover-end master mymaster 10.0.1.26 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:16.345 # +switch-master mymaster 10.0.1.26 6379 10.0.1.28 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:16.347 * +slave slave 10.0.1.26:6379 10.0.1.26 6379 @ mymaster 10.0.1.17 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:16.347 * +slave slave 10.0.1.28:6379 10.0.1.28 6379 @ mymaster 10.0.1.17 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:18.352 # +sdown slave 10.0.1.26:6379 10.0.1.26 6379 @ mymaster 10.0.1.17 6379
redis-test_redis-sentinel-1.1.9mvnk4gunawv@bocoa-dev-ubuntu-1    | 1:X 01 Nov 2022 10:05:26.452 * +convert-to-slave slave 10.0.1.28:6379 10.0.1.28 6379 @ mymaster 10.0.1.17 6379

Maybe this gives a hint about what's going on? What is the Sentinel waiting for within these 8 s that is not happening?

Please find attached full logs and a docker-compose.yml with my setup.

Hints:

  1. Configure NODE1, NODE2, NODE3 in .env according to your environment
  2. Optional: Build the image in poll-sentinels\ - it simulates the client side with a simple Python script which polls the sentinels every second and tries to write a key into the current master; you can view the container log to see what's going on. I found it useful to see when the failovers are performed and when the issue appears, but it is optional
  3. Start the stack: docker stack deploy -c <(docker-compose config) redis-test
  4. Drain node 1: docker node update --availability drain <node1>
  5. Wait for failover and reactivate node 1: docker node update --availability active <node1>
  6. Drain node 2: docker node update --availability drain <node2>
  7. Wait a few seconds for the issue to appear

redis-failover-test.zip

Comment From: RomanKisilenko

I can confirm that we are still experiencing this issue as well; however, I have not had enough time so far to continue debugging it.

Comment From: FlorianVielhauer

After more debugging I finally found a solution which works for me.

I built a custom sentinel based on 6.2.7 with the following changes:

  1. PR from @moticless https://github.com/redis/redis/pull/11419/commits/f53aa88dad8d866f4bbe6df651775732d2eeb259

  2. Skip +convert-to-slave when ri addr and master addr are equal:

```diff
--- a/src/sentinel.c
+++ b/src/sentinel.c
@@ -2738,9 +2738,14 @@ void sentinelRefreshInstanceInfo(sentinelRedisInstance *ri, const char *info) {
                  sentinelRedisInstanceNoDownFor(ri,wait_time) &&
                  mstime() - ri->role_reported_time > wait_time)
             {
-                int retval = sentinelSendSlaveOf(ri,ri->master->addr);
-                if (retval == C_OK)
-                    sentinelEvent(LL_NOTICE,"+convert-to-slave",ri,"%@");
+                if(ri->addr != NULL && ri->master->addr != NULL && !sentinelAddrOrHostnameEqual(ri->addr, ri->master->addr)) {
+                    int retval = sentinelSendSlaveOf(ri,ri->master->addr);
+                    if (retval == C_OK)
+                        sentinelEvent(LL_NOTICE,"+convert-to-slave",ri,"%@");
+                }
+                else {
+                    serverLog(LL_WARNING,"skip convert-to-slave because master and slave addr are equal");
+                }
             }
         }
     }
```

Now the failover works fine and is stable in all combinations. I tested with both hostname and IP announcements.

To be honest, I have no idea why the situation arises where the sentinel tries to make a slave use itself as its master. I think it is related to dynamic IPs in cluster environments, but I am not sure. I also found a similar address check for the event `+fix-slave-config`, so maybe my patch makes sense?

It would be great to get some feedback and thoughts. We now have a solution that works for us, but it would of course be great if the patch could be merged into the codebase, provided it makes sense and doesn't break anything else.

Comment From: hwware

@moticless Hi Moti, could you please take a look at this thread? Thanks

Comment From: MarBanW

@moticless: Hi, any possibility to look into this more deeply? I am facing the same issue...

Comment From: moticless

Hi, sorry for the late response. I will try to investigate it tomorrow. 

Comment From: moticless

Hi @FlorianVielhauer, 
For a start, I tried to use your code and it looks like the IP might change after a node restart. Even though your sentinel has "resolve-hostnames yes", it looks like "announce-hostnames no" is in effect. Sentinel is (most probably) not resilient to dynamic IPs. If I get it right, it is also required that you set "announce-hostnames yes" so that the IP is resolved each time instead of keeping hardcoded IPs.

Comment From: FlorianVielhauer

Hi @moticless,

Thanks for investigation.

As I wrote in my first comment, I "Tested with both IPs + Hostnames" and it made "no difference" (sorry for not being specific enough with regard to the announce-hostnames setting).

As proof, I tested once more with "announce-hostnames yes" and the behaviour is exactly the same as with IP announcements:

Approx. 8-10 s after the failover, the new master is always demoted to be a slave with itself as its master:

25 Nov 2022 13:08:03.075 # +promoted-slave slave redis-slave-2:6379 redis-slave-2 6379 @ mymaster redis-master-1 6379
...
25 Nov 2022 13:08:13.245 * +convert-to-slave slave redis-slave-2:6379 redis-slave-2 6379 @ mymaster redis-slave-2 6379


This doesn't happen with my patch/custom build, as I filter out the +convert-to-slave when the slave and master addresses are identical (please see my second post for details).

My main question is: why does Redis want to demote the promoted master to be a slave of itself at all?

I attached my setup with announce-hostnames=yes and logs.

[redis-failover-test-with-hostnames.zip](https://github.com/redis/redis/files/10092720/redis-failover-test-with-hostnames.zip)


Comment From: moticless

Hi @FlorianVielhauer,
Based only on your logs (redis-failover-test-with-hostnames.zip), it looks like the sentinels still report IPs rather than hostnames:

+sentinel sentinel 0b0686fc1ad4545f42d40f1573075a9b05b218e4 10.0.5.3 26379 @ mymaster redis-master-1 6379

We must take care that the sentinels and replicas use only hostnames, as long as static IPs are not possible.
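For anyone reproducing this, a hedged sketch of the hostname-related settings being discussed; the hostnames below are placeholders taken from this setup, so adjust them to your environment:

    # sentinel.conf - resolve and announce hostnames instead of raw IPs
    sentinel resolve-hostnames yes
    sentinel announce-hostnames yes
    sentinel announce-ip redis-sentinel-1    # this sentinel's own stable hostname

    # redis.conf on each replica - announce a stable hostname as well
    replica-announce-ip redis-slave-1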

Comment From: FlorianVielhauer

Hi @moticless,

I get your point, so I also configured sentinel announce-ip.

Now the sentinels announce their own address using hostnames, but unfortunately the issue remains. After a few seconds the promoted master gets demoted to a slave with itself as the new master.

Please also note that in my setup I only shut down one node to reproduce the issue. The services which were running on this node are NOT restarted on any of the other nodes, so all the IPs are effectively static.

This is different from my original "steps to reproduce" because with announce-hostnames yes the issue already appears after shutting down node 1. This is 100% reproducible... so announce-hostnames made it even worse?

I still don't understand why the +convert-to-slave event appears...

redis-test_redis-sentinel-2.1.pw1q6hwcxlz3@bocoa-dev-ubuntu-2    | 1:X 27 Nov 2022 20:08:01.820 * +sentinel sentinel d80d28ac04ec22e956f08d5ca838d05e7bd0b340 redis-sentinel-1 26379 @ mymaster redis-master-1 6379
...
redis-test_redis-sentinel-3.1.3qpo69t9p2hg@bocoa-dev-ubuntu-3    | 1:X 27 Nov 2022 20:09:19.060 # +promoted-slave slave redis-slave-2:6379 redis-slave-2 6379 @ mymaster redis-master-1 6379
...
redis-test_redis-sentinel-2.1.pw1q6hwcxlz3@bocoa-dev-ubuntu-2    | 1:X 27 Nov 2022 20:09:29.194 * +convert-to-slave slave redis-slave-2:6379 redis-slave-2 6379 @ mymaster redis-slave-2 6379

Full logs and configuration attached.

redis-failover-test-with-hostnames-2.zip

Comment From: moticless

Hi @FlorianVielhauer, I tried to run the docker-swarm setup with a few docker-machines on VirtualBox, without any luck so far in bringing up the setup... I am now looking again at your logs (sentinel-2.logs) and I see the following lines:

+switch-master mymaster redis-master-1 6379 redis-slave-2 6379
+slave slave redis-slave-1:6379 redis-slave-1 6379 @ mymaster redis-slave-2 6379
+slave slave redis-slave-2:6379 redis-slave-2 6379 @ mymaster redis-slave-2 6379

The +slave printouts can happen in only two places: either when processing an INFO response, or in sentinelResetMasterAndChangeAddress(). Those printouts come from the second option. It looks like the trouble starts there, where it aggregates the slaves from the old master struct, skipping the promoted one (the continue):

    while((de = dictNext(di)) != NULL) {
        sentinelRedisInstance *slave = dictGetVal(de);

        if (sentinelAddrIsEqual(slave->addr,newaddr)) continue;
        slaves[numslaves++] = dupSentinelAddr(slave->addr);
    }

If I get it right, the function sentinelAddrIsEqual() does a bad job when IPs are not static, which is why we see redis-slave-2 recorded as a +slave of redis-slave-2.

Can you please try replacing all occurrences of sentinelAddrIsEqual() with sentinelAddrOrHostnameEqual() and see if it fixes your problem?
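For clarity, this is roughly what that suggested change would look like in the loop quoted above (a sketch only, not the final patch):

    /* Sketch inside sentinelResetMasterAndChangeAddress(): compare by hostname
     * as well as by cached IP, so the promoted slave is still skipped even if
     * its IP changed after a restart. */
    while((de = dictNext(di)) != NULL) {
        sentinelRedisInstance *slave = dictGetVal(de);

        /* previously: sentinelAddrIsEqual(slave->addr,newaddr) */
        if (sentinelAddrOrHostnameEqual(slave->addr,newaddr)) continue;
        slaves[numslaves++] = dupSentinelAddr(slave->addr);
    }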

Comment From: FlorianVielhauer

Hi @moticless,

Can you please try replacing all occurrences of sentinelAddrIsEqual() with sentinelAddrOrHostnameEqual() and see if it fixes your problem?

Yes, it fixes the issue for me. Thank you very much. 👍

Final two questions from my side:

  1. When could this patch be integrated into an official release?

  2. Off-topic: Although I configured failover-timeout=2000 and down-after-milliseconds=2000 (the directives in question are sketched below), the failover from master to slave takes approx. 22 s. This is much longer than I had expected. Any tips/settings to speed this up?
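For reference, a sketch of the two directives in question with the values mentioned above, assuming the master name mymaster used in this setup:

    sentinel down-after-milliseconds mymaster 2000
    sentinel failover-timeout mymaster 2000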

Comment From: moticless

  1. I will create a PR that will eventually reach the next release and will probably also be backported.
  2. Sentinel is not designed or optimized for quick failovers, and even the configured intervals are factored in as a precaution. Note that behind the Sentinel algorithm, which is a lightweight algorithm, there is also the actual Redis server failover, which can be a heavy task. I am not sure it is a good idea to push the values too low, and if you do, you have to test and fine-tune them carefully for your own environment. There are a few potential places to save time, but looking forward, I am not sure it is a priority for us.
  2. Sentinel is not designed or optimized to make quick failovers and even configured intervals being factored as precaution. Note that behind the Sentinel algorithm which is lightweight algorithm there is also actual Redis server failover which can be a heavy task. I am not sure if it is a good idea to challenge with too low values, and if you do, then you have to test it and fine tune it carefully to your own environment. There are few potential places to save time but looking forward, I am not sure if it is in our priority.