Describe the bug
We have a 3-node cluster. During initial setup, slots were properly distributed among the cluster nodes. We observed that the cluster nodes rebooted several times over a period of time, and we suspect this caused slots to move between nodes.
To reproduce
We suspect that the nodes rebooted and that the resulting connection failures might have triggered the slot migration.
Expected behavior
Redis slots should stay static and remain bound to their respective nodes.
Additional information
The cluster configuration is as follows:
cluster-enabled yes
cluster-config-file node_6379.conf
cluster-node-timeout 5000
cluster-require-full-coverage yes
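To compare slot ownership before and after a reboot, the output of `CLUSTER NODES` could be parsed roughly as below. This is only a minimal sketch: the sample text, the node names `node-a`/`node-b`/`node-c`, and the addresses are made-up placeholders rather than data from our cluster, and the helper names `parse_cluster_nodes` and `unassigned_slots` are our own, not part of Redis.

```python
TOTAL_SLOTS = 16384  # a Redis Cluster always has 16384 hash slots

# Hypothetical CLUSTER NODES output, for illustration only.
SAMPLE = """
node-a 10.0.0.1:6379@16379 myself,master - 0 0 1 connected 0-5460
node-b 10.0.0.2:6379@16379 master - 0 0 2 connected 5461-10922
node-c 10.0.0.3:6379@16379 master - 0 0 3 connected 10923-16376 16378 [16377-<-node-a]
"""


def parse_cluster_nodes(text):
    """Return {node_id: set of owned slots} from CLUSTER NODES output."""
    owned = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a node line
        slots = set()
        # Slot entries start after the eight fixed fields.
        for field in fields[8:]:
            if field.startswith("["):
                continue  # migrating/importing marker, not an owned slot
            start, _, end = field.partition("-")
            slots.update(range(int(start), int(end or start) + 1))
        owned[fields[0]] = slots
    return owned


def unassigned_slots(owned):
    """Return the sorted list of slots that no node owns."""
    covered = set()
    for slots in owned.values():
        covered |= slots
    return sorted(set(range(TOTAL_SLOTS)) - covered)


owned = parse_cluster_nodes(SAMPLE)
for node_id, slots in owned.items():
    print(node_id, len(slots), "slots")
print("unassigned:", unassigned_slots(owned))
```

Saving the parsed result after each reboot and comparing it with the previous one would show which slots changed owner. With `cluster-require-full-coverage yes`, any unassigned slot stops the cluster from serving queries.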
The following are the Redis logs from the nodes:
22414:M 15 Oct 2022 01:21:30.385 # I have keys for slot 16377, but the slot is assigned to another node. Setting it to importing state.
22414:M 15 Oct 2022 01:21:30.385 # I have keys for slot 16379, but the slot is assigned to another node. Setting it to importing state.
22414:M 15 Oct 2022 01:21:30.385 # I have keys for slot 16380, but the slot is assigned to another node. Setting it to importing state.
22414:M 15 Oct 2022 01:21:30.385 # I have keys for slot 16381, but the slot is assigned to another node. Setting it to importing state.
22414:M 15 Oct 2022 01:22:03.958 * Clear FAIL state for node e6c154bb633cc461f488ee2544494bae86c62ceb: is reachable again and nobody is serving its slots after some time.
22414:M 25 Nov 2022 19:13:08.763 * Clear FAIL state for node e6c154bb633cc461f488ee2544494bae86c62ceb: master without slots is reachable again.
22414:M 25 Nov 2022 19:18:35.337 * Clear FAIL state for node 45906866d6f7aa305fd43554f03b01599fb3633b: is reachable again and nobody is serving its slots after some time.
22414:M 25 Nov 2022 19:22:35.640 * Clear FAIL state for node 45906866d6f7aa305fd43554f03b01599fb3633b: is reachable again and nobody is serving its slots after some time.
22414:M 25 Nov 2022 21:36:10.169 * Clear FAIL state for node 45906866d6f7aa305fd43554f03b01599fb3633b: is reachable again and nobody is serving its slots after some time.
22414:M 25 Nov 2022 21:39:17.749 * Clear FAIL state for node 45906866d6f7aa305fd43554f03b01599fb3633b: is reachable again and nobody is serving its slots after some time.
7604:M 01 Dec 2022 02:11:08.810 # I have keys for unassigned slot 5. Taking responsibility for it.
7604:M 01 Dec 2022 02:11:08.810 # I have keys for unassigned slot 14. Taking responsibility for it.
7604:M 01 Dec 2022 02:11:08.810 # I have keys for unassigned slot 16. Taking responsibility for it.
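To get an overview of which slots were affected, log lines like the ones above could be tallied with a short script. This is a rough sketch that assumes the message wording stays exactly as shown in these logs. The function name `summarize_slot_events` is our own, and the sample lines are copied from the logs above.

```python
import re
from collections import Counter

# Patterns match the message wording seen in the logs above.
IMPORTING = re.compile(
    r"I have keys for slot (\d+), but the slot is assigned to another node")
CLAIMED = re.compile(
    r"I have keys for unassigned slot (\d+)\. Taking responsibility for it")
CLEAR_FAIL = re.compile(r"Clear FAIL state for node ([0-9a-f]{40})")


def summarize_slot_events(lines):
    """Group slot-related log events by type."""
    summary = {"importing": [], "claimed": [], "clear_fail": Counter()}
    for line in lines:
        match = IMPORTING.search(line)
        if match:
            summary["importing"].append(int(match.group(1)))
            continue
        match = CLAIMED.search(line)
        if match:
            summary["claimed"].append(int(match.group(1)))
            continue
        match = CLEAR_FAIL.search(line)
        if match:
            summary["clear_fail"][match.group(1)] += 1
    return summary


SAMPLE_LINES = [
    "22414:M 15 Oct 2022 01:21:30.385 # I have keys for slot 16377, but the slot is assigned to another node. Setting it to importing state.",
    "22414:M 15 Oct 2022 01:22:03.958 * Clear FAIL state for node e6c154bb633cc461f488ee2544494bae86c62ceb: is reachable again and nobody is serving its slots after some time.",
    "22414:M 25 Nov 2022 19:18:35.337 * Clear FAIL state for node 45906866d6f7aa305fd43554f03b01599fb3633b: is reachable again and nobody is serving its slots after some time.",
    "7604:M 01 Dec 2022 02:11:08.810 # I have keys for unassigned slot 5. Taking responsibility for it.",
]

summary = summarize_slot_events(SAMPLE_LINES)
print("set to importing:", summary["importing"])
print("claimed while unassigned:", summary["claimed"])
print("FAIL cleared per node:", dict(summary["clear_fail"]))
```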
This incident has impacted application performance. Do you see any issues with the cluster setup, and how can we prevent this slot migration issue?
Any comments or suggestions would be appreciated.
Thanks, Hyder