Describe the bug
CLUSTER INFO and CLUSTER NODES may return information indicating that a new cluster is up, ok, connected, and ready to use, even though it isn't.
This can lead to CLUSTERDOWN errors when trying to actually get/set keys in a cluster node, even though CLUSTER INFO and CLUSTER NODES indicate that the cluster is usable.
These errors stop occurring if you simply wait 3-5 seconds after creating the cluster, but there doesn't seem to be a reliable way to determine how long you have to wait, or a way to reliably determine if the cluster is usable.
To reproduce
- Spin up a cluster using a script (I used several Redis nodes running in Docker)
- Immediately after running the redis-cli --cluster create ... command, run CLUSTER INFO and/or CLUSTER NODES in a loop until the desired output is received (cluster_state:ok and all nodes having the status connected, respectively)
- Immediately try to get/set keys in the cluster
Some keys may work, while others may return a CLUSTERDOWN error. This behavior persists for anywhere from 1 to 5 seconds, until the cluster is truly ready to be used. After that, it works perfectly fine.
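For reference, the readiness check used in the polling loop above could look roughly like the following. This is a minimal sketch rather than the exact code from my script: the function names clusterStateOK and allNodesConnected are made up for illustration, the sample node IDs and addresses are placeholders, and it assumes the raw text replies of CLUSTER INFO and CLUSTER NODES have already been fetched by a client. It relies on link-state being the eighth space-separated field of each CLUSTER NODES line.

```go
package main

import (
	"fmt"
	"strings"
)

// clusterStateOK reports whether raw CLUSTER INFO output contains the line
// "cluster_state:ok". (Hypothetical helper, not part of any client library.)
func clusterStateOK(info string) bool {
	for _, line := range strings.Split(info, "\n") {
		// TrimSpace also removes the trailing "\r" of CRLF-terminated lines.
		if strings.TrimSpace(line) == "cluster_state:ok" {
			return true
		}
	}
	return false
}

// allNodesConnected reports whether every non-empty line of raw CLUSTER NODES
// output has its link-state field (the eighth field) set to "connected".
// It returns false when the output contains no node lines at all.
func allNodesConnected(nodes string) bool {
	seen := false
	for _, line := range strings.Split(nodes, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if len(fields) < 8 || fields[7] != "connected" {
			return false
		}
		seen = true
	}
	return seen
}

func main() {
	// Placeholder sample data, not captured from a real cluster.
	info := "cluster_state:ok\r\ncluster_slots_assigned:16384\r\n"
	nodes := "aaa 10.0.0.1:6379@16379 myself,master - 0 0 1 connected 0-8191\n" +
		"bbb 10.0.0.2:6379@16379 master - 0 1 2 disconnected 8192-16383\n"
	fmt.Println(clusterStateOK(info), allNodesConnected(nodes))
}
```

Note that both helpers only describe the view of whichever node produced the output, which is why a passing check here does not rule out CLUSTERDOWN errors from other nodes.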
Expected behavior
Using the cluster (e.g. getting/setting keys) works every time.
Additional information
I am using the go-redis client and Redis 6.2.5 via Docker.
Is there a reliable way to determine if the cluster is usable that I'm missing? The behavior isn't consistent and there isn't a set number of times that getting/setting a key will fail before the CLUSTERDOWN errors stop occurring.
Comment From: ianling
It appears as though you must connect to each node in the cluster and run CLUSTER INFO on each one, because the output can differ between them. In other words, one node might report that the cluster is OK, but another might not; they aren't synchronized.
If this behavior is expected, feel free to close.