Describe the bug
I get the following error in Python:
packages/rediscluster/pipeline.py", line 175, in send_cluster_commands
    raise ClusterDownError("CLUSTERDOWN error. Unable to rebuild the cluster")
rediscluster.exceptions.ClusterDownError: CLUSTERDOWN error. Unable to rebuild the cluster
And in PHP:
PHP Fatal error: Uncaught RedisClusterException: Error processing response from Redis node! in
PHP Fatal error: Uncaught RedisException: socket error on read socket in
I am getting this about every minute for some reason; in PHP it is roughly every 30 seconds.
Got this in Redis Log
Failover auth denied to e5c5a860760bd8237f197c4b46a0dfdeed3361e8: its master is up
20428:M 19 Jul 2021 16:41:43.658 # Failover auth denied to a5bd685ec34d176f2eed2d27650e018dabe44ce5: its master is up
20428:S 19 Jul 2021 17:06:35.981 * FAIL message received from a5bd685ec34d176f2eed2d27650e018dabe44ce5 about 2e4e495592395b2a14ded6f6883af45d9a90b6dd
20428:S 19 Jul 2021 17:06:35.981 # Cluster state changed: fail
20428:S 19 Jul 2021 17:07:08.403 * Clear FAIL state for node 2e4e495592395b2a14ded6f6883af45d9a90b6dd: is reachable again and nobody is serving its slots after some time.
I am using Redis Cluster 5.0.8 with 3 masters and 3 slaves.
-> Tried upgrading to the latest phpredis
-> Tried disabling BG save
Comment From: madolson
CLUSTERDOWN indicates that there isn't a quorum of primaries available to make decisions. Also note that this isn't a help section; it is reserved for reporting bugs. You should run CLUSTER INFO to understand the state of the cluster.
You can also go here to find more help about debugging issues.
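As a rough illustration of what to look for in the CLUSTER INFO reply, here is a minimal Python sketch. It assumes you already have the raw text reply (for example from redis-cli); the helper names and the SAMPLE reply are made up for illustration and are not output from the reporter's cluster. Each node reports its own view, so it is worth checking the reply from every node rather than just one.

```python
def parse_cluster_info(raw: str) -> dict:
    """Parse the 'key:value' lines of a CLUSTER INFO reply into a dict of strings."""
    info = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def cluster_problems(info: dict) -> list:
    """Return human-readable warnings for fields that suggest an unhealthy cluster."""
    problems = []
    if info.get("cluster_state") != "ok":
        problems.append(f"cluster_state is {info.get('cluster_state')}")
    # Slots owned by nodes that are suspected (pfail) or confirmed (fail) down.
    for field in ("cluster_slots_pfail", "cluster_slots_fail"):
        if int(info.get(field, "0")) > 0:
            problems.append(f"{field} = {info[field]}")
    return problems


# Hypothetical reply, for illustration only (not taken from this issue).
SAMPLE = (
    "cluster_state:fail\r\n"
    "cluster_slots_assigned:16384\r\n"
    "cluster_slots_ok:10923\r\n"
    "cluster_slots_pfail:0\r\n"
    "cluster_slots_fail:5461\r\n"
    "cluster_known_nodes:6\r\n"
    "cluster_size:3\r\n"
)

for problem in cluster_problems(parse_cluster_info(SAMPLE)):
    print(problem)
```

If every node reports cluster_state:ok with no pfail or fail slots at the moment you check, the failure is probably intermittent, and the node logs are the better place to look.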
Comment From: astar10239
Hey,
I ran cluster info; all nodes are online, both masters and slaves.
I am still getting the error.
I also get this error from PHP:
Timed out attempting to find data in the correct node
even though my read/write timeouts are 10 seconds.
Comment From: madolson
Failover auth denied to e5c5a860760bd8237f197c4b46a0dfdeed3361e8: its master is up
20428:M 19 Jul 2021 16:41:43.658 # Failover auth denied to a5bd685ec34d176f2eed2d27650e018dabe44ce5: its master is up
20428:S 19 Jul 2021 17:06:35.981 * FAIL message received from a5bd685ec34d176f2eed2d27650e018dabe44ce5 about 2e4e495592395b2a14ded6f6883af45d9a90b6dd
20428:S 19 Jul 2021 17:06:35.981 # Cluster state changed: fail
20428:S 19 Jul 2021 17:07:08.403 * Clear FAIL state for node 2e4e495592395b2a14ded6f6883af45d9a90b6dd: is reachable again and nobody is serving its slots after some time.
The "Cluster state changed: fail" line means that this node considered the cluster to be down, and while it is in that state clients will get the error you describe. You should look at what was happening in the cluster at that time to understand the cause.
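To line the log up with the client errors, it can help to pull the cluster-related entries out with their timestamps. The following Python sketch does that for the lines quoted above. The regular expression is an assumption about the log layout inferred from those lines, and the function name and the list of message prefixes are chosen here for illustration.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the lines quoted in this thread:
#   <pid>:<role> <day> <Mon> <year> <HH:MM:SS.mmm> <level char> <message>
LINE_RE = re.compile(
    r"^(?P<pid>\d+):(?P<role>[A-Z]) "
    r"(?P<ts>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>\S) (?P<msg>.*)$"
)

# Message prefixes worth correlating with client-side errors (illustrative list).
INTERESTING = (
    "Cluster state changed",
    "FAIL message received",
    "Clear FAIL state",
    "Failover auth denied",
)


def cluster_events(lines):
    """Return (timestamp, role, message) for each cluster-related log line."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines that do not match the assumed layout
        msg = m.group("msg")
        if msg.startswith(INTERESTING):
            ts = datetime.strptime(m.group("ts"), "%d %b %Y %H:%M:%S.%f")
            events.append((ts, m.group("role"), msg))
    return events


# Log lines copied from this issue.
LOG = [
    "20428:M 19 Jul 2021 16:41:43.658 # Failover auth denied to a5bd685ec34d176f2eed2d27650e018dabe44ce5: its master is up",
    "20428:S 19 Jul 2021 17:06:35.981 * FAIL message received from a5bd685ec34d176f2eed2d27650e018dabe44ce5 about 2e4e495592395b2a14ded6f6883af45d9a90b6dd",
    "20428:S 19 Jul 2021 17:06:35.981 # Cluster state changed: fail",
    "20428:S 19 Jul 2021 17:07:08.403 * Clear FAIL state for node 2e4e495592395b2a14ded6f6883af45d9a90b6dd: is reachable again and nobody is serving its slots after some time.",
]

events = cluster_events(LOG)
failed_at = next(ts for ts, _, msg in events if msg == "Cluster state changed: fail")
cleared_at = next(ts for ts, _, msg in events if msg.startswith("Clear FAIL state"))
gap_seconds = (cleared_at - failed_at).total_seconds()

for ts, role, msg in events:
    print(ts.isoformat(), role, msg)
print(f"seconds between 'state changed: fail' and 'Clear FAIL state': {gap_seconds:.3f}")
```

On this excerpt the FAIL flag is cleared about 32 seconds after the state changes to fail. Comparing those timestamps with the times of the Python and PHP exceptions, and with the logs of the other nodes, should show whether the client errors coincide with these windows.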