Sometimes users don't want automatic failover in a Redis Cluster; they can fail over manually by themselves using the "CLUSTER FAILOVER FORCE" command. This matters especially when some command (FLUSHALL, DEL on a big key, etc.) blocks the master: a slave gets promoted to master even though the master was not down at all. I think it would be better to add a config option that decides whether auto-failover is enabled for Redis Cluster. @antirez What do you think about this?
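For context, the manual failover is issued against the slave that should take over (the port below is just an example):

# run on the slave to be promoted; FORCE skips the handshake with the master
redis-cli -p 7001 CLUSTER FAILOVER FORCE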
Comment From: antirez
Why not just set a very high node-timeout? It's basically the same.
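For example (the value is purely illustrative):

# redis.conf — cluster-node-timeout is in milliseconds; with one hour,
# a master must be unreachable for an hour before a failover can start
cluster-node-timeout 3600000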
Comment From: deep011
I don't think setting node-timeout high enough is a good idea. The node-timeout helps keep the cluster healthy: for example, when there is a connection issue even though the node is alive, node-timeout is used to reconnect to the node. Also, users still need the PFAIL and FAIL node states; they just want to decide themselves, or let their own tools decide, whether to fail over or not. For that, a way to disable auto-failover may be needed.
Comment From: deep011
And we could use this feature to prevent slaves located in another data center from being promoted to new masters.
Comment From: antirez
You have a point actually... Keeping the issue open to evaluate it later and to see if other users join us here with more use cases.
Comment From: antirez
Btw... this is very similar to something we already planned. It's a feature for slaves that prevents a particular slave from playing in the master promotion game. That can be used as a way to support multi-DC setups, together with CLUSTER FAILOVER TAKEOVER, which we already have and which allows promoting a slave without a majority, and it also covers this use case.
Somewhat related to #2458, which is a more refined version of this where certain rules are enforced. However the option to prevent certain slaves from failing over is trivial and can be added ASAP to provide at least manual support for multi-DC setups.
cluster-slave-no-failover yes
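A sketch of how the two pieces could combine in a multi-DC setup (the hostname is illustrative):

# redis.conf on each slave in the backup data center
cluster-slave-no-failover yes

# during a disaster, promote a DR slave manually, without requiring
# agreement from a majority of masters:
redis-cli -h dr-slave.example.com CLUSTER FAILOVER TAKEOVER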
Comment From: deep011
I added the cluster-slave-no-failover config option in the pull request.
Comment From: itamarhaber
Seems related to #3069 as well
Comment From: kornrunner
Here's a potential use case:
I wish to build a cluster, say three masters with three slaves/replicas. The main purpose of the cluster would be to shard the data across multiple instances and to allow failover. All instances would be on separate (virtual) machines, "remote" to the machines that do the actual work based on the stored data.
The application does a lot of reads and a relatively low amount of writes. So the idea is to run "local" Redis instances (on the same machines where the app resides) as read-only slaves of the cluster. Those slaves shouldn't be in the master promotion game; they should just replicate data for faster reads and lower network/bandwidth usage.
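As a sketch (port illustrative, and assuming the key's slot is served by this slave's master), a client reading from such a local slave would mark its connection read-only first, since cluster slaves reject reads by default:

$ redis-cli -c -p 7101
127.0.0.1:7101> READONLY
OK
127.0.0.1:7101> GET somekey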
I hope this is a sensible setup.
Comment From: max06
Big upvote, we need this too.
Use case: we'll have a big Redis Cluster with 100 masters and 2 slaves per master, all configured without RDB/AOF persistence. Then we'll have one more slave per master, used only for backups. These backup slaves must never be elected master.
The slave weighting only works for Sentinel setups; it would be great to have something like this for real clusters as well.
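For reference, this is the Sentinel mechanism I mean (as far as I know it is ignored by Redis Cluster):

# redis.conf — a priority of 0 means Sentinel will never promote this slave
slave-priority 0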
Comment From: asadali
+1 for this feature to be supported in redis-cluster
Use case: pure read-only slaves that can never become master. I tried playing around with cluster-node-timeout and cluster-slave-validity-factor to achieve this, but it soon got complicated. I am still trying to figure out a robust approach based on those two, but a natively supported option would be better.
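This is the kind of tuning I mean (values illustrative). The problem is that a slave only disqualifies itself when its link to the master has been down longer than roughly cluster-node-timeout * cluster-slave-validity-factor, so a slave with a healthy replication link always remains a valid failover candidate:

# redis.conf — no combination of these reliably blocks a healthy slave
cluster-node-timeout 5000
cluster-slave-validity-factor 1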
Comment From: zbdba
We need to disable auto-failover too. We don't want certain slaves to play in the master promotion game.
Comment From: byoungminoh
We have created a Redis Cluster with 144 masters and 2 slaves per master for a production service. Now we don't want the slaves in our DR zone to be promoted (because the DR zone will be used only in a disaster situation).
So we want this option as soon as possible.
cluster-slave-no-failover yes
(I've tried, but we cannot handle this requirement with cluster-node-timeout and cluster-slave-validity-factor.)
Thanks.
Comment From: withings-sas
+1 for this feature to be supported in redis-cluster
That would be very helpful for our multi-DC setup too.
Comment From: ahmed-sigmalux
Upvoting this feature request.
Comment From: sbhartia
We need this too: the ability to mark a few slaves as not playing the master promotion game.
Comment From: debu99
Can I just set the quorum number larger than the real number of Sentinels?
https://redis.io/topics/sentinel "In practical terms this means during failures Sentinel never starts a failover if the majority of Sentinel processes are unable to talk (aka no failover in the minority partition)."
Comment From: max06
@debu99 Sadly not, a cluster has no sentinels. Different kind of architecture.
Comment From: antirez
Hello, I just pushed the implementation of the cluster-slave-no-failover option to both the unstable and 4.0 branches. The commit also takes code from the PR that was posted here by @deep011, but it was no longer possible to merge that code since the user deleted the branch: after 2 years I can understand that. However, if you open a PR against Redis, keeping your PR branches around may be a good idea; here most slowness is due not to lack of interest but to lack of development bandwidth. So I reworked the patch, because there were a few things to change in my opinion, and credited @deep011 in the commit message. Please, if somebody is willing to test the patch, that would be cool. I did write unit tests for the two main things included in this change: no failover, and propagation of the status.
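A minimal way to try it, assuming a build from one of those branches (and assuming the option is also exposed at runtime via CONFIG SET):

# redis.conf on the slave that must never be promoted
cluster-slave-no-failover yes

# or at runtime (port illustrative):
redis-cli -p 7101 CONFIG SET cluster-slave-no-failover yes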
Comment From: aultokped
Hello, is there any update on this? We really need this feature. @antirez @KFilipek
Comment From: KFilipek
> Hello, is there any update on this? We really need this feature. @antirez @KFilipek

I see in the post above that @antirez pushed his implementation; did you try it?
Comment From: madolson
This was eventually implemented, as indicated by Salvatore.