Hello everyone. In our testing environment we currently have 3 sentinels managing 1000 master Redis container instances (running on three ARM64 machines), and each master instance has one corresponding slave. We tested the following failover scenarios:

Setting failover-timeout to 5 s and failing 100 masters at the same time: all of our masters got stuck in the failover process.

Setting failover-timeout to 60 s and failing 100 masters at the same time: eventually all of our master instances were failed over, but it took a very long time (more than 30 minutes).

Setting failover-timeout to 120 s and failing 100 masters at the same time: eventually all of our master instances were failed over, but it took a very long time (more than 1 hour).
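
For anyone who wants to reproduce the setup, here is a rough sketch of how the per-master Sentinel configuration for such a test could be generated. It is not our actual tooling: the master names, the address, the port range, and the quorum of 2 are placeholders I made up for illustration. The only thing taken from the Sentinel documentation is the directive syntax and the fact that `failover-timeout` is given in milliseconds, so 5 s, 60 s, and 120 s become 5000, 60000, and 120000.

```python
def sentinel_stanza(name, host, port, failover_timeout_s, quorum=2):
    """Return the sentinel.conf lines for one monitored master."""
    # Sentinel expects failover-timeout in milliseconds.
    timeout_ms = int(failover_timeout_s * 1000)
    return [
        f"sentinel monitor {name} {host} {port} {quorum}",
        f"sentinel failover-timeout {name} {timeout_ms}",
    ]


def render_config(masters, failover_timeout_s):
    """Concatenate the stanzas for every (name, host, port) tuple."""
    lines = []
    for name, host, port in masters:
        lines.extend(sentinel_stanza(name, host, port, failover_timeout_s))
    return "\n".join(lines) + "\n"


# Hypothetical inventory: names, address, and ports are placeholders only.
masters = [(f"master-{i:04d}", "10.0.0.1", 6379 + i) for i in range(1000)]

# One config per tested timeout (5 s, 60 s, 120 s); 60 s shown here.
config = render_config(masters, failover_timeout_s=60)
```

Changing `failover_timeout_s` to 5 or 120 gives the configuration for the other two scenarios.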

We also noticed that the sentinels were in tilt mode for a long time in all three of these cases.
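
In case it helps with diagnosing this, below is a minimal sketch of how the tilt flag and per-master status can be read from the text that `INFO sentinel` returns. The `SAMPLE` text is made up for illustration (it is not output from our environment), and the function names are my own. It assumes the usual `sentinel_tilt:<0|1>` field and `masterN:name=...,status=...` lines.

```python
# Made-up sample in the shape of `INFO sentinel` output (not real data).
SAMPLE = """\
# Sentinel
sentinel_masters:2
sentinel_tilt:1
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=master-0000,status=ok,address=10.0.0.1:6379,slaves=1,sentinels=3
master1:name=master-0001,status=odown,address=10.0.0.1:6380,slaves=1,sentinels=3
"""


def parse_sentinel_info(text):
    """Split 'key:value' lines into a dict, skipping blanks and '# ...' headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")  # split on the first ':' only
        info[key] = value
    return info


def summarize(text):
    """Report whether tilt is set and how many masters are in each status."""
    info = parse_sentinel_info(text)
    statuses = {}
    for key, value in info.items():
        # Per-master lines look like master0, master1, ... with k=v pairs.
        if key.startswith("master") and "=" in value:
            fields = dict(item.split("=", 1) for item in value.split(","))
            status = fields.get("status", "unknown")
            statuses[status] = statuses.get(status, 0) + 1
    return {
        "tilt": info.get("sentinel_tilt") == "1",
        "masters": int(info.get("sentinel_masters", 0)),
        "statuses": statuses,
    }


summary = summarize(SAMPLE)
```

Running this periodically against each sentinel during a test would show how long the tilt flag stays set and how many masters remain in `sdown` or `odown`.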

Has anyone experienced a similar issue? Are we managing too many masters (more than 1000) for three sentinels, and is that overloading them? Or do we need to add more sentinels in this case? Thanks in advance to anyone who can help.