Describe the bug

We have 1 master (redis-master), 4 slaves (redis-slave-1, ..., redis-slave-4) and 5 sentinels (redis-sentinel-1, ..., redis-sentinel-5) in our Redis cluster (Docker Swarm). The sentinels and Redis instances all use hostnames instead of IP addresses. Under normal conditions everything works fine. When the master goes down, one of the slaves becomes master, but the new master then tries to connect to itself. We think the problem is the "replicaof" parameter in the redis.conf file: it seems the sentinels add a "replicaof" line to the newly selected master. This problem makes the whole cluster faulty, and the cluster doesn't work at all.

For example, when redis-master goes down, redis-slave-2 is selected as the new master. In this case, the redis.conf of redis-slave-2 contains the line "replicaof redis-slave-2 6383" (its own hostname and port).
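
To confirm the symptom on a given instance, one can scan its redis.conf for a "replicaof" line that points back at the instance itself. The following is a minimal sketch, not part of Redis or Sentinel: the function name `find_self_replicaof` and the sample config text are hypothetical, and it assumes the instance's own hostname and port are known to the caller.

```python
# Hypothetical helper, not part of Redis or Sentinel.
def find_self_replicaof(conf_text, own_host, own_port):
    """Return the 'replicaof' line that targets this instance itself, else None."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # 'slaveof' is the legacy alias of 'replicaof'.
        if parts[0].lower() in ("replicaof", "slaveof") and len(parts) == 3:
            host, port = parts[1], parts[2]
            if host.lower() == own_host.lower() and port == str(own_port):
                return line
    return None


# Sample config text (assumed), mirroring the line reported above.
sample_conf = """\
port 6383
replicaof redis-slave-2 6383
"""

offending = find_self_replicaof(sample_conf, "redis-slave-2", 6383)
print(offending)
```

A non-None result means the instance has been configured to replicate from itself, which matches the behaviour described in this report.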

Sentinel-1 logs after shutting redis-master down:

19:X 19 Dec 2022 09:52:03.221 # Failed to resolve hostname 'redis-master'
19:X 19 Dec 2022 09:52:03.224 * Sentinel new configuration saved on disk
19:X 19 Dec 2022 09:52:03.224 # +new-epoch 1
19:X 19 Dec 2022 09:52:03.227 * Sentinel new configuration saved on disk
19:X 19 Dec 2022 09:52:03.227 # +vote-for-leader e4b00ae12885c1107f1121db77c5b795ebd83f5b 1
19:X 19 Dec 2022 09:52:03.227 # +odown master mymaster redis-master 6381 #quorum 3/2
19:X 19 Dec 2022 09:52:03.227 # Next failover delay: I will not start a failover before Mon Dec 19 09:52:33 2022
19:X 19 Dec 2022 09:52:04.183 # Failed to resolve hostname 'redis-master'
19:X 19 Dec 2022 09:52:04.416 # +config-update-from sentinel e4b00ae12885c1107f1121db77c5b795ebd83f5b redis-sentinel-4 26384 @ mymaster redis-master 6381
19:X 19 Dec 2022 09:52:04.416 # +switch-master mymaster redis-master 6381 redis-slave-2 6383
19:X 19 Dec 2022 09:52:04.418 * +slave slave redis-slave-1:6382 redis-slave-1 6382 @ mymaster redis-slave-2 6383
19:X 19 Dec 2022 09:52:04.418 * +slave slave redis-slave-3:6384 redis-slave-3 6384 @ mymaster redis-slave-2 6383
19:X 19 Dec 2022 09:52:04.419 * +slave slave redis-slave-2:6383 redis-slave-2 6383 @ mymaster redis-slave-2 6383
19:X 19 Dec 2022 09:52:04.419 * +slave slave redis-slave-4:6385 redis-slave-4 6385 @ mymaster redis-slave-2 6383
19:X 19 Dec 2022 09:52:04.426 # Failed to resolve hostname 'redis-master'
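
The tell-tale entry in the log above is the +slave event in which redis-slave-2 is registered as a replica of the master redis-slave-2. A minimal sketch for spotting such events automatically follows; the regular expression is inferred from the log lines shown here rather than from a formal specification, and the function name `self_replica_events` is hypothetical.

```python
import re

# Pattern inferred from the "+slave" lines in the log above (an assumption,
# not an official grammar): replica host/port, then "@ <name> <master host> <master port>".
SLAVE_EVENT = re.compile(
    r"\+slave slave \S+ (?P<host>\S+) (?P<port>\d+) @ \S+ (?P<mhost>\S+) (?P<mport>\d+)"
)


def self_replica_events(log_text):
    """Return (host, port) for every +slave event whose replica equals its master."""
    hits = []
    for line in log_text.splitlines():
        m = SLAVE_EVENT.search(line)
        if m and (m["host"], m["port"]) == (m["mhost"], m["mport"]):
            hits.append((m["host"], int(m["port"])))
    return hits


# Excerpt copied from the Sentinel-1 log above.
log = """\
19:X 19 Dec 2022 09:52:04.416 # +switch-master mymaster redis-master 6381 redis-slave-2 6383
19:X 19 Dec 2022 09:52:04.418 * +slave slave redis-slave-1:6382 redis-slave-1 6382 @ mymaster redis-slave-2 6383
19:X 19 Dec 2022 09:52:04.419 * +slave slave redis-slave-2:6383 redis-slave-2 6383 @ mymaster redis-slave-2 6383
"""

print(self_replica_events(log))
```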

redis-slave-2 (the new master) logs:

10:S 19 Dec 2022 09:52:01.813 # CONFIG REWRITE executed with success.
10:S 19 Dec 2022 09:52:01.816 * Non blocking connect for SYNC fired the event.
10:S 19 Dec 2022 09:52:01.816 * Master replied to PING, replication can continue...
10:S 19 Dec 2022 09:52:01.817 * Trying a partial resynchronization (request 4b8d917fe71aaa0b00116b2d2cdae19f65aea894:66726131).
10:S 19 Dec 2022 09:52:01.817 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
10:S 19 Dec 2022 09:52:01.970 * Connecting to MASTER redis-slave-2:6383
10:S 19 Dec 2022 09:52:01.971 * MASTER <-> REPLICA sync started
10:S 19 Dec 2022 09:52:01.971 * Non blocking connect for SYNC fired the event.
10:S 19 Dec 2022 09:52:01.972 * Master replied to PING, replication can continue...
10:S 19 Dec 2022 09:52:01.972 * Trying a partial resynchronization (request 4b8d917fe71aaa0b00116b2d2cdae19f65aea894:66726131).
10:S 19 Dec 2022 09:52:01.973 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
10:S 19 Dec 2022 09:52:02.994 * Connecting to MASTER redis-slave-2:6383
10:S 19 Dec 2022 09:52:02.995 * MASTER <-> REPLICA sync started
10:S 19 Dec 2022 09:52:02.996 * Non blocking connect for SYNC fired the event.
10:S 19 Dec 2022 09:52:02.997 * Master replied to PING, replication can continue...
10:S 19 Dec 2022 09:52:02.998 * Trying a partial resynchronization (request 4b8d917fe71aaa0b00116b2d2cdae19f65aea894:66726131).
10:S 19 Dec 2022 09:52:02.999 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
10:S 19 Dec 2022 09:52:04.020 * Connecting to MASTER redis-slave-2:6383
10:S 19 Dec 2022 09:52:04.021 * MASTER <-> REPLICA sync started
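
The log above shows a retry loop: the instance connects to redis-slave-2:6383 (itself), and each PSYNC attempt is answered with -NOMASTERLINK because the node on the other end is a replica that is not connected to a master. A minimal sketch that summarizes such a loop from a log excerpt is shown below; the function name `summarize_sync_attempts` is hypothetical, and the matching relies only on the message texts visible in this log.

```python
import re
from collections import Counter

# Matches the "Connecting to MASTER host:port" message seen in the log above.
CONNECT = re.compile(r"Connecting to MASTER (\S+):(\d+)")


def summarize_sync_attempts(log_text):
    """Count connection attempts per master address and -NOMASTERLINK rejections."""
    attempts = Counter()
    rejected = 0
    for line in log_text.splitlines():
        m = CONNECT.search(line)
        if m:
            attempts[(m.group(1), int(m.group(2)))] += 1
        elif "-NOMASTERLINK" in line:
            rejected += 1
    return attempts, rejected


# Excerpt copied from the redis-slave-2 log above.
log = """\
10:S 19 Dec 2022 09:52:01.970 * Connecting to MASTER redis-slave-2:6383
10:S 19 Dec 2022 09:52:01.973 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
10:S 19 Dec 2022 09:52:02.994 * Connecting to MASTER redis-slave-2:6383
10:S 19 Dec 2022 09:52:02.999 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
"""

attempts, rejected = summarize_sync_attempts(log)
print(attempts, rejected)
```

Repeated attempts against the instance's own address, each followed by a rejection, indicate that the node is stuck replicating from itself rather than acting as master.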

Additional information

Redis version: 7.0.7

Comment From: moticless

Hi @a-m-farahani, this is already fixed on unstable. Please read this issue.

Comment From: a-m-farahani

Thank you. Any chance of this fix being merged into the master branch in the next few days?

Comment From: moticless

What do you mean by "master branch"?

Comment From: a-m-farahani

I mean a stable version: a new minor version, a patch release, or something similar.

Comment From: moticless

@oranagra can you give an estimate, please?

Comment From: koohestani

I think he means: when will this change be included in a release version?

Comment From: enjoy-binbin

#11590

It looks like we did not pick it into the last release (7.0.6 or 7.0.7, or 6.2), but it was marked as backport DONE. This seems to be an oversight; @oranagra needs to take a look at that.

Comment From: oranagra

that's very odd. the ticket is marked for backport as done, and i don't know how it got to this state (sadly there's no audit trail for these changes). it's a fairly new fix, so it should have made it to 7.0.7 and 6.2.8. i see it's missing from the commit log, the release notes, and also the spreadsheet i used to orchestrate that campaign. the only explanation i can think of is that someone moved it to either "done" or "in progress" without me being aware of it.

Comment From: oranagra

anyway, unless this issue is severe enough to justify a release of its own, it'll have to wait till the next batch, which can take a few months (depending on what else we find that justifies a release)

Comment From: enjoy-binbin

Released in 7.0.8