Failover not happening once master gets restarted
After enabling `announce-hostnames` and `resolve-hostnames`, the initial replica setup happens without any issues.
But when the master goes down, the sentinels do not start a failover, and I don't see `+odown` in the sentinel logs, which I do see when running without hostnames.
**To reproduce:** Redis 6.2.4 on 3 pods, with a sentinel on each pod, using the configurations below.
**master.conf**
```
replica-announce-ip <DNS>
```
**slave.conf**
```
replica-announce-ip <SLAVE DNS>
replicaof <MASTER-DNS> <MASTER-PORT>
```
**sentinel.conf**
```
SENTINEL resolve-hostnames yes
SENTINEL announce-hostnames yes
sentinel parallel-syncs mymaster 1
replica-announce-ip
```
Delete the master pod and force it to not start again, observe the failover behavior; then delete the new master and observe the same.
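For anyone reproducing this, it can help to query the sentinels directly and see what they actually resolved and announce. The sketch below assumes the default sentinel port 26379 (not shown in the configs above) and the master name `mymaster` from sentinel.conf:
```bash
# Run from inside any sentinel pod (e.g. kubectl exec -it <sentinel-pod> -- sh).

# Confirm the hostname options are actually active on this sentinel (Redis 6.2+):
redis-cli -p 26379 SENTINEL CONFIG GET resolve-hostnames
redis-cli -p 26379 SENTINEL CONFIG GET announce-hostnames

# Address (hostname or IP) this sentinel currently tracks for the master:
redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster

# Full state, including num-other-sentinels and the s_down/o_down flags:
redis-cli -p 26379 SENTINEL master mymaster
```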
**Expected behavior**
A replica should become master when the master goes down.
**Additional information**
Below are the log entries when hostnames are enabled, where I only see `+sdown`:
18:15:42.038 # +sdown master mymaster redis-0.redis.dev-wb.svc.cluster.local 6379
18:15:42.038 # +sdown sentinel eef584c8262dd5eaa3d8850e6932bfde80839a24 172.1.0.171 26379 @ mymaster redis-0.redis.dev-wb.svc.cluster.local 6379
Below is the log when hostnames were set to false, where I can see `+odown`:
+sdown sentinel 9f8616a7f32bac48b49be67c2fce7039c0d16916 172.31.16.74 26379 @ RedisMaster 172.31.16.74 6379
+sdown master RedisMaster 172.31.16.74 6379
+odown master RedisMaster 172.31.16.74 6379 #quorum 2/2
+new-epoch 3
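Since `+odown` needs agreement from a quorum of sentinels, and the failing log above also shows `+sdown sentinel ...`, it is worth checking whether the sentinels can still see each other when hostnames are enabled. A small sketch, again assuming port 26379 and the master name `mymaster`:
```bash
# Ask one sentinel whether it currently has quorum and a majority to authorize a failover:
redis-cli -p 26379 SENTINEL ckquorum mymaster

# List the other sentinels this one knows about; stale or unreachable entries here
# would explain why +sdown never escalates to +odown:
redis-cli -p 26379 SENTINEL sentinels mymaster
```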
**Comment From: hwware**
@sivanagireddyb
Hi, I tried to reproduce your problem on version 6.2.4, but even with `sentinel resolve-hostnames yes` and `sentinel announce-hostnames yes`, I do get the `+odown` entry in the logs and the replica is promoted to master. Following are my test profile, test steps, and output logs. Let me know if you have any questions, thanks.
**_Part 1. Test Profile -- I have 1 master redis instance, 2 replica instances and 3 sentinels_**
**_master.conf:_**
replica-announce-ip testHost
port 6380
protected-mode no
**_replica1.conf_**
replica-announce-ip testHost
port 6381
protected-mode no
replicaof testHost 6380
**_replica2.conf_**
replica-announce-ip testHost
port 6382
protected-mode no
replicaof testHost 6380
**_sentinel1.conf:_**
sentinel monitor mymaster testHost 6380 2
sentinel down-after-milliseconds mymaster 6000
sentinel failover-timeout mymaster 18000
replica-announce-ip testHost
port 26380
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
**_sentinel2.conf:_**
sentinel monitor mymaster testHost 6380 2
sentinel down-after-milliseconds mymaster 6000
sentinel failover-timeout mymaster 18000
replica-announce-ip testHost
port 26381
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
**_sentinel3.conf:_**
sentinel monitor mymaster testHost 6380 2
sentinel down-after-milliseconds mymaster 6000
sentinel failover-timeout mymaster 18000
replica-announce-ip testHost
port 26382
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
**_Part 2. Test Step_**
**_After the master, replicas, and sentinels have all started up, first shut down sentinel1, wait a few seconds, then shut down the master and never start it again._**
**_Part 3 Output logs:_**
**_Sentinel 2 log:_**
**_3271:X 14 Jul 2021 12:02:33.033 # +sdown master mymaster testHost 6380_**
**_3271:X 14 Jul 2021 12:02:33.107 # +odown master mymaster testHost 6380 #quorum 2/2_**
3271:X 14 Jul 2021 12:02:33.107 # +new-epoch 1
3271:X 14 Jul 2021 12:02:33.107 # +try-failover master mymaster testHost 6380
3271:X 14 Jul 2021 12:02:33.116 # +vote-for-leader d82a59bbe49eb5b98607f9c3c56eaab8d137d1c8 1
3271:X 14 Jul 2021 12:02:33.131 # ee04e963aa001bf423055f20af65ad3630d8c3e7 voted for d82a59bbe49eb5b98607f9c3c56eaab8d137d1c8 1
3271:X 14 Jul 2021 12:02:33.188 # +elected-leader master mymaster testHost 6380
3271:X 14 Jul 2021 12:02:33.188 # +failover-state-select-slave master mymaster testHost 6380
3271:X 14 Jul 2021 12:02:33.264 # +selected-slave slave testHost:6381 testHost 6381 @ mymaster testHost 6380
3271:X 14 Jul 2021 12:02:33.264 * +failover-state-send-slaveof-noone slave testHost:6381 testHost 6381 @ mymaster testHost 6380
3271:X 14 Jul 2021 12:02:33.332 * +failover-state-wait-promotion slave testHost:6381 testHost 6381 @ mymaster testHost 6380
3271:X 14 Jul 2021 12:02:34.188 # +promoted-slave slave testHost:6381 testHost 6381 @ mymaster testHost 6380
3271:X 14 Jul 2021 12:02:34.189 # +failover-state-reconf-slaves master mymaster testHost 6380
3271:X 14 Jul 2021 12:02:34.243 * +slave-reconf-sent slave testHost:6382 testHost 6382 @ mymaster testHost 6380
3271:X 14 Jul 2021 12:02:35.196 * +slave-reconf-inprog slave testHost:6382 testHost 6382 @ mymaster testHost 6380
3271:X 14 Jul 2021 12:02:35.196 * +slave-reconf-done slave testHost:6382 testHost 6382 @ mymaster testHost 6380
3271:X 14 Jul 2021 12:02:35.283 # -odown master mymaster testHost 6380
3271:X 14 Jul 2021 12:02:35.283 # +failover-end master mymaster testHost 6380
**_3271:X 14 Jul 2021 12:02:35.283 # +switch-master mymaster testHost 6380 testHost 6381_**
3271:X 14 Jul 2021 12:02:35.283 * +slave slave testHost:6382 testHost 6382 @ mymaster testHost 6381
3271:X 14 Jul 2021 12:02:35.283 * +slave slave testHost:6380 testHost 6380 @ mymaster testHost 6381
3271:X 14 Jul 2021 12:02:41.315 # +sdown slave testHost:6380 testHost 6380 @ mymaster testHost 6381
**_Sentinel 3 log:_**
**_3276:X 14 Jul 2021 12:02:32.938 # +sdown master mymaster testHost 6380_**
3276:X 14 Jul 2021 12:02:33.124 # +new-epoch 1
3276:X 14 Jul 2021 12:02:33.131 # +vote-for-leader d82a59bbe49eb5b98607f9c3c56eaab8d137d1c8 1
**_3276:X 14 Jul 2021 12:02:34.075 # +odown master mymaster testHost 6380 #quorum 2/2_**
3276:X 14 Jul 2021 12:02:34.075 # Next failover delay: I will not start a failover before Wed Jul 14 12:03:09 2021
3276:X 14 Jul 2021 12:02:34.243 # +config-update-from sentinel d82a59bbe49eb5b98607f9c3c56eaab8d137d1c8 127.0.0.1 26381 @ mymaster testHost 6380
**_3276:X 14 Jul 2021 12:02:34.243 # +switch-master mymaster testHost 6380 testHost 6381_**
3276:X 14 Jul 2021 12:02:34.243 * +slave slave testHost:6382 testHost 6382 @ mymaster testHost 6381
3276:X 14 Jul 2021 12:02:34.243 * +slave slave testHost:6380 testHost 6380 @ mymaster testHost 6381
3276:X 14 Jul 2021 12:02:40.272 # +sdown slave testHost:6380 testHost 6380 @ mymaster testHost 6381
From the log, we can see that the replica testHost:6381 is promoted to master.
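A quick way to double-check the promotion from the client side; this is just a sketch based on the ports in the test profile above (sentinel 2 on 26381, the promoted replica on 6381):
```bash
# Sentinel 2 should now return testHost 6381 as the master address:
redis-cli -p 26381 SENTINEL get-master-addr-by-name mymaster

# The promoted instance should report itself as master:
redis-cli -h testHost -p 6381 INFO replication | grep role
```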
**Comment From: SivaBu-kore**
@hwware use DNS names/hostnames instead of IPs
**Comment From: hwware**
> @hwware use DNS names/hostnames instead of IPs
Hi @SivaBu-kore, I updated the config files and replaced the IPs with hostnames. After shutting down one of the sentinels and the master, the replica could still be promoted to master. Please verify my reproduction steps and let me know if you have any concerns.
Thanks
**Comment From: SivaBu-kore**
@hwware I have tested with 1 Master & 1 Replica, while you are testing with 1 Master & 2 Replicas. I'm not sure whether that is the issue.
**Comment From: hwware**
@SivaBu-kore Following is my test result with 1 Master & 1 Replica; it looks like the replica can still be promoted to master.
@sivanagireddyb Please check the following log with 1 Master & 1 Replica.
Please let me know if any of the test steps are not consistent with yours, thanks.
**_Part 1. Test Profile -- I have 1 master redis instance, 1 replica instance and 3 sentinels_**
**_master.conf:_**
replica-announce-ip testHost
port 6380
protected-mode no
**_replica1.conf_**
replica-announce-ip testHost
port 6381
protected-mode no
replicaof testHost 6380
**_sentinel1.conf:_**
sentinel monitor mymaster testHost 6380 2
sentinel down-after-milliseconds mymaster 6000
sentinel failover-timeout mymaster 18000
replica-announce-ip testHost
port 26380
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
**_sentinel2.conf:_**
sentinel monitor mymaster testHost 6380 2
sentinel down-after-milliseconds mymaster 6000
sentinel failover-timeout mymaster 18000
replica-announce-ip testHost
port 26381
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
**_sentinel3.conf:_**
sentinel monitor mymaster testHost 6380 2
sentinel down-after-milliseconds mymaster 6000
sentinel failover-timeout mymaster 18000
replica-announce-ip testHost
port 26382
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
**Part 2. Test Step**
After the master, replica, and sentinels have all started up, **_first shut down sentinel1, wait a few seconds, then shut down the master_** and never start it again.
**Part 3 Output logs:**
**_Sentinel 2 log:_**
**_2495:X 15 Jul 2021 09:45:01.584 # +sdown master mymaster testHost 6380_**
2495:X 15 Jul 2021 09:45:01.695 # +new-epoch 1
2495:X 15 Jul 2021 09:45:01.701 # +vote-for-leader 72c18be5a45f6a5fed5bc9d1e1ab6067510ad4a4 1
**_2495:X 15 Jul 2021 09:45:02.724 # +odown master mymaster testHost 6380 #quorum 2/2_**
2495:X 15 Jul 2021 09:45:02.724 # Next failover delay: I will not start a failover before Thu Jul 15 09:45:38 2021
2495:X 15 Jul 2021 09:45:02.952 # +config-update-from sentinel 72c18be5a45f6a5fed5bc9d1e1ab6067510ad4a4 127.0.0.1 26382 @ mymaster testHost 6380
2495:X 15 Jul 2021 09:45:02.952 # +switch-master mymaster testHost 6380 testHost 6381
2495:X 15 Jul 2021 09:45:02.953 * +slave slave testHost:6380 testHost 6380 @ mymaster testHost 6381
2495:X 15 Jul 2021 09:45:09.053 # +sdown slave testHost:6380 testHost 6380 @ mymaster testHost 6381
**_Sentinel 3 log:_**
**_2500:X 15 Jul 2021 09:45:01.588 # +sdown master mymaster testHost 6380_**
**_2500:X 15 Jul 2021 09:45:01.660 # +odown master mymaster testHost 6380 #quorum 2/2_**
2500:X 15 Jul 2021 09:45:01.660 # +new-epoch 1
2500:X 15 Jul 2021 09:45:01.660 # +try-failover master mymaster testHost 6380
2500:X 15 Jul 2021 09:45:01.687 # +vote-for-leader 72c18be5a45f6a5fed5bc9d1e1ab6067510ad4a4 1
2500:X 15 Jul 2021 09:45:01.701 # c6a6398e2873710278c4a5ea5ed14aea0fe98465 voted for 72c18be5a45f6a5fed5bc9d1e1ab6067510ad4a4 1
2500:X 15 Jul 2021 09:45:01.744 # +elected-leader master mymaster testHost 6380
2500:X 15 Jul 2021 09:45:01.744 # +failover-state-select-slave master mymaster testHost 6380
2500:X 15 Jul 2021 09:45:01.841 # +selected-slave slave testHost:6381 testHost 6381 @ mymaster testHost 6380
2500:X 15 Jul 2021 09:45:01.841 * +failover-state-send-slaveof-noone slave testHost:6381 testHost 6381 @ mymaster testHost 6380
2500:X 15 Jul 2021 09:45:01.908 * +failover-state-wait-promotion slave testHost:6381 testHost 6381 @ mymaster testHost 6380
2500:X 15 Jul 2021 09:45:02.883 # +promoted-slave slave testHost:6381 testHost 6381 @ mymaster testHost 6380
2500:X 15 Jul 2021 09:45:02.883 # +failover-state-reconf-slaves master mymaster testHost 6380
2500:X 15 Jul 2021 09:45:02.945 # +failover-end master mymaster testHost 6380
2500:X 15 Jul 2021 09:45:02.945 # +switch-master mymaster testHost 6380 testHost 6381
2500:X 15 Jul 2021 09:45:02.945 * +slave slave testHost:6380 testHost 6380 @ mymaster testHost 6381
2500:X 15 Jul 2021 09:45:08.979 # +sdown slave testHost:6380 testHost 6380 @ mymaster testHost 6381
**Comment From: thed0ct0r**
@hwware I am also running into this issue while attempting to use Sentinel in a Docker Swarm environment (where the internal DNS resolves task names to addresses, because IPs may change; running on an overlay network, where NAT should not be an issue since all ports are open and mapped 1:1).
Below is the simplest compose file, basically a copy of your suggested configuration; the issue reproduces both on a single machine and on a cluster.
```yaml
version: '3.8'

services:
  master:
    image: redis:alpine
    deploy:
      replicas: 1
      restart_policy:
        delay: 5s
    networks:
      - redis-net
    hostname: master
    volumes:
      - type: volume
        source: master-data
        target: /data
    entrypoint:
      - "sh"
      - "-c"
      - |
        cat <

  replica:
    image: redis:alpine
    deploy:
      replicas: 1
      restart_policy:
        delay: 5s
    networks:
      - redis-net
    hostname: replica
    volumes:
      - type: volume
        source: replica-data
        target: /data
    environment:
      - REPLICA_PRIORITY={{.Task.Slot}}
      - REPLICA_ADDR={{.Task.Name}}
    entrypoint:
      - "sh"
      - "-c"
      - |
        cat <

  sentinel:
    image: redis:alpine
    deploy:
      replicas: 3
      restart_policy:
        delay: 5s
    networks:
      - redis-net
    hostname: sentinel-{{.Task.Slot}}
    environment:
      - SENTINEL_ADDR={{.Task.Name}}
    entrypoint:
      - "sh"
      - "-c"
      - |
        cat <

networks:
  redis-net:
    driver: overlay
    attachable: true

volumes:
  master-data:
    name: 'redis_volume_{{.Service.Name}}{{.Task.Slot}}'
  replica-data:
    name: 'redis_volume_{{.Service.Name}}_{{.Task.Slot}}'
```
Having `sentinel announce-hostnames yes` results in the sentinels emitting the `+sdown` message for the master but never moving on to `+odown` and a failover.
Simply changing `sentinel announce-hostnames` to `no` in the compose file fixes this without any other change.
That is a problem, though, because it would require pinning the master's IP address, and in case of a machine failure the old master would never be re-added to the set (there is no record of the original master in the sentinel's config/history), not even as a replica.
you can run the compose file using `docker stack deploy -c sentinel-compose.yml test` with docker running in swarm mode
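To see what each sentinel actually stored for the master after a failover attempt, something like the following can be used. This is a sketch: the stack name `test` comes from the deploy command above, while port 26379 and the master name `mymaster` are assumptions, since the heredoc contents are cut off in the compose file above.
```bash
# Pick one sentinel task container of the "test" stack:
SENTINEL_CONTAINER=$(docker ps -q --filter name=test_sentinel | head -n 1)

# Dump that sentinel's view of the monitored master (assumed name "mymaster"):
docker exec "$SENTINEL_CONTAINER" redis-cli -p 26379 SENTINEL master mymaster

# If this still returns the old task IP instead of a reachable address,
# it matches the behaviour described above:
docker exec "$SENTINEL_CONTAINER" redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster
```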
**Comment From: troyanov**
I'm having the same issue and I think I was able to localize it.
---
Here is my investigation from running the `alpine`-based image on k8s as a StatefulSet.
Check the IP on the `redis-0.redis-headless.default.svc.cluster.local` pod:
```bash
/data # ifconfig
eth0 Link encap:Ethernet
inet addr:10.244.0.161
```
Starting the redis-sentinel pod, we can see it connects to redis-0 right away:
/data # redis-cli
127.0.0.1:6379> monitor
OK
1629506678.413582 [0 10.244.0.177:53103] "PING"
1629506678.413711 [0 10.244.0.177:53103] "INFO"
1629506678.413838 [0 10.244.0.177:42955] "SUBSCRIBE" "__sentinel__:hello"
1629506679.513769 [0 10.244.0.177:53103] "PING"
1629506680.463244 [0 10.244.0.177:53103] "PUBLISH" "__sentinel__:hello" "10.244.0.177,26379,e51cc7822a48f34ae85f8cf0faa8f0a75ff6a243,0,mymaster,redis-0.redis-headless.default.svc.cluster.local,6379,0"
1629506680.534972 [0 10.244.0.177:53103] "PING"
1629506681.536829 [0 10.244.0.177:53103] "PING"
1629506682.512930 [0 10.244.0.177:53103] "PUBLISH" "__sentinel__:hello" "10.244.0.177,26379,e51cc7822a48f34ae85f8cf0faa8f0a75ff6a243,0,mymaster,redis-0.redis-headless.default.svc.cluster.local,6379,0"
1629506682.579558 [0 10.244.0.177:53103] "PING"
Checking the networking info for our redis-sentinel-0 pod:
❯ kubectl exec -it redis-sentinel-0 -- /bin/sh
Defaulted container "redis-sentinel" out of: redis-sentinel, config (init)
/data # netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:42955 redis-0.redis-headless.default.svc.cluster.local:redis ESTABLISHED
tcp 0 0 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:34497 redis-1.redis-headless.default.svc.cluster.local:redis ESTABLISHED
tcp 0 0 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:59095 redis-1.redis-headless.default.svc.cluster.local:redis TIME_WAIT
tcp 0 0 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:53103 redis-0.redis-headless.default.svc.cluster.local:redis ESTABLISHED
tcp 0 0 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:39105 redis-1.redis-headless.default.svc.cluster.local:redis ESTABLISHED
tcp 0 0 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:41515 redis-1.redis-headless.default.svc.cluster.local:redis TIME_WAIT
Active UNIX domain sockets (w/o servers)
Proto RefCnt Flags Type State I-Node Path
Now we kill the redis-0 pod with `kubectl delete pod redis-0` and check its new IP address:
/data # ifconfig
eth0 Link encap:Ethernet
inet addr:10.244.0.237
Now let's check the networking info for our redis-sentinel-0 pod one more time:
❯ kubectl exec -it redis-sentinel-0 -- /bin/sh
Defaulted container "redis-sentinel" out of: redis-sentinel, config (init)
/data # netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:34497 redis-1.redis-headless.default.svc.cluster.local:redis ESTABLISHED
tcp 0 0 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:59095 redis-1.redis-headless.default.svc.cluster.local:redis TIME_WAIT
tcp 0 0 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:39105 redis-1.redis-headless.default.svc.cluster.local:redis ESTABLISHED
tcp 0 0 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:41515 redis-1.redis-headless.default.svc.cluster.local:redis TIME_WAIT
tcp 0 1 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:55851 10.244.0.161:redis SYN_SENT
tcp 0 1 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:52127 10.244.0.161:redis SYN_SENT
Active UNIX domain sockets (w/o servers)
Proto RefCnt Flags Type State I-Node Path
Note those hanging SYN_SENT connections that are still using 10.244.0.161 (which was the initial IP of redis-0):
tcp 0 1 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:55851 10.244.0.161:redis SYN_SENT
tcp 0 1 redis-sentinel-0.redis-sentinel.default.svc.cluster.local:52127 10.244.0.161:redis SYN_SENT
In the end, after killing pods several times, it looks like this:
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 1 10.244.0.211:40581 10.244.0.201:6379 SYN_SENT 1/redis-sentinel 0.
tcp 0 1 10.244.0.211:41353 10.244.1.38:6379 SYN_SENT 1/redis-sentinel 0.
tcp 0 1 10.244.0.211:44925 10.244.0.201:6379 SYN_SENT 1/redis-sentinel 0.
tcp 0 0 10.244.0.211:26379 10.244.0.234:46771 ESTABLISHED 1/redis-sentinel 0.
tcp 0 0 10.244.0.211:26379 10.244.1.10:33485 ESTABLISHED 1/redis-sentinel 0.
tcp 0 0 10.244.0.211:33909 10.244.1.10:26379 ESTABLISHED 1/redis-sentinel 0.
tcp 0 0 10.244.0.211:54913 10.244.0.234:26379 ESTABLISHED 1/redis-sentinel 0.
tcp 0 1 10.244.0.211:36375 10.244.1.38:6379 SYN_SENT 1/redis-sentinel 0.
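One way to confirm that DNS itself is up to date and only the cached address is stale (a sketch, reusing the StatefulSet names from above; `nslookup` and `netstat` come from busybox in the alpine image):
```bash
# What the headless-service DNS name resolves to right now:
kubectl exec redis-sentinel-0 -- nslookup redis-0.redis-headless.default.svc.cluster.local

# ...versus the stale addresses the sentinel is still dialing (the SYN_SENT rows above):
kubectl exec redis-sentinel-0 -- netstat -n | grep SYN_SENT
```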
**Comment From: troyanov**
It looks like Sentinel memorizes the IP after the hostname lookup and keeps trying to connect to that address.
Here 10.244.1.32 is the IP address that used to be valid for redis-1 before that pod was restarted (after a restart, the DNS name of a StatefulSet pod stays the same, but its IP address is not preserved):
root@redis-sentinel-0:/data# netstat -etd
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State User Inode
tcp 0 0 redis-sentinel-0.:48011 redis-0.redis-head:6379 ESTABLISHED root 64189289
tcp 0 0 redis-sentinel-0.:52896 151.101.18.132:80 TIME_WAIT root 0
tcp 0 0 redis-sentinel-0.:43725 10-244-1-81.redis:26379 ESTABLISHED root 64185624
tcp 0 0 redis-sentinel-0.:47791 redis-0.redis-head:6379 ESTABLISHED root 64189288
tcp 0 1 redis-sentinel-0.:44463 10.244.1.32:6379 SYN_SENT root 66661160
tcp 0 0 redis-sentinel-0.:54909 10-244-0-194.redi:26379 ESTABLISHED root 64184651
tcp 0 0 redis-sentinel-0.:26379 10-244-0-194.redi:55715 ESTABLISHED root 64184635
tcp 0 0 redis-sentinel-0.:26379 10-244-1-81.redis:54561 ESTABLISHED root 64184707
tcp 0 1 redis-sentinel-0.:36145 10.244.1.32:6379 SYN_SENT root 66661984
root@redis-sentinel-0:/data# tcpdump host 10.244.1.32
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:24:20.482923 IP redis-sentinel-0.redis-sentinel.default.svc.cluster.local.36937 > 10.244.1.32.6379: Flags [S], seq 689568883, win 64240, options [mss 1460,sackOK,TS val 567563557 ecr 0,nop,wscale 7], length 0
23:24:25.165632 IP redis-sentinel-0.redis-sentinel.default.svc.cluster.local.53155 > 10.244.1.32.6379: Flags [S], seq 135490147, win 64240, options [mss 1460,sackOK,TS val 567568239 ecr 0,nop,wscale 7], length 0
23:24:26.178933 IP redis-sentinel-0.redis-sentinel.default.svc.cluster.local.53155 > 10.244.1.32.6379: Flags [S], seq 135490147, win 64240, options [mss 1460,sackOK,TS val 567569253 ecr 0,nop,wscale 7], length 0
23:24:28.194907 IP redis-sentinel-0.redis-sentinel.default.svc.cluster.local.53155 > 10.244.1.32.6379: Flags [S], seq 135490147, win 64240, options [mss 1460,sackOK,TS val 567571269 ecr 0,nop,wscale 7], length 0
23:24:28.433471 IP redis-sentinel-0.redis-sentinel.default.svc.cluster.local.43915 > 10.244.1.32.6379: Flags [S], seq 4200356803, win 64240, options [mss 1460,sackOK,TS val 567571507 ecr 0,nop,wscale 7], length 0
23:24:29.442940 IP redis-sentinel-0.redis-sentinel.default.svc.cluster.local.43915 > 10.244.1.32.6379: Flags [S], seq 4200356803, win 64240, options [mss 1460,sackOK,TS val 567572517 ecr 0,nop,wscale 7], length 0
23:24:31.458932 IP redis-sentinel-0.redis-sentinel.default.svc.cluster.local.43915 > 10.244.1.32.6379: Flags [S], seq 4200356803, win 64240, options [mss 1460,sackOK,TS val 567574533 ecr 0,nop,wscale 7], length 0
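To cross-check this from Sentinel itself rather than from the kernel, one can compare the address Sentinel reports for the replica with what DNS returns now. A sketch; the master name `mymaster` is taken from the `__sentinel__:hello` messages earlier in this thread, and `getent` availability depends on the image:
```bash
# Run inside the sentinel pod. Addresses Sentinel currently reports for the replicas:
redis-cli -p 26379 SENTINEL replicas mymaster | grep -A1 '"ip"'

# What the StatefulSet DNS name resolves to right now:
getent hosts redis-1.redis-headless.default.svc.cluster.local
```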
**Comment From: troyanov**
I'm not familiar with the Redis codebase, but it seems like it indeed stores ip:port after doing the hostname lookup.
E.g. https://github.com/redis/redis/blob/63e2a6d212e9a30d9768b1d044348420e5b128c9/src/sentinel.c#L1427-L1437
**Comment From: bkhuong**
Also running into this issue, any resolution?
**Comment From: amitgoyal14**
I am also facing the same issue. After using a hostname in place of an IP, failover is not happening. Can you please share any fix?
**Comment From: amitgoyal14**
@hwware why have you used the same hostname 'testHost' for all replicas?
**Comment From: moticless**
Indeed, even though Sentinel uses hostnames, it unfortunately doesn't support dynamic IPs. I will evaluate the required effort to support it. Thanks.
**Comment From: moticless**
Hope this PR will fix it. Thanks
**Comment From: M-Kepler**
I have solved this problem by changing the image from redis:alpine to redis:latest, and it works.