Hello @yossigo, thank you for adding #8282; this is a much-awaited PR.
We have spent some time testing the feature on redis:6.2-rc3 from Docker Hub with the configuration suggested here: #8282 (comment).
At the outset, Sentinel returns the hostname of the shard master when we query SENTINEL get-master-addr-by-name mymaster, but after we trigger a failover it starts returning IPs again. Isn't the expectation that it returns the hostname?
Here is the YAML spec we used, in case it helps you reproduce.
Redis
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 5
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      initContainers:
      - name: config
        image: redis:6.2-rc3
        command: [ "bash", "-c" ]
        args:
          - |
            cp /tmp/redis/redis.conf /etc/redis/redis.conf
            MASTER_FQDN=redis-0.redis.REDACTED
            POD_FQDN=$(hostname -f)
            echo "replica-announce-ip $POD_FQDN" >> /etc/redis/redis.conf
            echo "replica-announce-port 6379" >> /etc/redis/redis.conf
            if [ "$POD_FQDN" = "$MASTER_FQDN" ]; then
              echo "this is master, not updating config..."
            else
              echo "updating replica redis.conf..."
              echo "replicaof $MASTER_FQDN 6379" >> /etc/redis/redis.conf
            fi
            cat /etc/redis/redis.conf
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis/
        - name: config
          mountPath: /tmp/redis/
      containers:
      - name: redis
        image: redis:6.2-rc3
        command: ["redis-server"]
        args: ["/etc/redis/redis.conf"]
        ports:
        - containerPort: 6379
          name: redis
        volumeMounts:
        - name: data
          mountPath: /data
        - name: redis-config
          mountPath: /etc/redis/
      volumes:
      - name: redis-config
        emptyDir: {}
      - name: config
        configMap:
          name: redis-config
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: scaleio
      resources:
        requests:
          storage: 100Mi
---
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  clusterIP: None
  ports:
  - port: 6379
    targetPort: 6379
    name: redis
  selector:
    app: redis
Sentinel
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sentinel
spec:
  serviceName: sentinel
  replicas: 3
  selector:
    matchLabels:
      app: sentinel
  template:
    metadata:
      labels:
        app: sentinel
    spec:
      initContainers:
      - name: config
        image: redis:6.2-rc3
        command: [ "sh", "-c" ]
        args:
          - |
            REDIS_PASSWORD=testpassword
            MASTER_FQDN=redis-0.redis.REDACTED
            POD_FQDN=$(hostname -f)
            echo "port 26379
            protected-mode no
            sentinel resolve-hostnames yes
            sentinel announce-hostnames yes
            sentinel announce-ip $POD_FQDN
            sentinel announce-port 26379
            sentinel monitor mymaster $MASTER_FQDN 6379 2
            sentinel down-after-milliseconds mymaster 5000
            sentinel failover-timeout mymaster 60000
            sentinel parallel-syncs mymaster 1
            sentinel auth-pass mymaster $REDIS_PASSWORD
            " > /etc/redis/sentinel.conf
            cat /etc/redis/sentinel.conf
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis/
      containers:
      - name: sentinel
        image: redis:6.2-rc3
        command: ["redis-server", "/etc/redis/sentinel.conf", "--sentinel"]
        ports:
        - containerPort: 26379
          name: sentinel
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis/
        - name: data
          mountPath: /data
      volumes:
      - name: redis-config
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: scaleio
      resources:
        requests:
          storage: 100Mi
---
apiVersion: v1
kind: Service
metadata:
  name: sentinel
spec:
  clusterIP: None
  ports:
  - port: 26379
    targetPort: 26379
    name: sentinel
  selector:
    app: sentinel
Comment From: yossigo
@satheeshaGowda I took a quick look (I didn't have much bandwidth for it) and could not reproduce this. According to your configuration, all instances (including masters) should have replica-announce-ip set to their hostname, but can you confirm this is really the case?
Normally Sentinel should retain the hostnames it receives and use them, so if you're getting an IP I'd first suspect its origin is external.
It might also be useful to take a look at Sentinel logs and the output of INFO replication from all instances.
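One way to compare what each instance reports is to pull the replica addresses out of the INFO replication text. The following is a minimal sketch, not part of Redis: `replica_addrs` is a hypothetical helper name, and it assumes the `slaveN:ip=...,port=...` line format shown in the INFO replication output in this thread.

```shell
# Hypothetical helper: print the address each replica is registered under,
# one per line, given the text of `INFO replication` on standard input.
replica_addrs() {
  # Matching lines look like: slave0:ip=<addr>,port=6379,state=online,...
  # Everything between "ip=" and the first comma is the registered address.
  sed -n 's/^slave[0-9][0-9]*:ip=\([^,]*\),.*$/\1/p'
}
```

Assuming the pod names from the manifests above, it could be fed with something like `kubectl exec redis-0 -- redis-cli info replication | replica_addrs`. A hostname in the output would suggest replica-announce-ip took effect; a bare IP would suggest the master fell back to the connection's peer address.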
Comment From: satheeshaGowda
Yes, as you can see in the YAML spec above, replica-announce-ip was also set for the Redis master.
Redis Master Initial Conf
# Generated by CONFIG REWRITE
replica-announce-ip redis-0.redis.REDACTED
replica-announce-port 6379
root@redis-0:/data#
Redis Replica Initial Conf
# Generated by CONFIG REWRITE
replica-announce-ip redis-1.redis.REDACTED
replica-announce-port 6379
replicaof redis-0.redis.REDACTED 6379
Sentinel initial Conf
root@sentinel-0:/data# cat /etc/redis/sentinel.conf
port 26379
protected-mode no
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
sentinel announce-ip "sentinel-0.redis.REDACTED"
sentinel announce-port 26379
sentinel monitor mymaster redis-0.redis.REDACTED 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel auth-pass mymaster testpassword
# Generated by CONFIG REWRITE
user default on nopass ~* &* +@all
dir "/data"
sentinel myid 8d757184d740e8ffca7de4400d7b510861562980
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
sentinel current-epoch 0
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
sentinel known-replica mymaster IP_OF_REDIS_1_HERE 6379
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
sentinel known-sentinel mymaster sentinel-1.redis.REDACTED 26379 6c83d7d3d83253d6062d918c3296ebc57c86426b
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
sentinel known-sentinel mymaster sentinel-2.redis.REDACTED 26379 31c32ff1f24e173f04b5f93801f627574df37d11
Redis Master logs
kubectl logs -f redis-0
1:C 18 Feb 2021 20:23:10.789 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 18 Feb 2021 20:23:10.789 # Redis version=6.1.242, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 18 Feb 2021 20:23:10.789 # Configuration loaded
1:M 18 Feb 2021 20:23:10.789 * monotonic clock: POSIX clock_gettime
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 6.1.242 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in standalone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 1
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'
1:M 18 Feb 2021 20:23:10.790 # Server initialized
1:M 18 Feb 2021 20:23:10.792 * Reading RDB preamble from AOF file...
1:M 18 Feb 2021 20:23:10.792 * Loading RDB produced by version 6.0.10
1:M 18 Feb 2021 20:23:10.792 * RDB age 68910 seconds
1:M 18 Feb 2021 20:23:10.792 * RDB memory usage when created 1.79 Mb
1:M 18 Feb 2021 20:23:10.792 * RDB has an AOF tail
1:M 18 Feb 2021 20:23:10.792 * Reading the remaining AOF tail...
1:M 18 Feb 2021 20:23:10.792 * DB loaded from append only file: 0.001 seconds
1:M 18 Feb 2021 20:23:10.792 * Ready to accept connections
1:M 18 Feb 2021 20:23:23.585 * Replica IP_OF_REDIS_1_HERE:6379 asks for synchronization
1:M 18 Feb 2021 20:23:23.585 * Full resync requested by replica IP_OF_REDIS_1_HERE:6379
1:M 18 Feb 2021 20:23:23.585 * Replication backlog created, my new replication IDs are '2a1711e52d7ac391529ecd4788431300fa7e7db2' and '0000000000000000000000000000000000000000'
1:M 18 Feb 2021 20:23:23.585 * Starting BGSAVE for SYNC with target: disk
1:M 18 Feb 2021 20:23:23.586 * Background saving started by pid 11
11:C 18 Feb 2021 20:23:23.592 * DB saved on disk
11:C 18 Feb 2021 20:23:23.593 * RDB: 0 MB of memory used by copy-on-write
1:M 18 Feb 2021 20:23:23.616 * Background saving terminated with success
1:M 18 Feb 2021 20:23:23.616 * Synchronization with replica IP_OF_REDIS_1_HERE:6379 succeeded
Redis Replica logs
➜ ~ kubectl logs -f redis-1
1:C 18 Feb 2021 20:23:22.646 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 18 Feb 2021 20:23:22.646 # Redis version=6.1.242, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 18 Feb 2021 20:23:22.646 # Configuration loaded
1:S 18 Feb 2021 20:23:22.647 * monotonic clock: POSIX clock_gettime
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 6.1.242 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in standalone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 1
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'
1:S 18 Feb 2021 20:23:22.648 # Server initialized
1:S 18 Feb 2021 20:23:22.650 * Reading RDB preamble from AOF file...
1:S 18 Feb 2021 20:23:22.650 * Loading RDB produced by version 6.0.10
1:S 18 Feb 2021 20:23:22.650 * RDB age 68752 seconds
1:S 18 Feb 2021 20:23:22.650 * RDB memory usage when created 1.79 Mb
1:S 18 Feb 2021 20:23:22.650 * RDB has an AOF tail
1:S 18 Feb 2021 20:23:22.650 * Reading the remaining AOF tail...
1:S 18 Feb 2021 20:23:22.650 * DB loaded from append only file: 0.001 seconds
1:S 18 Feb 2021 20:23:22.650 * Ready to accept connections
1:S 18 Feb 2021 20:23:23.552 * Connecting to MASTER redis-0.redis.REDACTED:6379
1:S 18 Feb 2021 20:23:23.555 * MASTER <-> REPLICA sync started
1:S 18 Feb 2021 20:23:23.580 * Non blocking connect for SYNC fired the event.
1:S 18 Feb 2021 20:23:23.581 * Master replied to PING, replication can continue...
1:S 18 Feb 2021 20:23:23.583 * (Non critical) Master does not understand REPLCONF ip-address: -ERR REPLCONF ip-address provided by replica instance is too long: 71 bytes
1:S 18 Feb 2021 20:23:23.583 * Partial resynchronization not possible (no cached master)
1:S 18 Feb 2021 20:23:23.585 * Full resync from master: 2a1711e52d7ac391529ecd4788431300fa7e7db2:0
1:S 18 Feb 2021 20:23:23.615 * MASTER <-> REPLICA sync: receiving 200 bytes from master to disk
1:S 18 Feb 2021 20:23:23.615 * MASTER <-> REPLICA sync: Flushing old data
1:S 18 Feb 2021 20:23:23.616 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 18 Feb 2021 20:23:23.621 * Loading RDB produced by version 6.1.242
1:S 18 Feb 2021 20:23:23.621 * RDB age 0 seconds
1:S 18 Feb 2021 20:23:23.621 * RDB memory usage when created 1.83 Mb
1:S 18 Feb 2021 20:23:23.621 * MASTER <-> REPLICA sync: Finished with success
1:S 18 Feb 2021 20:23:23.621 * Background append only file rewriting started by pid 12
1:S 18 Feb 2021 20:23:23.649 * AOF rewrite child asks to stop sending diffs.
12:C 18 Feb 2021 20:23:23.649 * Parent agreed to stop sending diffs. Finalizing AOF...
12:C 18 Feb 2021 20:23:23.649 * Concatenating 0.00 MB of AOF diff received from parent.
12:C 18 Feb 2021 20:23:23.649 * SYNC append only file rewrite performed
12:C 18 Feb 2021 20:23:23.649 * AOF rewrite: 0 MB of memory used by copy-on-write
1:S 18 Feb 2021 20:23:23.655 * Background AOF rewrite terminated with success
1:S 18 Feb 2021 20:23:23.655 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 18 Feb 2021 20:23:23.655 * Background AOF rewrite finished successfully
Sentinel logs
➜ ~ kubectl logs -f sentinel-0
1:X 18 Feb 2021 20:23:27.147 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:X 18 Feb 2021 20:23:27.147 # Redis version=6.1.242, bits=64, commit=00000000, modified=0, pid=1, just started
1:X 18 Feb 2021 20:23:27.147 # Configuration loaded
1:X 18 Feb 2021 20:23:27.148 * monotonic clock: POSIX clock_gettime
1:X 18 Feb 2021 20:23:27.149 * Running mode=sentinel, port=26379.
1:X 18 Feb 2021 20:23:27.151 # Sentinel ID is 8d757184d740e8ffca7de4400d7b510861562980
1:X 18 Feb 2021 20:23:27.151 # +monitor master mymaster redis-0.redis.REDACTED 6379 quorum 2
1:X 18 Feb 2021 20:23:27.165 * +slave slave IP_OF_REDIS_1_HERE:6379 IP_OF_REDIS_1_HERE 6379 @ mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:23:38.468 * +sentinel sentinel 6c83d7d3d83253d6062d918c3296ebc57c86426b sentinel-1.redis.REDACTED 26379 @ mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:23:57.263 * +sentinel sentinel 31c32ff1f24e173f04b5f93801f627574df37d11 sentinel-2.redis.REDACTED 26379 @ mymaster redis-0.redis.REDACTED 6379
Sentinel Info sentinel
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=redis-0.redis.REDACTED:6379,slaves=1,sentinels=3
Redis Master Info replication
127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=IP_OF_REDIS_1_HERE,port=6379,state=online,offset=667975,lag=1
master_failover_state:no-failover
master_replid:2a1711e52d7ac391529ecd4788431300fa7e7db2
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:667975
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:667975
Redis Replica Info replication
127.0.0.1:6379> info replication
# Replication
role:slave
master_host:redis-0.redis.REDACTED
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:699517
slave_priority:100
slave_read_only:1
connected_slaves:0
master_failover_state:no-failover
master_replid:2a1711e52d7ac391529ecd4788431300fa7e7db2
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:699517
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:699517
Query Sentinel for master
127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "redis-0.redis.REDACTED"
2) "6379"
Query Sentinel for replicas
127.0.0.1:26379> sentinel slaves mymaster
1) 1) "name"
2) "IP_OF_REDIS_1_HERE:6379"
3) "ip"
4) "IP_OF_REDIS_1_HERE"
5) "port"
6) "6379"
7) "runid"
8) "9ceb7c4511021b347edccbc52a8e9b42ba5f72a7"
9) "flags"
10) "slave"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "931"
19) "last-ping-reply"
20) "931"
21) "down-after-milliseconds"
22) "5000"
23) "info-refresh"
24) "3760"
25) "role-reported"
26) "slave"
27) "role-reported-time"
28) "1900736"
29) "master-link-down-time"
30) "0"
31) "master-link-status"
32) "ok"
33) "master-host"
34) "redis-0.redis.REDACTED"
35) "master-port"
36) "6379"
37) "slave-priority"
38) "100"
39) "slave-repl-offset"
40) "734477"
#### Now trigger failover on sentinel
127.0.0.1:26379> sentinel failover mymaster
OK
Sentinel logs post failover
1:X 18 Feb 2021 20:56:42.294 # Executing user requested FAILOVER of 'mymaster'
1:X 18 Feb 2021 20:56:42.294 # +new-epoch 1
1:X 18 Feb 2021 20:56:42.294 # +try-failover master mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:56:42.341 # +vote-for-leader 8d757184d740e8ffca7de4400d7b510861562980 1
1:X 18 Feb 2021 20:56:42.341 # +elected-leader master mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:56:42.341 # +failover-state-select-slave master mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:56:42.396 # +selected-slave slave IP_OF_REDIS_1_HERE:6379 IP_OF_REDIS_1_HERE 6379 @ mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:56:42.397 * +failover-state-send-slaveof-noone slave IP_OF_REDIS_1_HERE:6379 IP_OF_REDIS_1_HERE 6379 @ mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:56:42.487 * +failover-state-wait-promotion slave IP_OF_REDIS_1_HERE:6379 IP_OF_REDIS_1_HERE 6379 @ mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:56:43.407 # +promoted-slave slave IP_OF_REDIS_1_HERE:6379 IP_OF_REDIS_1_HERE 6379 @ mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:56:43.407 # +failover-state-reconf-slaves master mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:56:43.480 # +failover-end master mymaster redis-0.redis.REDACTED 6379
1:X 18 Feb 2021 20:56:43.480 # +switch-master mymaster redis-0.redis.REDACTED 6379 IP_OF_REDIS_1_HERE 6379
1:X 18 Feb 2021 20:56:43.482 * +slave slave IP_OF_REDIS_O_HERE:6379 redis-0.redis.REDACTED 6379 @ mymaster IP_OF_REDIS_1_HERE 6379
Redis new master logs post failover
=========================
1:M 18 Feb 2021 20:56:42.487 # Connection with master lost.
1:M 18 Feb 2021 20:56:42.487 * Caching the disconnected master state.
1:M 18 Feb 2021 20:56:42.487 * Discarding previously cached master state.
1:M 18 Feb 2021 20:56:42.488 # Setting secondary replication ID to 2a1711e52d7ac391529ecd4788431300fa7e7db2, valid up to offset: 772898. New replication ID is e1e54c0c0217f9855c06ce9e15a1038ddf532332
1:M 18 Feb 2021 20:56:42.488 * MASTER MODE enabled (user request from 'id=4 addr=IP_OF_SENTINEL_0_HERE:38476 laddr=IP_OF_REDIS_1_HERE:6379 fd=7 name=sentinel-8d757184-cmd age=1995 idle=0 flags=x db=0 sub=0 psub=0 multi=4 qbuf=188 qbuf-free=40766 argv-mem=4 obl=45 oll=0 omem=0 tot-mem=61468 events=r cmd=exec user=default redir=-1')
1:M 18 Feb 2021 20:56:42.490 # CONFIG REWRITE executed with success.
1:M 18 Feb 2021 20:56:53.522 * Replica IP_OF_REDIS_O_HERE:6379 asks for synchronization
1:M 18 Feb 2021 20:56:53.523 * Partial resynchronization not accepted: Requested offset for second ID was 775646, but I can reply up to 772898
1:M 18 Feb 2021 20:56:53.523 * Starting BGSAVE for SYNC with target: disk
1:M 18 Feb 2021 20:56:53.523 * Background saving started by pid 23
23:C 18 Feb 2021 20:56:53.528 * DB saved on disk
23:C 18 Feb 2021 20:56:53.528 * RDB: 0 MB of memory used by copy-on-write
1:M 18 Feb 2021 20:56:53.550 * Background saving terminated with success
1:M 18 Feb 2021 20:56:53.550 * Synchronization with replica IP_OF_REDIS_O_HERE:6379 succeeded
Redis new replica (old master) log post failover
===========================
1:M 18 Feb 2021 20:56:42.489 # Connection with replica IP_OF_REDIS_1_HERE:6379 lost.
1:S 18 Feb 2021 20:56:53.518 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1:S 18 Feb 2021 20:56:53.518 * Connecting to MASTER IP_OF_REDIS_1_HERE:6379
1:S 18 Feb 2021 20:56:53.519 * MASTER <-> REPLICA sync started
1:S 18 Feb 2021 20:56:53.519 * REPLICAOF IP_OF_REDIS_1_HERE:6379 enabled (user request from 'id=11 addr=IP_OF_SENTINEL_1_HERE:45492 laddr=IP_OF_REDIS_O_HERE:6379 fd=8 name=sentinel-6c83d7d3-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=4 qbuf=202 qbuf-free=40752 argv-mem=4 obl=45 oll=0 omem=0 tot-mem=61468 events=r cmd=exec user=default redir=-1')
1:S 18 Feb 2021 20:56:53.521 # CONFIG REWRITE executed with success.
1:S 18 Feb 2021 20:56:53.523 * Non blocking connect for SYNC fired the event.
1:S 18 Feb 2021 20:56:53.523 * Master replied to PING, replication can continue...
1:S 18 Feb 2021 20:56:53.524 * (Non critical) Master does not understand REPLCONF ip-address: -ERR REPLCONF ip-address provided by replica instance is too long: 71 bytes
1:S 18 Feb 2021 20:56:53.524 * Trying a partial resynchronization (request 2a1711e52d7ac391529ecd4788431300fa7e7db2:775646).
1:S 18 Feb 2021 20:56:53.525 * Full resync from master: e1e54c0c0217f9855c06ce9e15a1038ddf532332:776025
1:S 18 Feb 2021 20:56:53.525 * Discarding previously cached master state.
1:S 18 Feb 2021 20:56:53.551 * MASTER <-> REPLICA sync: receiving 203 bytes from master to disk
1:S 18 Feb 2021 20:56:53.551 * MASTER <-> REPLICA sync: Flushing old data
1:S 18 Feb 2021 20:56:53.552 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 18 Feb 2021 20:56:53.554 * Loading RDB produced by version 6.1.242
1:S 18 Feb 2021 20:56:53.554 * RDB age 0 seconds
1:S 18 Feb 2021 20:56:53.554 * RDB memory usage when created 1.95 Mb
1:S 18 Feb 2021 20:56:53.554 * MASTER <-> REPLICA sync: Finished with success
1:S 18 Feb 2021 20:56:53.555 * Background append only file rewriting started by pid 20
1:S 18 Feb 2021 20:56:53.583 * AOF rewrite child asks to stop sending diffs.
20:C 18 Feb 2021 20:56:53.583 * Parent agreed to stop sending diffs. Finalizing AOF...
20:C 18 Feb 2021 20:56:53.583 * Concatenating 0.00 MB of AOF diff received from parent.
20:C 18 Feb 2021 20:56:53.584 * SYNC append only file rewrite performed
20:C 18 Feb 2021 20:56:53.584 * AOF rewrite: 0 MB of memory used by copy-on-write
1:S 18 Feb 2021 20:56:53.683 * Background AOF rewrite terminated with success
1:S 18 Feb 2021 20:56:53.683 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 18 Feb 2021 20:56:53.684 * Background AOF rewrite finished successfully
Now Query sentinel for shard master and slaves
127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "IP_OF_REDIS_1_HERE"
2) "6379"
127.0.0.1:26379> sentinel slaves mymaster
1) 1) "name"
2) "IP_OF_REDIS_O_HERE:6379"
3) "ip"
4) "redis-0.redis.REDACTED"
5) "port"
6) "6379"
7) "runid"
8) "8721b6897e67d8cbd1b5e3a6678d209d959fa91b"
9) "flags"
10) "slave"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "855"
19) "last-ping-reply"
20) "855"
21) "down-after-milliseconds"
22) "5000"
23) "info-refresh"
24) "5624"
25) "role-reported"
26) "slave"
27) "role-reported-time"
28) "166211"
29) "master-link-down-time"
30) "0"
31) "master-link-status"
32) "ok"
33) "master-host"
34) "IP_OF_REDIS_1_HERE"
35) "master-port"
36) "6379"
37) "slave-priority"
38) "100"
39) "slave-repl-offset"
40) "825331"
127.0.0.1:26379>
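To spot this without reading the reply by eye, the returned address can be classified as an IP literal or a hostname. This is a rough sketch only: `addr_kind` is a hypothetical helper, and the check is a loose heuristic (a colon means IPv6, only digits and dots means IPv4), not a full address validator.

```shell
# Hypothetical helper: classify an address string as ipv4, ipv6, or hostname.
# Heuristic only: it does not verify that an IP literal is well formed.
addr_kind() {
  case "$1" in
    '')         echo empty ;;     # nothing to classify
    *:*)        echo ipv6 ;;      # hostnames cannot contain a colon
    *[!0-9.]*)  echo hostname ;;  # any character besides digits and dots
    *)          echo ipv4 ;;      # only digits and dots remain
  esac
}
```

Assuming redis-cli prints one reply element per line when its output is piped, a check after failover might look like `addr_kind "$(redis-cli -p 26379 sentinel get-master-addr-by-name mymaster | head -n 1)"`, where anything other than `hostname` indicates the behaviour reported here.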
Comment From: satheeshaGowda
So, as you can see, Sentinel is:
1. returning replica IPs (always, both before and after failover)
2. returning the master IP after failover

I noticed the following log line; maybe it's a bug, or a limit on the length of the hostname?
1:S 18 Feb 2021 20:23:23.583 * (Non critical) Master does not understand REPLCONF ip-address: -ERR REPLCONF ip-address provided by replica instance is too long: 71 bytes
Comment From: yossigo
@satheeshaGowda This seems to be the problem: the REPLCONF fails and, as a result, the replica remains identified by its IP (the default), effectively ignoring replica-announce-ip.
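The fallback described here can be modelled in a few lines. This is a simplified illustration, not Redis code: `registered_addr` and `MAX_ANNOUNCE_LEN` are hypothetical names, and the limit of 39 is an assumption borrowed from the figure guessed elsewhere in this thread; the only thing the log confirms is that a 71-byte value was rejected.

```shell
# Assumed limit for illustration only; the real value in Redis is not
# confirmed in this thread.
MAX_ANNOUNCE_LEN=39

# Hypothetical model of the behaviour: keep the announced address when it
# fits the limit, otherwise fall back to the connection's peer IP.
registered_addr() {
  announced=$1
  peer_ip=$2
  if [ "${#announced}" -le "$MAX_ANNOUNCE_LEN" ]; then
    echo "$announced"
  else
    echo "$peer_ip"
  fi
}
```

Under this model a short hostname would be kept, while a 71-character FQDN would be silently replaced by the peer IP, which matches the replica showing up by IP in INFO replication and in the Sentinel output.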
Comment From: satheeshaGowda
@yossigo According to RFC 1035, section 2.3.4, my DNS entry is valid; it has only 71 characters in total.
I also verified with a domain name validator that my DNS entry is valid.
It feels like the check is against the IP length (39 characters, which is the maximum length of an IPv6 address) rather than the DNS name length. @yossigo, can you confirm?
2.3.4. Size limits

Various objects and parameters in the DNS have size limits. They are
listed below. Some could be easily changed, others are more
fundamental.

labels          63 octets or less
names           255 octets or less
TTL             positive values of a signed 32 bit number.
UDP messages    512 octets or less
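The label and name limits quoted above can be checked mechanically. The sketch below uses a hypothetical helper, `valid_dns_lengths`, and compares text lengths against the quoted figures. It is slightly loose, since the RFC's 255-octet limit applies to the encoded name rather than its dotted text form, and it does not check which characters are allowed.

```shell
# Hypothetical helper: succeed if a dotted name respects the RFC 1035
# length limits quoted above (labels <= 63, whole name <= 255).
# The body runs in a subshell so the IFS and globbing changes do not leak.
valid_dns_lengths() (
  name=${1%.}                        # ignore a single trailing dot
  [ -n "$name" ] && [ "${#name}" -le 255 ] || exit 1
  IFS=.                              # split the name into labels on dots
  set -f                             # no filename expansion while splitting
  for label in $name; do
    # Reject empty labels (e.g. "a..b") and labels longer than 63
    [ -n "$label" ] && [ "${#label}" -le 63 ] || exit 1
  done
  exit 0
)
```

A 71-character name made of short labels passes this check, which is consistent with the point being made: the name is fine as a DNS name and is only too long for a field sized for an IP address.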
Comment From: satheeshaGowda
Looks like that's the case: https://github.com/redis/redis/blob/2dba1e391d3772a8da182d95bde050ffa9d01e4d/src/replication.c#L929
Comment From: yossigo
@satheeshaGowda Yes, the assumption was that clients would report their IP (IPv4 or IPv6) and not a hostname. This will have to be fixed.