We're deploying a Redis cluster in Docker containers, using persistent storage to retain nodes.conf for all Redis nodes.
When one of the containers dies and a new container comes back up, we reattach nodes.conf from the volume so that the node can rejoin the running Redis cluster automatically.
However, the new container comes up with a different IP address, and when we start the Redis server inside it, it does not update the IP address in its own nodes.conf file. All other nodes do record the new IP address in their nodes.conf files; only the node that went down fails to update its own entry, which looks like this:
```
724e0c2e75c5b07668738e0c67b76e6ab85ea0ea <OLD_IP_ADDRESS>:6379@16379 myself,slave ec98d81a7d1c7292438b9a647c7ac1ce438dde12 0 1517444059272 5 connected
```
When I checked the status of the Redis cluster using redis-cli, it reported the cluster status as ok.
Redis version: 4.x
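For concreteness, the mismatch can be detected by comparing the IP recorded on the `myself` line of nodes.conf with the container's current address. A minimal sketch (the file path, sample entry, and IPs below are all made-up demo values, not our real deployment):

```shell
#!/bin/sh
# Illustration only: detect a stale IP on the "myself" line of nodes.conf.
# The path, sample entry, and IPs are made-up demo values.
NODES_CONF=/tmp/nodes-demo.conf
cat > "$NODES_CONF" <<'EOF'
724e0c2e75c5b07668738e0c67b76e6ab85ea0ea 172.17.0.2:6379@16379 myself,slave ec98d81a7d1c7292438b9a647c7ac1ce438dde12 0 1517444059272 5 connected
EOF
# In a real container this would come from e.g. `hostname -i` or `ip addr`.
CURRENT_IP=172.17.0.5
# Field 2 of the "myself" line is addr:port@cport; take the part before ":".
RECORDED_IP=$(awk '/myself/ {split($2, a, ":"); print a[1]}' "$NODES_CONF")
if [ "$RECORDED_IP" != "$CURRENT_IP" ]; then
  echo "stale: nodes.conf says $RECORDED_IP but container has $CURRENT_IP"
fi
```

For the demo values this prints a "stale" warning, which is exactly the state the restarted node is left in.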
Comment From: manwegit
I ran into this same problem. Quick googling did not find a pure Redis fix, so my hack is to update nodes*.conf in Redis's permanent storage before starting redis-server.
Here's my script, invoked as:

```
/opt/scripts/redis/fix-redis-nodeip /PATH/TO/REDIS/nodes*.conf
```

```bash
#!/bin/bash
export REDIS_LOCAL_IPADDR=$(/sbin/ip -4 -o addr show dev eth0 | sed -n 's/^.*inet //p' | cut -d/ -f1)
perl -i.oldip -nle 'my $localip=$ENV{"REDIS_LOCAL_IPADDR"}; my $l=$_;
  if (/myself/) { $l =~ s/^\w+\s+\K(?:\d+\.){3}(?:\d+):/$localip:/ } print $l;
' "$@"
```
My permanent storage is /data/redis-storage/, so my Docker entrypoint file runs this before redis-server:

```bash
## USE WITH EXTREME CARE. Works for me with a Kubernetes-served redis-cluster.
find /data/redis-storage/ -type f -name "nodes*.conf" -print0 | \
  xargs -r -0 /opt/scripts/redis/fix-redis-nodeip
```
Comment From: patelpayal
We ended up doing the same. Since we're storing nodes.conf in a volume, we update the IP of myself before Redis server startup, if it does not match the container's current IP.
Comment From: riaan53
Running into the same issue using Redis v5 RC3. Updating the IP of myself before the Kubernetes stateful app starts up solves it nicely. But after a few failure simulations I often get into the following situation: my cluster checks look OK, but the master IP on one of the slaves is outdated.
I get the following errors in the logs (node 10.60.4.17):

```
Connecting to MASTER 10.60.4.15:6379
MASTER <-> SLAVE sync started
Error condition on socket for SYNC: No route to host
```
redis-cli info (node 10.60.4.17):

```
# Replication
role:slave
master_host:10.60.4.15
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
```
And then the cluster check says it's OK:

```
root@test-redis-cluster-0:/data# redis-cli --cluster check $POD_IP:6379
10.60.3.33:6379 (9a68f1ed...) -> 0 keys | 5461 slots | 1 slaves.
10.60.1.36:6379 (e8a909c2...) -> 0 keys | 5462 slots | 1 slaves.
10.60.4.19:6379 (88a75be8...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 0 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 10.60.4.17:6379)
S: 6a7ed2991be5df6ddbee08ee0a4d3712ec662c09 10.60.4.17:6379
   slots: (0 slots) slave
   replicates 88a75be8fcab8779b0cf14bdbc2aa0ba415c55fd
M: 9a68f1edae7ba9a35eebcc72c549900c92494d92 10.60.3.33:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 2e9ed08fb7106b26afd7973c0381ee999bfe9fd2 10.60.1.37:6379
   slots: (0 slots) slave
   replicates e8a909c27ced76c7fd7ebefdc07dc61d666ffda2
M: e8a909c27ced76c7fd7ebefdc07dc61d666ffda2 10.60.1.36:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 13d54f7adf308530f5f249515fa8fe44a6f89e93 10.60.3.36:6379
   slots: (0 slots) slave
   replicates 9a68f1edae7ba9a35eebcc72c549900c92494d92
M: 88a75be8fcab8779b0cf14bdbc2aa0ba415c55fd 10.60.4.19:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
```
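One way to make the disagreement visible is to cross-check the stale master_host from INFO replication against the address the cluster itself advertises for this replica's master. A sketch against a captured dump of the nodes table (the dump file path is an assumption; the node lines mirror the output above):

```shell
#!/bin/sh
# Sketch: compare the stale master_host from INFO replication against the
# address the cluster currently advertises for this replica's master.
# The dump file path is made up; node lines mirror the cluster check above.
DUMP=/tmp/cluster-nodes-demo.txt
cat > "$DUMP" <<'EOF'
6a7ed2991be5df6ddbee08ee0a4d3712ec662c09 10.60.4.17:6379@16379 myself,slave 88a75be8fcab8779b0cf14bdbc2aa0ba415c55fd 0 0 5 connected
88a75be8fcab8779b0cf14bdbc2aa0ba415c55fd 10.60.4.19:6379@16379 master - 0 0 1 connected 0-5460
EOF
STALE_MASTER=10.60.4.15   # master_host as reported by INFO replication
# Field 4 of the "myself" slave line is the master's node id.
MASTER_ID=$(awk '/myself/ {print $4}' "$DUMP")
# Look up that node id and take its addr:port (drop the @cport suffix).
MASTER_ADDR=$(awk -v id="$MASTER_ID" '$1 == id {split($2, a, "@"); print a[1]}' "$DUMP")
echo "replication points at $STALE_MASTER, cluster advertises $MASTER_ADDR"
```

For the demo data this shows replication still pointing at 10.60.4.15 while the cluster advertises 10.60.4.19:6379 for the same master id, which matches the situation described above.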
Did you run into the same issues? If so, any tips?
Thanks.
Comment From: mu1345
I have the same problem using 6.2.4. Is there a solution?
Comment From: fosin
I have the same problem using 5.0.14. Is there an official solution? Thanks.
Comment From: ashok-shukla
I am also facing the same issue with 6.2.6, and am also looking for an official solution.
Comment From: bhartirajpal
I am also facing the same issue. Is there any official solution for this?
Comment From: mojixcoder
I have the same problem: the other nodes have the updated IP of the new node, but the restarted node itself doesn't have its own correct address in its nodes.conf.