I'm using a Redis cluster inside a Kubernetes cluster, and I've noticed that Redis returns incorrect IP addresses (they basically do not exist).
redis-cli CLUSTER SLOTS
0 16383
  10.244.3.10 6379 8eaec21af0af82c56a85422666278d2dabf320a8
  10.244.11.11 6379 bd5b63ec39d7b4eef98b820227e96b5d331612d6
  10.244.37.11 6379 bc8bd7a7c9f87ef616f1471cfa1721762faba5ad
redis-cli CLUSTER NODES
bc8bd7a7c9f87ef616f1471cfa1721762faba5ad 10.244.37.11:6379@16379 slave 8eaec21af0af82c56a85422666278d2dabf320a8 0 1504621288802 25 connected
8eaec21af0af82c56a85422666278d2dabf320a8 10.244.3.10:6379@16379 myself,master - 0 1504621289000 25 connected 0-16383
bd5b63ec39d7b4eef98b820227e96b5d331612d6 10.244.11.11:6379@16379 slave 8eaec21af0af82c56a85422666278d2dabf320a8 0 1504621289805 25 connected
Both commands return 10.244.3.10 as the master's IP, but that address does not exist in our environment. The address that should be used instead is 10.244.31.4 (this address is also visible in the ifconfig output on the master machine).
When I ping every host in my cluster with redis-cli, I get the following responses:
bash-4.3# redis-cli -h 10.244.3.10 -p 6379 ping
Could not connect to Redis at 10.244.3.10:6379: Host is unreachable
bash-4.3# redis-cli -h 10.244.11.11 -p 6379 ping
PONG
bash-4.3# redis-cli -h 10.244.37.11 -p 6379 ping
PONG
But 10.244.31.4 does respond:
bash-4.3# redis-cli -h 10.244.31.4 -p 6379 ping
PONG
Why does Redis return an incorrect IP for the master node? Regards, Piotr
Comment From: me115
Are you running redis-server in a Docker container? Check whether 10.244.3.10 is the gateway address.
Comment From: chmielas
No, I'm running Redis in a Kubernetes cluster, and 10.244.3.10 is not the address of any of the endpoints in the cluster.
Comment From: leonth
We experienced similar issues as well. We are running a Redis cluster in Kubernetes. Our hypothesis is that the IP returned was actually the IP address of the previous pod (analogous to an instance/container). In Kubernetes, when a pod dies and restarts, it is given a new internal IP address. A CLUSTER MEET should make the pod join the cluster again with the new IP, but we suspect that the old IP address might get stuck somewhere.
Comment From: gagabu
I'm also running a Redis cluster in Kubernetes and have hit the same problem. As I understand it, a Redis node returns the IP address that the pod had at the time of cluster creation, which is stored in the nodes.conf file.
My workaround is to add an init container to the Redis StatefulSet YAML that replaces the old IP with the new one.
initContainers:
- name: update-pod-ip
  image: busybox
  env:
  - name: MY_POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  command: ['sh', '-c', 'if [ -s /data/nodes.conf ]; then sed -ri "/myself/s/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/$MY_POD_IP/" /data/nodes.conf; fi']
  volumeMounts:
  - name: data
    mountPath: /data
    readOnly: false
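The sed in the init container's command can be exercised outside the pod. A minimal sketch, using an illustrative nodes.conf line and a hard-coded MY_POD_IP (in the real pod this value is injected via the Downward API):

```shell
# Demo of the init container's fix-up, run against a throwaway copy.
# The node ID and IP addresses below are illustrative only.
mkdir -p /tmp/redis-demo
cat > /tmp/redis-demo/nodes.conf <<'EOF'
8eaec21af0af82c56a85422666278d2dabf320a8 10.244.3.10:6379@16379 myself,master - 0 0 25 connected 0-16383
EOF
MY_POD_IP=10.244.31.4   # in the pod this comes from fieldRef: status.podIP
# Replace the stale IP on the "myself" line with the pod's current IP.
if [ -s /tmp/redis-demo/nodes.conf ]; then
  sed -ri "/myself/s/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/$MY_POD_IP/" /tmp/redis-demo/nodes.conf
fi
cat /tmp/redis-demo/nodes.conf
```

Without the `g` flag, sed replaces only the first IP on the matching line, which is exactly the node's own address field.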
Comment From: leonth
We've successfully run a Redis cluster in Kubernetes production since Jan 2018 using Redis 4 and --cluster-announce-ip. The k8s YAML looks roughly like this:
containers:
- name: redis
  image: redis:4.0.6
  command: ["redis-server"]
  args:
  - /etc/redis/redis.conf
  - --cluster-announce-ip
  - "$(MY_POD_IP)"
  env:
  - name: MY_POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
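One way to verify the flag took effect is to check that the IP in the "myself" row of CLUSTER NODES matches the pod's own IP. A sketch that parses a captured CLUSTER NODES line (the sample line and IPs here are illustrative, not from a live cluster):

```shell
# Illustrative "myself" line as CLUSTER NODES would print it inside a pod.
NODES_OUTPUT='8eaec21af0af82c56a85422666278d2dabf320a8 10.244.31.4:6379@16379 myself,master - 0 0 25 connected 0-16383'
MY_POD_IP=10.244.31.4   # in the pod: injected via fieldRef: status.podIP
# Field 2 is "ip:port@cport"; take everything before the first colon.
ANNOUNCED_IP=$(printf '%s\n' "$NODES_OUTPUT" | awk '/myself/ {split($2, a, ":"); print a[1]}')
if [ "$ANNOUNCED_IP" = "$MY_POD_IP" ]; then
  echo "announced IP matches pod IP: $ANNOUNCED_IP"
else
  echo "mismatch: announced=$ANNOUNCED_IP pod=$MY_POD_IP"
fi
```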
Comment From: MQPearth
However, when the entire k8s cluster is restarted, the IP addresses of all Redis pods change, and the Redis cluster cannot be restored.
Comment From: MQPearth
This is my solution: specify cluster-announce-ip, but without using a specific IP.
After the entire k8s cluster restarts, all pods are assigned new IPs, but the Redis cluster remains valid.
containers:
- name: redis
  image: redis:6.0.19
  imagePullPolicy: IfNotPresent
  command:
  - "redis-server"
  args:
  - "/etc/redis/redis.conf"
  - "--cluster-announce-ip"
  - "$(POD_NAME).$(POD_SERVICE_NAME).$(POD_NAMESPACE).svc.cluster.local"
  env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: POD_SERVICE_NAME
    value: "redis"
My Redis image version is redis:6.0.19 and my Kubernetes version is 1.23.1. Note that this configuration method has not been rigorously tested.
Comment From: kjoe
This is my solution: specify cluster-announce-ip, but without using a specific IP.
After the entire k8s cluster restarts, all pods are assigned new IPs, but the Redis cluster remains valid.
Thanks! It's a somewhat tricky configuration, but it definitely works if we take care of its edge cases. I'm only able to use hostnames if I follow these steps and rules:
- Before Redis 7.0, CLUSTER MEET only supports IPs (see https://github.com/redis/redis/pull/10436), so the first time we need to initialize the cluster with IP addresses.
- After that, we add the hostnames with the --cluster-announce-ip parameter; the redeploy restarts the nodes one by one, during which each node replaces its IP address with its hostname in the CLUSTER NODES list, and the cluster finally reaches a stable state.
- If we use longer pod/service/namespace names, we can easily hit the maximum name length: the node address field is limited to 46 chars, so the hostname gets truncated and the cluster gets stuck in a failed state. https://github.com/redis/redis/blob/de0d9632b52849d9b7ea52408b1d681d771c5b46/src/server.h#L119
- If we are unable to shorten the names, we can use them without the domain suffix, or even omit the namespace part too, because the k8s DNS resolver will add the missing parts through DNS search domains.
- If we need to connect to the Redis Cluster from another namespace with shortened hostnames, we must add the cluster's namespace to the client's local search domains with spec.dnsConfig.searches, because a MOVED redirect will return hostnames rather than IP addresses to the client.
- Note that the client will be unable to resolve a pod's hostname while that pod is restarting, so it may sometimes get a DNS resolution error (depending on the implementation).
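The 46-character limit can be checked up front before deploying. A minimal sketch with hypothetical pod/service/namespace names:

```shell
# Hypothetical names; substitute your own StatefulSet values.
POD_NAME="redis-cluster-0"
POD_SERVICE_NAME="redis"
POD_NAMESPACE="prod"
FQDN="${POD_NAME}.${POD_SERVICE_NAME}.${POD_NAMESPACE}.svc.cluster.local"
# Redis truncates node addresses beyond 46 chars (NET_IP_STR_LEN in server.h),
# which leaves the cluster stuck in a failed state.
if [ "${#FQDN}" -le 46 ]; then
  echo "ok: $FQDN (${#FQDN} chars)"
else
  echo "too long: $FQDN (${#FQDN} chars)"
fi
```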
Comment From: kjoe
We experienced similar issues as well. We are running a Redis cluster in Kubernetes. Our hypothesis is that the IP returned was actually the IP address of the previous pod (analogous to an instance/container). In Kubernetes, when a pod dies and restarts, it is given a new internal IP address. A CLUSTER MEET should make the pod join the cluster again with the new IP, but we suspect that the old IP address might get stuck somewhere.
I can confirm this seems to be a bug with Redis Cluster under Kubernetes, and --cluster-announce-ip magically fixes the phenomenon; however, the cluster will also survive single-node restarts without this parameter (and without CLUSTER MEET). My observation is that the wrong (previous) IP address appears only from the node's own point of view (in the "myself" row, regardless of whether it's a master or a slave) in the CLUSTER NODES and CLUSTER SLOTS lists, while the other nodes know its correct (new) IP address:
redis-service-1 (10.12.1.205):
8325f28d789cdc187e3f7c0b7c7d7c285f65764c 10.12.1.125:6379@16379 myself,master - 0 1710753141000 10 connected 0-5460
74e6d24c9adbf356256e0a77f94ef600e9a4f29f 10.12.1.208:6379@16379 slave 7b05d07f483a7d227ba84fc65608c7066364a3da 0 1710753143088 7 connected
294e4eeac0de01d9816177334599330aba448679 10.12.2.170:6379@16379 master - 0 1710753142000 8 connected 5461-10922
fcca7847570ae5b36958105475a84e3240aac3da 10.12.2.173:6379@16379 slave 294e4eeac0de01d9816177334599330aba448679 0 1710753142085 8 connected
0ad7a643c62e678ee314407dc1c0abb80b59fdbf 10.12.7.87:6379@16379 slave 8325f28d789cdc187e3f7c0b7c7d7c285f65764c 0 1710753141000 10 connected
7b05d07f483a7d227ba84fc65608c7066364a3da 10.12.7.95:6379@16379 master - 0 1710753144092 7 connected 10923-16383
redis-service-2 (10.12.2.173):
8325f28d789cdc187e3f7c0b7c7d7c285f65764c 10.12.1.205:6379@16379 master - 0 1710753117095 10 connected 0-5460
74e6d24c9adbf356256e0a77f94ef600e9a4f29f 10.12.1.208:6379@16379 slave 7b05d07f483a7d227ba84fc65608c7066364a3da 0 1710753119100 7 connected
294e4eeac0de01d9816177334599330aba448679 10.12.2.170:6379@16379 master - 0 1710753116092 8 connected 5461-10922
fcca7847570ae5b36958105475a84e3240aac3da 10.12.2.40:6379@16379 myself,slave 294e4eeac0de01d9816177334599330aba448679 0 1710753117000 2 connected
0ad7a643c62e678ee314407dc1c0abb80b59fdbf 10.12.7.87:6379@16379 slave 8325f28d789cdc187e3f7c0b7c7d7c285f65764c 0 1710753118098 10 connected
7b05d07f483a7d227ba84fc65608c7066364a3da 10.12.7.95:6379@16379 master - 0 1710753117000 7 connected 10923-16383
Comment From: zioproto
It seems --cluster-announce-ip is also used in the Bitnami redis-cluster Helm chart to solve this issue:
https://github.com/bitnami/charts/blob/c6d6b1735a0d364655e11cc669a95f862f24627f/bitnami/redis-cluster/templates/redis-statefulset.yaml#L115