I have been trying to set up Redis in sentinel mode using a docker-compose file. Below are the contents of my compose file:

version: '3.3'
services:
  redis-master:
    image: redis:latest
    deploy:
      replicas: 1
    networks:
      - Overlay_Network

  redis-slave:
    image: redis:latest
    command: redis-server --slaveof redis-master 6379
    depends_on:
      - redis-master
    deploy:
      replicas: 2
    networks:
      - Overlay_Network

  sentinel:
    image: sentinel:latest
    environment:
      - SENTINEL_DOWN_AFTER=5000
      - SENTINEL_FAILOVER=5000
      - REDIS_MASTER=redis-master
    depends_on:
      - redis-master
      - redis-slave
    deploy:
      replicas: 3
    networks:
      - Overlay_Network

networks:
  Overlay_Network:
    external:
      name: Overlay_Network

Here I am creating three services: redis-master, redis-slave, and sentinel (a local Docker image that starts Redis in sentinel mode based on the passed environment variables). I followed this repository for creating the sentinel image: https://gitlab.ethz.ch/amiv/redis-cluster/tree/master

When I use docker-compose to run the services, it works fine:

docker-compose -f docker-compose.yml up -d

It starts all services with a single instance of each. I then manually scale redis-slave to 2 instances and sentinel to 3 instances. When I stop the redis-master container, sentinel notices it and promotes one of the slave nodes to master. It works as expected.
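
For reference, the manual scaling can also be done in one step, since docker-compose supports the --scale flag:

docker-compose -f docker-compose.yml up -d --scale redis-slave=2 --scale sentinel=3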

The issue happens when I run it in swarm mode using the docker stack deploy command with the same compose file:

docker stack deploy -c docker-compose.yml <stack-name>

It starts all the services (1 instance of redis-master, 2 of redis-slave, and 3 of sentinel) on the overlay network. But when I stop the redis-master container, sentinel cannot promote any of the slave nodes to master. It seems sentinel cannot properly register the slave nodes: it adds a slave and then shows it in down status. Here is a snippet from the sentinel log file:

1:X 04 Jul 2019 14:31:36.465 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:X 04 Jul 2019 14:31:36.465 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=1, just started
1:X 04 Jul 2019 14:31:36.465 # Configuration loaded
1:X 04 Jul 2019 14:31:36.466 * Running mode=sentinel, port=26379.
1:X 04 Jul 2019 14:31:36.466 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:X 04 Jul 2019 14:31:36.468 # Sentinel ID is e84a635f6cf4c0ee4454922a557a7c0fba00fadd
1:X 04 Jul 2019 14:31:36.468 # +monitor master mymaster 10.0.22.123 6379 quorum 2
1:X 04 Jul 2019 14:31:36.469 * +slave slave 10.0.22.125:6379 10.0.22.125 6379 @ mymaster 10.0.22.123 6379
1:X 04 Jul 2019 14:31:38.423 * +sentinel sentinel f92b9499bff409558a2eb985ef949dfc7050c528 10.0.22.130 26379 @ mymaster 10.0.22.123 6379
1:X 04 Jul 2019 14:31:38.498 * +sentinel sentinel 6e32d6bfea4142a0bc77a74efdfd24424cbe026b 10.0.22.131 26379 @ mymaster 10.0.22.123 6379
1:X 04 Jul 2019 14:31:41.538 # +sdown slave 10.0.22.125:6379 10.0.22.125 6379 @ mymaster 10.0.22.123 6379

I thought it could be due to the start order of the containers, but the depends_on field is ignored in swarm mode and I could not find any other built-in way to define the start order for a stack.
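
The usual fallback is to wait in the entrypoint itself; a minimal, untested sketch, assuming the service name redis-master from the compose file above (this would only fix ordering, not the failover problem described below):

#!/bin/bash
# Sketch: block until the master answers PING, then start the replica.
until redis-cli -h redis-master -p 6379 ping 2>/dev/null | grep -q PONG; do
  echo "waiting for redis-master..."
  sleep 1
done
exec redis-server --slaveof redis-master 6379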

When I run docker network inspect on the overlay network, here is the output:

"Containers": {
    "57b7620ef75956464ce274e66e60c9cb5a9d8b79486c5b80016db4482126916b": {
        "Name": "sws_sentinel.3.y8sdpj8609ilq22xinzykbxkm",
        "EndpointID": "a95ab07b07c68a32227be3b5da4d378b82f24aab4279bfaa13899a2a7184ce09",
        "MacAddress": "02:42:0a:00:16:84",
        "IPv4Address": "10.0.22.132/24",
        "IPv6Address": ""
    },
    "982222f1b87e1483ec791f382678ef02abcdffe74a5df13a0c0476f7f3a599a7": {
        "Name": "sws_redis-slave.1.uxwkndhkdnizyicwulzli964r",
        "EndpointID": "f5f8fa056622b1529351355c3760c3f45357c7b3de3fe4d2ee90e2d490328f2a",
        "MacAddress": "02:42:0a:00:16:80",
        "IPv4Address": "10.0.22.128/24",
        "IPv6Address": ""
    },
    "c55376217215a1c11b62ac9d22d28eaa1bcda89484a0202b208e557feea4dd35": {
        "Name": "sws_redis-slave.2.s8ha5xmvx6sue2pj6fav8bcbx",
        "EndpointID": "6dcb13e23a8b4c0b49d7dc41e5813b317b8d67377ac30a476261108b8cdeb3f8",
        "MacAddress": "02:42:0a:00:16:7f",
        "IPv4Address": "10.0.22.127/24",
        "IPv6Address": ""
    },
    "cd6d72547ef3fb34ece45ad0201555124505379182f7445373025e1b9a115554": {
        "Name": "sws_redis-master.1.3rhfihzqip2a44xq2uerhqkjt",
        "EndpointID": "9074f9c911e03de0f27e4fb6b75afdf6bb38a111a511738451feb5e64c8dbff3",
        "MacAddress": "02:42:0a:00:16:7c",
        "IPv4Address": "10.0.22.124/24",
        "IPv6Address": ""
    },
    "lb-SA_Monitor_Overlay": {
        "Name": "SA_Monitor_Overlay-endpoint",
        "EndpointID": "2fb84ac75f5eee015b80b55713da83d1afb7dfa7ed4c1f5eda170f4b8daf8884",
        "MacAddress": "02:42:0a:00:16:7d",
        "IPv4Address": "10.0.22.125/24",
        "IPv6Address": ""
    }
}

Here I see the slaves are running on IPs 10.0.22.128 and 10.0.22.127, but in the sentinel log file it tries to add a slave using IP 10.0.22.125, which according to the network inspect output belongs to the load-balancer endpoint (lb-SA_Monitor_Overlay). Why is that? Could this be the issue?

Let me know if any more detail is required.

Comment From: kuldeepsidhu88

Solution

I concluded that it was happening due to the Docker Swarm default load balancer. Sentinel gets its information about the slaves from the master node, but the slaves are not registered on the Docker network with their actual IP addresses; they show up with what appears to be a load-balanced IP. Sentinel was not able to reach the slaves at that IP, so it marked them as down.

This is also mentioned in the Redis documentation:

https://redis.io/topics/replication [Configuring replication in Docker and NAT]

https://redis.io/topics/sentinel [Sentinel, Docker, NAT, and possible issues]

As a solution to this, I made a custom Dockerfile to start the redis-slave nodes. It uses a redis.conf and an entrypoint.sh script. In entrypoint.sh I get the container's real IP, write it into redis.conf, and as the last step start redis-server with that updated redis.conf:

slave-announce-ip <CONTAINER_IP_ADDRESS>
slave-announce-port 6379

You can follow similar steps for the sentinel nodes.

Now the slaves are registered using their real container IP address and port, and sentinel is able to communicate with them.
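
A related mitigation sometimes suggested for this class of problem (not what was used here, and untested) is to give the service a DNS round-robin endpoint, so no virtual IP sits between master and replicas. A sketch against the compose file from the question, using the deploy.endpoint_mode setting available in compose file format 3.3+:

redis-slave:
  image: redis:latest
  command: redis-server --slaveof redis-master 6379
  deploy:
    # dnsrr resolves the service name straight to the task (container) IPs
    # instead of a virtual IP managed by the swarm load balancer
    endpoint_mode: dnsrr
    replicas: 2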

Comment From: spacepirate0001

@kuldeepsidhu88 is it possible to share your files for reproducibility? Thanks

Comment From: PhilPhonic

@kuldeepsidhu88 could you please share your redis.conf and entrypoint.sh?

Comment From: kuldeepsidhu88

@Haythamamin @PhilPhonic Please find files below for reference.

Dockerfile

FROM redis:5

COPY replica/redis.conf /etc/redis/redis.conf
RUN chown redis:redis /etc/redis/redis.conf

COPY replica/redis-entrypoint.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/redis-entrypoint.sh

EXPOSE 6379

ENTRYPOINT ["redis-entrypoint.sh"]

redis.conf

replicaof {{REDIS_MASTER}} 6379

replica-announce-ip {{REPLICA_CONTAINER_IP}}
replica-announce-port 6379

entrypoint.sh

#!/bin/bash

# get container id from /proc/self/cgroup
CONTAINER_ID_LONG=$(grep 'docker' /proc/self/cgroup | sed 's/^.*\///' | tail -n1)

# search for the id in /etc/hosts (it uses only the first 12 characters)
CONTAINER_ID_SHORT=${CONTAINER_ID_LONG:0:12}
DOCKER_CONTAINER_IP_LINE=$(grep "$CONTAINER_ID_SHORT" /etc/hosts)

# get the ip address
THIS_DOCKER_CONTAINER_IP=$(echo "$DOCKER_CONTAINER_IP_LINE" | grep -o '[0-9]\+[.][0-9]\+[.][0-9]\+[.][0-9]\+')

# set as environment variable
export DOCKER_CONTAINER_IP=$THIS_DOCKER_CONTAINER_IP

# replace placeholders in redis.conf file with environment variables
sed -i 's,{{REDIS_MASTER}},'"${REDIS_MASTER}"',g' /etc/redis/redis.conf
sed -i 's,{{REPLICA_CONTAINER_IP}},'"${DOCKER_CONTAINER_IP}"',g' /etc/redis/redis.conf

# start redis
exec docker-entrypoint.sh redis-server /etc/redis/redis.conf
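
For what it's worth, if the image includes the hostname utility, a simpler sketch for the IP-detection part is to resolve the container's own hostname instead of parsing cgroups (this assumes a single interface on the overlay network):

# alternative sketch: take the first IP the container's hostname resolves to
DOCKER_CONTAINER_IP=$(hostname -i | awk '{print $1}')
sed -i 's,{{REPLICA_CONTAINER_IP}},'"${DOCKER_CONTAINER_IP}"',g' /etc/redis/redis.conf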

The sentinel nodes can be configured in a similar way. Hope it helps. Let me know if you have any further questions.
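
For the sentinel side, the analogous directives are sentinel announce-ip and sentinel announce-port. A sketch of the sentinel.conf fragment, where the {{SENTINEL_CONTAINER_IP}} placeholder is hypothetical and would be filled in by the same kind of entrypoint script:

sentinel announce-ip {{SENTINEL_CONTAINER_IP}}
sentinel announce-port 26379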

Comment From: PhilPhonic

Thanks @kuldeepsidhu88, I have built an entrypoint in a similar way. It works in general, but sadly not in Docker Swarm.

Comment From: kuldeepsidhu88

@PhilPhonic Yes, things get tricky in Docker Swarm. I hope the Redis team releases some official documentation on how to make things work in Docker Swarm environments.

Comment From: collabnix

version: '3'

services:
  redis-master:
    image: 'bitnami/redis:latest'
    ports:
      - '6379:6379'
    environment:
      - REDIS_REPLICATION_MODE=master
      - REDIS_PASSWORD=laSQL2019
      - REDIS_EXTRA_FLAGS=--maxmemory 100mb
    volumes:
      - 'redis-master-volume:/bitnami'
    deploy:
      mode: replicated
      replicas: 2

  redis-slave:
    image: 'bitnami/redis:latest'
    ports:
      - '6379'
    depends_on:
      - redis-master
    volumes:
      - 'redis-slave-volume:/bitnami'
    environment:
      - REDIS_REPLICATION_MODE=slave
      - REDIS_MASTER_HOST=redis-master
      - REDIS_MASTER_PORT_NUMBER=6379
      - REDIS_MASTER_PASSWORD=laSQL2019
      - REDIS_PASSWORD=laSQL2019
      - REDIS_EXTRA_FLAGS=--maxmemory 100mb
    deploy:
      mode: replicated
      replicas: 2

  redis-sentinel:
    image: 'bitnami/redis:latest'
    ports:
      - '16379:16379'
    depends_on:
      - redis-master
    volumes:
      - 'redis-sentinel-volume:/bitnami'
    entrypoint: |
      bash -c 'bash -s <<EOF
      "/bin/bash" -c "cat <<EOF > /opt/bitnami/redis/etc/sentinel.conf
      port 16379
      dir /tmp
      sentinel monitor master-node redis-master 6379 2
      sentinel down-after-milliseconds master-node 5000
      sentinel parallel-syncs master-node 1
      sentinel failover-timeout master-node 5000
      sentinel auth-pass master-node laSQL2019
      EOF"
      "/bin/bash" -c "redis-sentinel /opt/bitnami/redis/etc/sentinel.conf"    
      EOF'
    deploy:
      mode: replicated
      replicas: 3

volumes:
  redis-master-volume:
    driver: local
  redis-slave-volume:
    driver: local
  redis-sentinel-volume:
    driver: local

Comment From: jaschaio

@collabnix this is amazing; not sure why it is so deeply hidden in a GitHub issue, as it is the only configuration of Redis Sentinel on Docker Swarm that actually seems to work.

Anyway, quick question, as I am struggling to adapt your entrypoint script to use a Docker secret for the password instead of just writing it in plain text.

Assuming that I have a Docker secret for the password mounted at /run/secrets/password, I guess I need to export it into an environment variable via export PASSWORD="$(</run/secrets/password)" and then use it within your entrypoint script.

Here is my attempt that doesn't work:

    entrypoint: |
        bash -c 'bash -s <<EOF
        "/bin/bash" -c "export PASSWORD=$$(</run/secrets/password) && \
        echo $PASSWORD && \
        cat <<EOF > /opt/bitnami/redis/etc/sentinel.conf
        port 16379
        dir /tmp
        sentinel monitor master-node master 6379 2
        sentinel down-after-milliseconds master-node 5000
        sentinel parallel-syncs master-node 1
        sentinel failover-timeout master-node 5000
        sentinel auth-pass master-node $PASSWORD
        EOF"
        "/bin/bash" -c "redis-sentinel /opt/bitnami/redis/etc/sentinel.conf"
        EOF'

Maybe you got an idea as you seem to be more experienced with writing bash scripts.
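
Two untested guesses at why it fails: first, docker-compose interpolates variables inside the YAML itself, so the bare $PASSWORD occurrences are replaced at deploy time (most likely with an empty string); they would need the same $$ escaping that $$(</run/secrets/password) already uses. Second, because the heredoc delimiters are unquoted, the shell inside the container expands $(...) substitutions anyway, so the secret could be read inline right where it is needed, e.g.:

sentinel auth-pass master-node $$(cat /run/secrets/password)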

Comment From: hedleyroos

I had to add this line to the sentinel conf file to get it to work:

sentinel resolve-hostnames yes
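
For context, sentinel resolve-hostnames and the companion sentinel announce-hostnames were added in Redis 6.2, so this requires 6.2 or newer. A sketch of a hostname-based sentinel.conf under that assumption:

# requires Redis >= 6.2
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
sentinel monitor mymaster redis-master 6379 2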

Comment From: adshin21

@hedleyroos did you check whether sentinel can elect a new master from the slaves if you scale the master down to 0? For me, it just keeps saying "can't resolve hostname redis-master".

Comment From: macrokernel

@adshin21, have you tried enabling Redis data persistence? I am not using the solution provided by @collabnix, but in my deployment with data persistence enabled, master election works fine after scaling down and restoring all Redis server instances.

Comment From: eazylaykzy

@macrokernel, do you mind sharing your setup? I'm currently facing the same problem as @adshin21.

Comment From: macrokernel

@eazylaykzy, Sure, please check my repo: https://github.com/macrokernel/redis-ha-cluster.

Comment From: Luk7c

Hi,

I used @adshin21's docker-compose (but I have only 1 slave), and I added appendonly true to my sentinel.conf.

But I'm facing a problem:

- I pause my master, and my slave becomes the new master -> OK
- I unpause my old master, and it synchronises with redis-slave (which is now the current master) -> OK
- I pause redis-slave, and redis-master cannot be promoted to new master -> KO

I'm using Swarm to deploy my docker-compose

Here are the logs from my master:

Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master

Here is my sentinel.conf:

# Generated by CONFIG REWRITE
sentinel monitor mymaster <ip redis-master> 6379 2
sentinel known-replica mymaster <ip redis-slave> 6379
sentinel known-replica mymaster redis-master 6379

I don't understand why I have sentinel known-replica mymaster redis-master 6379

Comment From: jganeshpai1994

Hello Everyone,

I have found a solution for this issue.

In a Docker Swarm environment, the IPs change once a container goes down and is recreated. This causes issues in Sentinel, as it keeps the older IPs as well as the new ones.

With bitnami/redis I found another issue: when Sentinel is initialized, the sentinel container does not yet have the IPs of the replicas or of the other sentinels, which causes Sentinel to go into TILT mode when an election happens.

To avoid all of the above issues, I created a shell script with the following checks:

#!/bin/bash

echo "Sleeping 20 seconds before running checks"
sleep 20

while true; do
 replica_count=$(cat /opt/bitnami/redis/etc/sentinel.conf | grep -o 'known-replica' | wc -l)
 sentinel_count=$(cat /opt/bitnami/redis/etc/sentinel.conf | grep -o 'known-sentinel' | wc -l)
 echo "Replica Count : $replica_count"
 echo "Sentinel Count : $sentinel_count"

 echo "=====Check Replica Count===="
 if [ "$replica_count" -gt 3 ]; then
   echo "=========== Before sentinel.conf (start)========"
   cat /opt/bitnami/redis/etc/sentinel.conf
   echo "=========== Before sentinel.conf (end) ========"
   redis-cli -p 16379 SENTINEL RESET master-node
   redis-cli -p 16379 SENTINEL FAILOVER master-node
   redis-cli -p 16379 SENTINEL RESET master-node
   echo "Reset done Sentinel"
   echo "=========== After sentinel.conf (start)========"
   cat /opt/bitnami/redis/etc/sentinel.conf
   echo "=========== After sentinel.conf (end) ========"
 fi

 # Check if sentinel has no replica
 if [ "$replica_count" -eq 0 ]; then
  echo "Zero Replica Count"
  redis-cli -p 16379 SHUTDOWN
 fi

 echo "=====Check Sentinel Count===="
 if [ "$sentinel_count" -lt 2 ]; then
   echo "=========== Before sentinel.conf (start)========"
   cat /opt/bitnami/redis/etc/sentinel.conf
   echo "=========== Before sentinel.conf (end) ========"
   redis-cli -p 16379 SENTINEL FAILOVER master-node
   echo "Failing over Sentinel"
   echo "=========== After sentinel.conf (start)========"
   cat /opt/bitnami/redis/etc/sentinel.conf
   echo "=========== After sentinel.conf (end) ========"
 elif [ "$sentinel_count" -gt 2 ];then
   echo "Reseting..."
   redis-cli -p 16379 SENTINEL RESET master-node
 fi


 sleep 10
done
This shell script runs in the sentinel container and checks the replica count and the sentinel count. In my case I have 1 Redis master and 2 slaves, which is why the replica count should not be greater than 3.

If it is greater than 3, that means we still have older IPs that no longer exist, so the script resets and fails over. The RESET here tells Sentinel to fetch the latest IPs of the master and replicas, and the failover avoids any of the older IPs being selected as master.

The second check is for a replica count of zero, which means Sentinel was not initialized correctly and needs to be restarted; that is why the SHUTDOWN is there.

The third check is for the sentinel count: similarly, I have 3 sentinels, so each one should know about the other 2. If the count is greater than 2, the script resets; if it is less than 2, it triggers a failover.

Below is my stack yml:

version: '3.7'

services:
  redis-commander:
    image: ghcr.io/joeferner/redis-commander:latest
    ports:
      - "8081:8081"
    environment:
      - SENTINEL_HOST=redis-sentinel:16379
      - SENTINEL_NAME=master-node
    networks:
      - overlay_net
    deploy:
      mode: replicated
      replicas: 1

  redis-master:
    image: bitnami/redis:6.2.13
    environment:
      - REDIS_REPLICATION_MODE=master
      - ALLOW_EMPTY_PASSWORD=yes
      - REDIS_EXTRA_FLAGS=--maxmemory 100mb
      - REDIS_SENTINEL_MASTER_NAME=master-node
      - REDIS_SENTINEL_HOST=redis-sentinel
      - REDIS_SENTINEL_PORT_NUMBER=16379
    volumes:
      - ./metadata_cache:/bitnami/redis/data
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.labels.node_name == node-1
    command: /opt/bitnami/scripts/redis/run.sh --min-replicas-to-write 1 --min-replicas-max-lag 10
    networks:
      - overlay_net

  redis-slave1:
    image: bitnami/redis:6.2.13
    depends_on:
      - redis-master
    volumes:
      - ./metadata_cache:/bitnami/redis/data
    environment:
      - REDIS_REPLICATION_MODE=slave
      - REDIS_MASTER_HOST=redis-master
      - ALLOW_EMPTY_PASSWORD=yes
      - REDIS_MASTER_PORT_NUMBER=6379
      - REDIS_EXTRA_FLAGS=--maxmemory 100mb
      - REDIS_SENTINEL_MASTER_NAME=master-node
      - REDIS_SENTINEL_HOST=redis-sentinel
      - REDIS_SENTINEL_PORT_NUMBER=16379
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.labels.node_name == node-2
    command: /opt/bitnami/scripts/redis/run.sh --min-replicas-to-write 1 --min-replicas-max-lag 10
    networks:
      - overlay_net

  redis-slave2:
    image: bitnami/redis:6.2.13
    depends_on:
      - redis-master
    volumes:
      - ./metadata_cache:/bitnami/redis/data
    environment:
      - REDIS_REPLICATION_MODE=slave
      - REDIS_MASTER_HOST=redis-master
      - ALLOW_EMPTY_PASSWORD=yes
      - REDIS_MASTER_PORT_NUMBER=6379
      - REDIS_EXTRA_FLAGS=--maxmemory 100mb
      - REDIS_SENTINEL_MASTER_NAME=master-node
      - REDIS_SENTINEL_HOST=redis-sentinel
      - REDIS_SENTINEL_PORT_NUMBER=16379
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.labels.node_name == node-3
    command: /opt/bitnami/scripts/redis/run.sh --min-replicas-to-write 1 --min-replicas-max-lag 10
    networks:
      - overlay_net

  redis-sentinel:
    image: bitnami/redis:6.2.13
    depends_on:
      - redis-master
    configs:
      - source: sentinel_check.sh
        target: /opt/bitnami/redis/sentinel_check.sh
        mode: 0755
    entrypoint: |
      bash -c 'bash -s <<EOF
      "/bin/bash" -c "cat <<EOF > /opt/bitnami/redis/etc/sentinel.conf
      port 16379
      dir /tmp
      sentinel monitor master-node redis-master 6379 2
      sentinel down-after-milliseconds master-node 5000
      sentinel parallel-syncs master-node 1
      sentinel failover-timeout master-node 10000
      sentinel resolve-hostnames yes
      sentinel announce-hostnames no
      EOF"
      #echo "sentinel announce-ip $DOCKER_CONTAINER_IP" >> /opt/bitnami/redis/etc/sentinel.conf
      "/bin/bash" -c "nohup /opt/bitnami/redis/sentinel_check.sh & redis-sentinel /opt/bitnami/redis/etc/sentinel.conf"
      EOF'
    deploy:
      mode: replicated
      replicas: 3
    networks:
      - overlay_net

configs:
  sentinel_check.sh:
    file: ../redis/sentinel_check.sh

networks:
  overlay_net:
    external: true

As you can see, the shell script is added as a config and runs as a background process in the sentinel container. The overlay_net is an overlay network created externally, and redis-commander is included so you can check the data.

The Redis master and slaves are deployed on the swarm cluster with placement labels, as we are using 3 VM instances; you can remove those labels or adjust them as required.

So when the master goes down, you will see the election happening. One more thing: in sentinel you will get the error "Unable to resolve redis-master", but don't worry, it is only a warning; in Docker it will keep trying to resolve the redis-master hostname.

On the client side we do retries if a failure occurs on Sentinel.
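
To illustrate the client side, here is a minimal sketch of discovering the current master through Sentinel with redis-cli, retrying on failure; the service name redis-sentinel and the master name master-node come from the stack above:

#!/bin/bash
# ask a sentinel for the current master address, retrying a few times
for i in 1 2 3 4 5; do
  MASTER_ADDR=$(redis-cli -h redis-sentinel -p 16379 \
    SENTINEL get-master-addr-by-name master-node 2>/dev/null | paste -sd:)
  [ -n "$MASTER_ADDR" ] && break
  sleep 1
done
echo "current master: $MASTER_ADDR"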