I have 5 on-premise servers where I installed Redis server (1 is the master and the remaining 4 are replicas). To achieve high availability I installed 5 Sentinel nodes on different middleware servers, and they monitor the master Redis server. Redis version is 6.2.

I have a max client limit of 10000 connections, but for some reason the Sentinel nodes are opening too many pub/sub connections with the other Sentinel nodes and running out of the client limit. Why are so many pub/sub connections getting created?

Among the 5 Sentinel nodes only one node is accessible; the remaining nodes throw the error "Max client limit reached".

Please check the output of the working Sentinel node:

127.0.0.1:26379> info clients

# Clients
connected_clients:1119
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:32
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0


Output of the Sentinel node that reached the max client limit:

127.0.0.1:26379> info clients
Error: Connection reset by peer
127.0.0.1:26379> info clients
ERR max number of clients reached
127.0.0.1:26379>


Sample output of the "client list" command on the working Sentinel node:

id=264830 addr=redis1:5543 laddr=sentinel1:26379 fd=839 name= age=2327 idle=2327 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=subscribe user=default redir=-1

id=267063 addr=redis2:20618 laddr=sentinel1:26379 fd=935 name= age=2065 idle=2065 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=subscribe user=default redir=-1

id=264831 addr=redis3:46830 laddr=sentinel1:26379 fd=840 name= age=2327 idle=2327 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=subscribe user=default redir=-1

I don't see any process running on ports 5543, 20618 and 46830.

Please help us prevent too many pub/sub connections between Sentinel nodes; because of this, applications are failing to connect to the Redis server.

Comment From: ggurram-equinix

Can anyone please help me with this?

Comment From: moticless

Hi @gopikrishnagurram,
Please share your configuration.

You can also view and compare a few simple docker-compose based Sentinel configurations here.

If that doesn't help, I guess the next step would be to trace the open connections to understand who the polluting client(s) are, using networking utilities (e.g. lsof). Maybe you have multiple clients that attempt to access the Sentinels, or maybe it is just one that creates many leftover connections.
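For example, a minimal sketch of that kind of tracing from the client side (not from this thread; it assumes Jedis 3.x, a Sentinel reachable on port 26379, and that CLIENT LIST works against the Sentinel, as the output earlier in the question suggests). It groups the Sentinel's client connections by remote host, so whichever host dominates the count is the one opening the leftover connections:

```java
import redis.clients.jedis.Jedis;

import java.util.HashMap;
import java.util.Map;

public class SentinelClientCount {
    public static void main(String[] args) {
        // Hypothetical host name; replace with one of your sentinel nodes.
        try (Jedis sentinel = new Jedis("sentinel1", 26379)) {
            Map<String, Integer> perHost = new HashMap<>();
            // CLIENT LIST returns one line per connection; count them per remote host.
            for (String line : sentinel.clientList().split("\n")) {
                for (String field : line.split(" ")) {
                    if (field.startsWith("addr=")) {
                        String host = field.substring("addr=".length()).split(":")[0];
                        perHost.merge(host, 1, Integer::sum);
                    }
                }
            }
            perHost.forEach((host, count) ->
                    System.out.println(host + " -> " + count + " connections"));
        }
    }
}
```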

Comment From: gopikrishnagurram

This is my client-side code. It is common to all services, and the services run in Kubernetes.

```java
JedisSentinelPool pool = new JedisSentinelPool(masterName, hostNames);
        Jedis jedis = null;
        try
        {
            jedis = pool.getResource();
            contextParamMap.clear();
            contextParamMap.putAll(jedis.hgetAll("keyname"));
        }
        catch (JedisException e)
        {
            logger.error("Error while connection to Redis Sentinel.", e);
        }
        catch (Exception e)
        {
            logger.error("Exception while reloading from redis", e);
        }
        finally
        {
            if (jedis != null)
            {
                jedis.close();
                jedis.quit();
                logger.info("Jedis closed ");
                logger.info("jedis isConncted : " + jedis.isConnected());
                jedis = null;
            }
            if (pool != null)
            {
                pool.close();
                pool.destroy();
                pool.clear();
                pool.close();
                logger.info("Pool destroyed");
                pool = null;
            }
        }
```

Sentinel configuration:

bind 0.0.0.0
protected-mode no
daemonize yes
logfile "serverpath"
sentinel monitor mymaster masterip 6379 2
sentinel resolve-hostnames yes
sentinel announce-hostnames yes


sudo lsof -i -P -n | grep redis

It is showing a very large number of established connections, more than 10,000. For reference, here is a sample of the output:

redis-sen 5768 user 002u IPv4 57539585 0t0 TCP sentine1_node1:26379->sentine1_node2:3176 (ESTABLISHED)
redis-sen 5768 user 003u IPv4 57539587 0t0 TCP sentine1_node1:26379->sentine1_node2:10521 (ESTABLISHED)
redis-sen 5768 user 004u IPv4 57539589 0t0 TCP sentine1_node1:26379->sentine1_node3:36538 (ESTABLISHED)
redis-sen 5768 user 005u IPv4 57539591 0t0 TCP sentine1_node1:26379->sentine1_node4:32190 (ESTABLISHED)

Comment From: moticless

  • Are all Sentinels and Redis servers v6.2?
  • I would like to verify that I understand correctly: say on Sentinel node number 5 you have most of the connections coming from the VMs {sentine1_node[1-4]}. If that is so, do you have your application running on those VMs?
  • Assuming the Sentinels are the only clients on those VMs, do you see any suspicious or repetitive messages in their logs, just before they reach their limit?

Comment From: moticless

@gopikrishnagurram, any new findings?

Comment From: gopikrishnagurram

@moticless

I found one bug in my code: connections are being opened multiple times, and the close call happens only afterwards.

Even after closing, the Sentinel connections are not closed; these connections keep monitoring the master. I went through the Jedis source code, and here is the problem, I believe: they create daemon threads for each Sentinel node.

```java
for (HostAndPort sentinel : sentinels) {

  MasterListener masterListener = new MasterListener(masterName, sentinel.getHost(), sentinel.getPort());
  // whether MasterListener threads are alive or not, process can be stopped
  masterListener.setDaemon(true);   // <-- daemon thread
  masterListeners.add(masterListener);
  masterListener.start();
}

@Override public void run() {

  running.set(true);

  while (running.get()) {

    try {
      // double check that it is not being shutdown
      if (!running.get()) {
        break;
      }

      final HostAndPort hostPort = new HostAndPort(host, port);
      j = new Jedis(hostPort, sentinelClientConfig);

      // code for active refresh
      List<String> masterAddr = j.sentinelGetMasterAddrByName(masterName);
      if (masterAddr == null || masterAddr.size() != 2) {
        LOG.warn("Can not get master addr, master name: {}. Sentinel: {}.", masterName,
            hostPort);
      } else {
        initMaster(toHostAndPort(masterAddr));
      }

      j.subscribe(new JedisPubSub() {
        @Override
        public void onMessage(String channel, String message) {
          LOG.debug("Sentinel {} published: {}.", hostPort, message);

          String[] switchMasterMsg = message.split(" ");

          if (switchMasterMsg.length > 3) {

            if (masterName.equals(switchMasterMsg[0])) {
              initMaster(toHostAndPort(Arrays.asList(switchMasterMsg[3], switchMasterMsg[4])));
            } else {
              LOG.debug(
                "Ignoring message on +switch-master for master name {}, our master name is {}",
                switchMasterMsg[0], masterName);
            }

          } else {
            LOG.error("Invalid message received on Sentinel {} on channel +switch-master: {}",
                hostPort, message);
          }
        }
      }, "+switch-master");

    } catch (JedisException e) {

      if (running.get()) {
        LOG.error("Lost connection to Sentinel at {}:{}. Sleeping 5000ms and retrying.", host,
          port, e);
        try {
          Thread.sleep(subscribeRetryWaitTimeMillis);
        } catch (InterruptedException e1) {
          LOG.error("Sleep interrupted: ", e1);
        }
      } else {
        LOG.debug("Unsubscribing from Sentinel at {}:{}", host, port);
      }
    } finally {
      if (j != null) {
        j.close();
      }
    }
  }
}

```

We can notice that these threads are daemon threads and run continuously in a while loop. That's why those connections do not get closed until the process is killed.
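In other words, every `new JedisSentinelPool(...)` starts one MasterListener (and one long-lived SUBSCRIBE connection) per Sentinel, so constructing a fresh pool per request leaks those connections. A minimal sketch of the usual workaround, assuming Jedis 3.x (names like `RedisAccessor` are illustrative, not from the thread): build the pool once, reuse it for every request, and only close it at shutdown.

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisSentinelPool;

import java.util.Map;
import java.util.Set;

public class RedisAccessor {
    // One pool for the whole process: its MasterListener threads (one per
    // sentinel) are created exactly once instead of once per request.
    private static final JedisSentinelPool POOL =
            new JedisSentinelPool("mymaster",
                    Set.of("sentinel1:26379", "sentinel2:26379", "sentinel3:26379",
                           "sentinel4:26379", "sentinel5:26379"));

    public static Map<String, String> loadContext(String keyName) {
        // Borrow a connection per call; closing a pooled Jedis returns it to
        // the pool. Do not close or destroy the pool here.
        try (Jedis jedis = POOL.getResource()) {
            return jedis.hgetAll(keyName);
        }
    }

    public static void shutdown() {
        // Closing the pool is what stops the sentinel listener threads and
        // their SUBSCRIBE connections, so do it only once, at process exit.
        POOL.close();
    }
}
```

With this pattern each application instance holds a single SUBSCRIBE connection to each Sentinel, instead of accumulating one per request.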

Comment From: moticless

Cool.

Ok to close this issue?