Expected behavior
We frequently get JedisClusterException: CLUSTERDOWN The cluster is down in our production environment. Because of it, some inserts and deletes were missed, which leads to data mismatches.
We have monitored the network between the cluster nodes. It appears to be fine and shows no problems.
Actual behavior
Since there is no network issue, the cluster should not be going down.
Steps to reproduce:
The issue occurs at random. Sometimes it happens with a simple use case: doing 4000000 inserts in a for loop with two-level keys.
for (int i = 0; i < 8000000; i++) {
    redisCache.put("TEST" + i, "NOT" + i, "ABC");
}
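The loop above drops a write whenever put throws. A minimal sketch of the same loop that records the failed keys so they can be replayed is shown here. It assumes the put(key, field, value) shape seen above; `TwoLevelCache` and `FlakyInMemoryCache` are hypothetical stand-ins that let the sketch run without a cluster, and are not part of Jedis or of our wrapper.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InsertLoop {
    /** Runs the insert loop and returns the keys whose write threw, instead of losing them. */
    static List<String> insertAll(TwoLevelCache cache, int count) {
        List<String> failedKeys = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String key = "TEST" + i;
            try {
                cache.put(key, "NOT" + i, "ABC");
            } catch (RuntimeException e) {
                // Keep the key so the write can be replayed once the cluster recovers.
                failedKeys.add(key);
            }
        }
        return failedKeys;
    }

    public static void main(String[] args) {
        FlakyInMemoryCache cache = new FlakyInMemoryCache(1000);
        List<String> failed = insertAll(cache, 10_000);
        System.out.println("stored=" + cache.size() + " failed=" + failed.size());
    }
}

/** Hypothetical two-level cache API, modelled on redisCache.put(key, field, value). */
interface TwoLevelCache {
    void put(String key, String field, String value);
}

/** In-memory stand-in (an assumption for this sketch): every Nth write fails on purpose. */
class FlakyInMemoryCache implements TwoLevelCache {
    private final Map<String, Map<String, String>> data = new HashMap<>();
    private final int failEvery;
    private int calls = 0;

    FlakyInMemoryCache(int failEvery) {
        this.failEvery = failEvery;
    }

    @Override
    public void put(String key, String field, String value) {
        calls++;
        if (failEvery > 0 && calls % failEvery == 0) {
            throw new RuntimeException("CLUSTERDOWN The cluster is down");
        }
        data.computeIfAbsent(key, k -> new HashMap<>()).put(field, value);
    }

    /** Number of top-level keys that were stored successfully. */
    int size() {
        return data.size();
    }
}
```

With a real client, the catch clause would presumably narrow to the Jedis exception type shown in the client-side log, and the returned keys would feed a replay step.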
Redis / Jedis Configuration
redis_version:3.0.5, Jedis version: 2.8.2
The configuration and logs are attached in a zip file.
Jedis version:
2.8.2
Redis version:
redis_version:3.0.5
Java version:
1.8
Logs :
Client Side Exception Logs:
Exception occured : redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The cluster is down
    at redis.clients.jedis.Protocol.processError(Protocol.java:119)
    at redis.clients.jedis.Protocol.process(Protocol.java:157)
    at redis.clients.jedis.Protocol.read(Protocol.java:211)
    at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:297)
    at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:196)
    at redis.clients.jedis.Jedis.set(Jedis.java:69)
    at redis.clients.jedis.JedisCluster$1.execute(JedisCluster.java:89)
    at redis.clients.jedis.JedisCluster$1.execute(JedisCluster.java:86)
    at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:120)
    at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:31)
    at redis.clients.jedis.JedisCluster.set(JedisCluster.java:91)
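In the trace above the exception propagates to the caller of JedisCluster.set, so the write is lost unless the caller retries. A minimal sketch of a caller-side retry with exponential backoff follows. It is an assumption-laden illustration: `ClusterDownRetry`, `callWithRetry` and `isClusterDown` are hypothetical names, and the check matches on the CLUSTERDOWN message prefix so that the sketch runs without Jedis on the classpath.

```java
import java.util.function.Supplier;

class ClusterDownRetry {
    /** True when the error text carries the CLUSTERDOWN prefix seen in the trace. */
    static boolean isClusterDown(Throwable t) {
        String msg = t.getMessage();
        return msg != null && msg.startsWith("CLUSTERDOWN");
    }

    /**
     * Runs op, retrying only on CLUSTERDOWN errors, up to maxAttempts in total.
     * The wait doubles after each failed attempt; other errors are rethrown at once.
     */
    static <T> T callWithRetry(Supplier<T> op, int maxAttempts, long initialBackoffMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                if (!isClusterDown(e) || attempt >= maxAttempts) {
                    throw e;
                }
                try {
                    Thread.sleep(backoff);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
                backoff *= 2;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated operation (an assumption): fails twice with CLUSTERDOWN, then succeeds.
        String reply = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("CLUSTERDOWN The cluster is down");
            }
            return "OK";
        }, 5, 10);
        System.out.println(reply + " after " + calls[0] + " attempts");
    }
}
```

A retry like this only hides short outages. It does not explain why the cluster state flips to down in the first place.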
Server side Exception Logs:
12001:M 16 Jan 15:20:46.535 . ping packet received: (nil) 12001:M 16 Jan 15:20:46.535 . GOSSIP ebee68d0219e10761b9fbe57aefe2b323d733f4a 192.168.35.81:7004 slave 12001:M 16 Jan 15:20:46.535 . GOSSIP 1b4816d1d6e2a1c345fae404f54a5cc0833ad7f4 192.168.35.81:7005 slave 12001:M 16 Jan 15:20:46.535 . GOSSIP a6e01777460317c63abd0174806db8bceaa6029d 192.168.35.80:7001 master 12001:M 16 Jan 15:20:46.535 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.535 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 . --- Processing packet of type 0, 2520 bytes 12001:M 16 Jan 15:20:46.536 . Ping packet received: (nil) 12001:M 16 Jan 15:20:46.536 . ping packet received: (nil) 12001:M 16 Jan 15:20:46.536 . GOSSIP 0e97af43ac63e0b637ba6bdfc563bcd1a5a22c73 192.168.35.80:7002 master 12001:M 16 Jan 15:20:46.536 . GOSSIP a6e01777460317c63abd0174806db8bceaa6029d 192.168.35.80:7001 master 12001:M 16 Jan 15:20:46.536 . 
GOSSIP ebee68d0219e10761b9fbe57aefe2b323d733f4a 192.168.35.81:7004 slave 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 . --- Processing packet of type 0, 2520 bytes 12001:M 16 Jan 15:20:46.537 . Ping packet received: (nil) 12001:M 16 Jan 15:20:46.537 . ping packet received: (nil) 12001:M 16 Jan 15:20:46.537 . 
GOSSIP d8e5eb39c28808aea117c1bb2bdf6b071a0aad1a 192.168.35.80:7000 master 12001:M 16 Jan 15:20:46.537 . GOSSIP 686cc59c46368f5076abb7ffd7d5d98100eedadf 192.168.35.81:7003 slave 12001:M 16 Jan 15:20:46.537 . GOSSIP 1b4816d1d6e2a1c345fae404f54a5cc0833ad7f4 192.168.35.81:7005 slave 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.537 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.538 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.538 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.538 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.538 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.538 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.538 . --- Processing packet of type 0, 2416 bytes 12001:M 16 Jan 15:20:46.538 . Ping packet received: (nil) 12001:M 16 Jan 15:20:46.538 . ping packet received: (nil) 12001:M 16 Jan 15:20:46.538 . GOSSIP d8e5eb39c28808aea117c1bb2bdf6b071a0aad1a 192.168.35.80:7000 master 12001:M 16 Jan 15:20:46.538 . GOSSIP 686cc59c46368f5076abb7ffd7d5d98100eedadf 192.168.35.81:7003 slave 12001:M 16 Jan 15:20:46.538 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.538 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.538 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.538 . --- Processing packet of type 0, 2520 bytes 12001:M 16 Jan 15:20:46.538 . Ping packet received: (nil) 12001:M 16 Jan 15:20:46.538 . ping packet received: (nil) 12001:M 16 Jan 15:20:46.538 . GOSSIP ebee68d0219e10761b9fbe57aefe2b323d733f4a 192.168.35.81:7004 slave 12001:M 16 Jan 15:20:46.538 . 
GOSSIP a6e01777460317c63abd0174806db8bceaa6029d 192.168.35.80:7001 master 12001:M 16 Jan 15:20:46.538 . GOSSIP 1b4816d1d6e2a1c345fae404f54a5cc0833ad7f4 192.168.35.81:7005 slave 12001:M 16 Jan 15:20:46.538 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.538 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.539 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: 
Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.540 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: 
Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.541 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: 
Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.542 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.543 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.543 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.543 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.543 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.543 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.543 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.543 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.543 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.543 - Error writing to client: Connection reset by peer 12001:M 16 Jan 15:20:46.543 - Error writing to client: Connection reset by peer
12001:M 16 Jan 15:20:46.555 . GOSSIP 686cc59c46368f5076abb7ffd7d5d98100eedadf 192.168.35.81:7003 slave
12001:M 16 Jan 15:20:46.555 . GOSSIP a6e01777460317c63abd0174806db8bceaa6029d 192.168.35.80:7001 master
12001:M 16 Jan 15:20:46.555 . I/O error reading from node link: connection closed
12001:M 16 Jan 15:20:46.555 . --- Processing packet of type 0, 2520 bytes
12001:M 16 Jan 15:20:46.555 . Ping packet received: (nil)
12001:M 16 Jan 15:20:46.555 . ping packet received: (nil)
12001:M 16 Jan 15:20:46.555 . GOSSIP 686cc59c46368f5076abb7ffd7d5d98100eedadf 192.168.35.81:7003 slave
12001:M 16 Jan 15:20:46.555 . GOSSIP ebee68d0219e10761b9fbe57aefe2b323d733f4a 192.168.35.81:7004 slave
12001:M 16 Jan 15:20:46.555 . GOSSIP d8e5eb39c28808aea117c1bb2bdf6b071a0aad1a 192.168.35.80:7000 master
12001:M 16 Jan 15:20:46.555 . I/O error reading from node link: connection closed
12001:M 16 Jan 15:20:46.555 . --- Processing packet of type 0, 2520 bytes
12001:M 16 Jan 15:20:46.555 . Ping packet received: (nil)
12001:M 16 Jan 15:20:46.555 . ping packet received: (nil)
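The server log above is dominated by "Connection reset by peer" entries, and counting them per timestamp makes bursts easier to line up with the client-side exceptions. A minimal sketch of such a tally follows. It assumes one log entry per line in the pid:role, date, time, level marker, message layout seen above; `ResetTally` and `resetsPerTimestamp` are hypothetical names for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetTally {
    // Groups: 1 = pid, 2 = role, 3 = date and time, 4 = level marker, 5 = message.
    private static final Pattern ENTRY = Pattern.compile(
        "^(\\d+):([A-Z]) (\\d{1,2} \\w{3} \\d{2}:\\d{2}:\\d{2}\\.\\d{3}) (\\S) (.*)$");

    /** Counts "Connection reset by peer" entries per timestamp, in first-seen order. */
    static Map<String, Integer> resetsPerTimestamp(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line.trim());
            if (m.matches() && m.group(5).contains("Connection reset by peer")) {
                counts.merge(m.group(3), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample entries copied from the server log in this report.
        List<String> sample = Arrays.asList(
            "12001:M 16 Jan 15:20:46.535 . ping packet received: (nil)",
            "12001:M 16 Jan 15:20:46.535 - Error writing to client: Connection reset by peer",
            "12001:M 16 Jan 15:20:46.535 - Error writing to client: Connection reset by peer",
            "12001:M 16 Jan 15:20:46.536 - Error writing to client: Connection reset by peer",
            "12001:M 16 Jan 15:20:46.536 . --- Processing packet of type 0, 2520 bytes");
        System.out.println(resetsPerTimestamp(sample));
    }
}
```

Lines that do not match the entry layout are skipped, so the tally can be fed a raw log file read line by line.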
Is there anything related to configuration, TCP settings, or port configuration that needs to be done? If this is a configuration issue, please let me know so that I can change the settings accordingly.
Regards
Comment From: svsteja
Attaching the config as well as the logs: notifRedisIssue.tar.gz
Comment From: svsteja
Please also see the corresponding bug logged against Jedis: https://github.com/xetorthio/jedis/issues/1461
Comment From: deepakshivanandappa
Was this issue resolved? We are using AWS ElastiCache Redis with cluster mode enabled, and we are seeing this CLUSTERDOWN error very often.
Jedis library version: 3.1.0; Redis server: 5.0.4