Hello Experts,

We are running tests on Redis 7.0.6 with 1000 clients and a pipeline depth of 500, and we are observing periodic connection resets in redis.log.

Please see the logs:

```
11:M 23 Jan 2023 06:20:24.145 - Reading from client: Connection reset by peer
11:M 23 Jan 2023 06:20:24.145 - Client closed connection id=10268 addr=?:0 laddr=172.17.0.3:6381 fd=139 name= age=16 idle=1 flags=N db=0 sub=0 psub=0 ssub=0 multi=-1 qbuf=0 qbuf-free=20474 argv-mem=0 multi-mem=0 rbs=2048 rbp=1024 obl=0 oll=73 omem=1496792 tot-mem=1520088 events=rw cmd=get user=default redir=-1 resp=2
11:M 23 Jan 2023 06:20:24.145 - Reading from client: Connection reset by peer
```
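For reference, here is a minimal monitoring sketch (assuming redis-py, the host/port from the log line above, and a made-up alert threshold) that watches the same output-buffer fields reported in the log:

```python
# Sketch: poll CLIENT LIST and flag clients whose output buffers are growing,
# i.e. the omem / tot-mem fields that appear in the log above.
# The redis-py package, the host/port and the threshold are assumptions.
import time

import redis

r = redis.Redis(host="172.17.0.3", port=6381, decode_responses=True)

OMEM_ALERT_BYTES = 1_000_000  # hypothetical threshold for flagging a client

for _ in range(60):  # sample for about a minute
    for client in r.client_list():
        omem = int(client.get("omem", 0))
        if omem > OMEM_ALERT_BYTES:
            # cmd/flags help correlate with the "flags=N ... cmd=get" log entry
            print(f"id={client['id']} cmd={client['cmd']} flags={client['flags']} "
                  f"omem={omem} tot-mem={client.get('tot-mem', '?')}")
    time.sleep(1)
```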

We have a few questions about this:

  • When would Redis reset a connection?
  • What can be expected when a connection is reset?
  • Why would Redis reset an existing connection?
  • Can we minimize these resets?
  • What will be the impact on the consumer application if connections are reset by Redis? (see the sketch after this list)
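On the last point, a minimal sketch of how such a reset would surface to a consumer (redis-py, the connection details, the key name and the retry policy are all assumptions here):

```python
# Sketch: how a server-side reset shows up in a redis-py consumer and one
# simple way to handle it. Host/port, key and retry policy are illustrative.
import time

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_with_retry(key, attempts=3):
    """A reset surfaces as ConnectionError (or TimeoutError); redis-py opens a
    fresh connection on the next attempt, so a short retry with backoff is
    usually enough on the consumer side."""
    for attempt in range(attempts):
        try:
            return r.get(key)
        except (redis.exceptions.ConnectionError, redis.exceptions.TimeoutError):
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(0.1 * (attempt + 1))  # small backoff before retrying

if __name__ == "__main__":
    print(get_with_retry("user:42"))  # "user:42" is an illustrative key
```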

Attaching the complete Redis logs for reference: redis-server_logs_1000C_500P_47FBFilledRedisData.log

Looking forward to your expert opinion.

Comment From: hwware

From your error log (omem=1496792 tot-mem=1520088), I guess your system includes some master nodes and some replica (slave) nodes, and that the configuration file sets the parameters maxmemory-policy (some kind of eviction policy) and maxmemory. Please confirm this, or post some of your Redis instance configuration here, covering the master, replica and sentinel nodes.
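A quick way to confirm those two settings (a sketch using redis-py; host and port are placeholders):

```python
# Sketch: confirm the eviction-related settings.
# Host/port are placeholders; run this against each master and replica.
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

print(r.config_get("maxmemory"))         # 0 means no memory limit is set
print(r.config_get("maxmemory-policy"))  # e.g. noeviction, volatile-lru, allkeys-lru
```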
Thanks

Comment From: geekthread

Yes, we are using a Sentinel-based architecture. maxmemory-policy: volatile-lru, maxmemory: 31 GB.

Is there any other information that I can share?

Comment From: hwware

Thanks for your information.

I am investigating a similar issue, and I just created a PR to solve this problem: https://github.com/redis/redis/pull/11749 (please check it).

In my experience, the connection reset happens when the replication buffer between the master and a replica fills up quickly, so the Redis master has to disconnect it. I think your system must be write-heavy. My suggestion is to try a cluster architecture; it may help.
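A sketch of how to inspect the settings that govern this behaviour (redis-py; host and port are placeholders):

```python
# Sketch: inspect the settings that govern master -> replica disconnects.
# Host/port are placeholders; run this against the master.
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Output-buffer limits per client class; when a replica connection exceeds the
# hard limit (or stays above the soft limit for soft-seconds), the master
# closes that replica connection.
print(r.config_get("client-output-buffer-limit"))

# Replication backlog used for partial resynchronisation after a disconnect.
print(r.config_get("repl-backlog-size"))

# Per-replica offsets and lag.
print(r.info("replication"))
```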

Let's see if there are other suggestions from the community to solve your problem. Sorry about that.

Comment From: geekthread

Thanks @hwware. Memory on our master node didn't breach the maxmemory limit, and we had around 6-7 GB of free space. The LRU evictor would only run once the maxmemory limit is reached.
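For completeness, this is roughly how that can be verified (a sketch using redis-py; connection details are placeholders): used memory stays well below maxmemory and the eviction counter stays at zero.

```python
# Sketch: verify that the master is below maxmemory and not evicting keys.
# Host/port are placeholders; point this at the master node.
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

mem = r.info("memory")
stats = r.info("stats")

print("used_memory: ", mem["used_memory_human"])
print("maxmemory:   ", mem["maxmemory_human"])
print("evicted_keys:", stats["evicted_keys"])  # stays 0 if the LRU evictor never ran
```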

Comment From: geekthread

Referring to the client-output-buffer-limit section in https://github.com/redis/redis/blob/7.0/redis.conf, we see that there is no limit for normal clients (as can be seen in the logs, the client is of the normal type). Can this be due to:

  1. pipeline: 750

  2. clients: 1000
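To illustrate the load pattern in question, a rough sketch of one client issuing GETs in pipelines of this depth (redis-py; host/port, key names and the exact depth are illustrative):

```python
# Sketch: one worker issuing GETs in pipelines, as in the test scenario.
# Host/port, key names and the exact pipeline depth are illustrative only.
import redis

PIPELINE_DEPTH = 500  # the tests above use depths in the 500-750 range

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def run_batch(keys):
    # Replies that cannot be flushed to the socket immediately accumulate in the
    # server-side output buffer for this client, which is what grows omem/tot-mem.
    pipe = r.pipeline(transaction=False)
    for key in keys:
        pipe.get(key)
    return pipe.execute()

if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(PIPELINE_DEPTH)]
    replies = run_batch(keys)
    print(f"received {len(replies)} replies")
```

With the default `client-output-buffer-limit normal 0 0 0`, the server does not disconnect a normal client because of buffer growth, so pipelined replies simply accumulate in omem until the client reads them.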

Any other recommendations please?

Thanks, Ankit

Comment From: geekthread

Hello Team,

Any updates on this?

We have observed that when we lower the pipeline numbers, the connection drops are quite evident in repeatable test cases.

Comment From: sundb

> Referring to the client-output-buffer-limit section in 7.0/redis.conf, we see that there is no limit for normal clients (as can be seen in the logs, the client is of the normal type). Can this be due to:
>
>   1. pipeline: 750
>
>   2. clients: 1000
>
> Any other recommendations please?
>
> Thanks, Ankit

From the log file you posted, it seems that the clients disconnected on their own rather than being disconnected by the Redis server.
If the connections had been closed by the server, a corresponding disconnect-reason entry would appear in the logs. What is your Redis client timeout?
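For reference, a sketch of where those timeouts live on both sides (redis-py and the host/port are assumptions):

```python
# Sketch: the idle timeout on the server side and the socket options on the
# client side. Host/port are placeholders; redis-py is an assumption.
import redis

# Server side: 'timeout' closes connections idle for N seconds (0 = never),
# and 'tcp-keepalive' controls TCP keepalive probes.
admin = redis.Redis(host="localhost", port=6379, decode_responses=True)
print(admin.config_get("timeout"))
print(admin.config_get("tcp-keepalive"))

# Client side: an aggressive socket timeout can make the client close the
# connection itself, which the server then reports as a reset/closed client.
client = redis.Redis(
    host="localhost",
    port=6379,
    socket_timeout=5,          # seconds to wait for a reply
    socket_connect_timeout=2,  # seconds to wait while connecting
    socket_keepalive=True,
)
print(client.ping())
```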