Redis [BUG] Blocking commands do not unblock in time under high volume of new connection churns

Describe the bug

Blocking commands such as BRPOP can be used in conjunction with a timeout argument. They are supposed to unblock when the timeout is reached, if not earlier.

Such blocking commands do not unblock in time when the following two conditions are met: * Constant high volume of new connections. E.g. clients do not use long-lived connections, but rather use a new connection for every single request. * At any given moment, the number of concurrent established connections are very low compared to the number of new connections. E.g. the connected_clients has a low value while total_connections_received grows like crazy.

To reproduce

Run the following redis-benchmark command to manufacture a high volume of new connections but a low number of concurrent connections against your redis server (primary role):

./redis-benchmark -h nanya-test-brpop.hdo23d.0001.internal.use1.cache.amazonaws.com  -t set -n 1000000 -l -r 100000000 -c 100 -k 0 --threads 2 -P 3

In a separate terminal, run the following brpop command observer it takes abnormal amount of time to unblock:

[ny0312@ip ~]$ time redis-cli brpop queue 2
(nil)

real    0m32.288s
user    0m0.001s
sys 0m0.000s


[ny0312@ip ~]$ time redis-cli brpop queue 2
^[[O(nil)

real    0m37.500s
user    0m0.001s
sys 0m0.000s

[ny0312@ip ~]$ time redis-cli brpop queue 2
(nil)

real    0m23.083s
user    0m0.001s
sys 0m0.000s

Expected behavior

[ny0312@ip ~]$ time redis-cli brpop queue 2
(nil)

real    0m2.032s
user    0m0.001s
sys 0m0.001s

Additional information

Upon investigation, I believe the root cause is an unnecessary list rotation that Redis does to the server.clients list before checking client timeouts.

Specifically, in clientsCron, Redis always rotates the current tail of the list to the head, and then check the head client for timeout. The number of clients that Redis checks in each clientsCron run is proportional to the number of current connections. When there are constant high volume of new connections, Redis end up always checking the newly accepted connections at the tail of the server.clients list, and because the number of current connections is low, the older connections end up being starved and never looked at. As result, blocking commands take extra long time to be unblocked.

Comment From: ny0312

This seems to have been fixed since 6.0.0 by the combination of these 2 commit:

https://github.com/redis/redis/commit/7add0f249078aa11cdeb11ec7535519315f0369e
https://github.com/redis/redis/commit/aa9d92d94a7f18f768e25bdbb94179b32e7b9e13

Resolving.