Describe the bug

I deployed a Redis Cluster on k8s (x86) with 4 shards, each shard with 2 replicas. After upgrading from 6.0.20 to 7.2.4, one of the replicas became blocked: redis-cli can't connect to the server, and netstat reports that its TCP connections are stuck in the 'SYN_RECV' state.
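For anyone trying to confirm the same symptom without netstat, here is a minimal sketch that counts socket states on the Redis port from the text of /proc/net/tcp. It assumes Linux, IPv4, and that the file is read inside the pod's network namespace; the function name `count_states` and the default port 6379 are illustrative choices, not part of Redis.

```python
from collections import Counter

# Subset of the kernel's TCP state codes as they appear in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV", "0A": "LISTEN"}


def count_states(proc_net_tcp_text, port=6379):
    """Count sockets per TCP state whose local port equals `port`.

    `proc_net_tcp_text` is the full content of /proc/net/tcp (IPv4 only).
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # local_address looks like "0100007F:18EB" (hex address : hex port)
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port != port:
            continue
        # Unknown codes are reported as the raw hex value.
        counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


# Hypothetical usage on the affected node:
#   with open("/proc/net/tcp") as f:
#       print(dict(count_states(f.read())))
```

A growing SYN_RECV count with no matching ESTABLISHED entries would be consistent with the server no longer accepting connections.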

The blocked redis-server reports the following logs:

35:C 12 Mar 2024 17:09:07.196 # WARNING: Changing databases number from 16 to 1 since we are in cluster mode
35:C 12 Mar 2024 17:09:07.196 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
35:C 12 Mar 2024 17:09:07.196 * Redis version=7.2.4, bits=64, commit=00000000, modified=0, pid=35, just started
35:C 12 Mar 2024 17:09:07.196 * Configuration loaded
35:M 12 Mar 2024 17:09:07.196 * monotonic clock: POSIX clock_gettime
35:M 12 Mar 2024 17:09:07.197 * Running mode=cluster, port=6379.
35:M 12 Mar 2024 17:09:07.197 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
35:M 12 Mar 2024 17:09:07.198 * Node configuration loaded, I'm f5972fb31fbc887326b3dde92627c8d9b4fbd5cc
35:M 12 Mar 2024 17:09:07.199 * Server initialized
35:M 12 Mar 2024 17:09:07.199 * Loading RDB produced by version 6.0.20
35:M 12 Mar 2024 17:09:07.199 * RDB age 5 seconds
35:M 12 Mar 2024 17:09:07.199 * RDB memory usage when created 3.77 Mb
35:M 12 Mar 2024 17:09:07.199 * Done loading RDB, keys loaded: 3, keys expired: 0.
35:M 12 Mar 2024 17:09:07.199 * DB loaded from disk: 0.000 seconds
35:M 12 Mar 2024 17:09:07.199 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
35:M 12 Mar 2024 17:09:07.199 * Ready to accept connections tcp
35:S 12 Mar 2024 17:09:07.279 * Discarding previously cached master state.
35:S 12 Mar 2024 17:09:07.279 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
35:S 12 Mar 2024 17:09:07.279 * Connecting to MASTER 192.168.143.168:32519
35:S 12 Mar 2024 17:09:07.279 * MASTER <-> REPLICA sync started
35:S 12 Mar 2024 17:09:07.279 * Cluster state changed: ok
35:S 12 Mar 2024 17:09:07.280 * Non blocking connect for SYNC fired the event.
35:S 12 Mar 2024 17:09:07.281 * Master replied to PING, replication can continue...
35:S 12 Mar 2024 17:09:07.281 * Clear FAIL state for node e945283ccdb226277c161043715491483c4d2460 ():replica is reachable again.
35:S 12 Mar 2024 17:09:07.287 * Clear FAIL state for node 7def42422d5e4321ab108909c0c4984b90c0096b ():replica is reachable again.
35:S 12 Mar 2024 17:09:07.287 * Clear FAIL state for node 937e735f490324923bce58cf9bcadb57f03cd30b ():replica is reachable again.
35:S 12 Mar 2024 17:09:07.294 * Trying a partial resynchronization (request 9f6897fbdc9f4f2437eeae766e5a99265217a3d4:2144).
35:S 12 Mar 2024 17:09:07.295 * Successful partial resynchronization with master.
35:S 12 Mar 2024 17:09:07.295 * MASTER <-> REPLICA sync: Master accepted a Partial Resynchronization.
35:S 12 Mar 2024 17:09:22.371 # Cluster state changed: fail

Besides, this cluster is idle, but the blocked node uses all of its CPU, while the master node uses only 4%. strace reports the following (no valuable info):

strace: Process 22127 attached with 9 threads
[pid 22215] futex(0x7f101e6dbe04, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 22136] futex(0x7f101e6fedf4, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 22135] futex(0x559132d45bbc, FUTEX_WAIT_PRIVATE, 2147483664, NULL <unfinished ...>
[pid 22134] futex(0x559132d45b94, FUTEX_WAIT_PRIVATE, 2147483664, NULL <unfinished ...>
[pid 22133] futex(0x559132d45b6c, FUTEX_WAIT_PRIVATE, 2147483664, NULL <unfinished ...>
[pid 22132] futex(0x7f101eb6af04, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 22131] futex(0x7f101ef6df04, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 22130] futex(0x7f101f370f04, FUTEX_WAIT_PRIVATE, 2, NULL
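Since the strace output only shows threads parked in futex, per-thread CPU accounting may help identify which thread is actually spinning. The sketch below is a rough illustration, assuming Linux and the field layout documented in proc(5) (utime and stime are fields 14 and 15 of each stat line); the helper names `parse_thread_cpu`, `rank_by_cpu`, and `read_task_stats` are made up for this example.

```python
import os


def parse_thread_cpu(stat_line):
    """Return (comm, utime_ticks, stime_ticks) from one /proc/<pid>/task/<tid>/stat line."""
    # comm is wrapped in parentheses and may contain spaces, so split on the last ')'.
    head, _, tail = stat_line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # tail starts at field 3 (state), so utime (14) and stime (15) are at index 11 and 12.
    return comm, int(fields[11]), int(fields[12])


def rank_by_cpu(stat_lines):
    """Given {tid: stat_line}, return [(tid, comm, total_ticks)] sorted busiest first."""
    ranked = []
    for tid, line in stat_lines.items():
        comm, utime, stime = parse_thread_cpu(line)
        ranked.append((tid, comm, utime + stime))
    return sorted(ranked, key=lambda item: item[2], reverse=True)


def read_task_stats(pid):
    """Read the stat line of every thread of `pid` (Linux only)."""
    task_dir = f"/proc/{pid}/task"
    stats = {}
    for tid in os.listdir(task_dir):
        with open(os.path.join(task_dir, tid, "stat")) as f:
            stats[int(tid)] = f.read()
    return stats


# Hypothetical usage with the process id from the strace output:
#   for tid, comm, ticks in rank_by_cpu(read_task_stats(22127)):
#       print(tid, comm, ticks)
```

The tick counts are cumulative since thread start, so sampling twice a few seconds apart and comparing the totals gives a better picture of which thread is busy right now.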

Comment From: rkozlo

There is a bug in 7.2. You have to upgrade to 7.0 first and then to 7.2. I recently did such an upgrade and had no problems.