Redis [BUG] The master-slave synchronization relationship of cluster shards cannot be restored due to network failure

Describe the bug

In cluster mode, when a slave node in a shard is pinged by its master node and executes nodeUpdateAddressIfNeeded, the getpeername system call may fail, which causes the server.masterhost variable to be incorrectly set to "?". The slave node then reports an error every second: "Connecting to MASTER ?:6379". If the master node and the slave node experience a network partition at that moment, the master of the shard is marked as PFAIL. Other nodes then send gossip messages to the slave node that correct the recorded IP address of the shard's master node, but server.masterhost is not updated. As a result, the synchronization relationship between the master and slave nodes does not recover even after the getpeername system call works again, and the following Redis server error message is displayed:

To reproduce

Remarks:
- Redis version: 6.2.14
- The Redis parameter 'cluster-announce-ip' is not configured

Create a Redis cluster with 3 primary nodes and 3 replica nodes
Simulate a 'getpeername' system call error on the 'slave0' node. To reproduce the error quickly, the method that obtains the IP address is modified directly in the source code so that it returns '?'
Simulate a network failure between the 'master0' and 'slave0' nodes

# Add iptables rules on the slave0 node
iptables -A INPUT -s {master0-ip} -j DROP
iptables -A OUTPUT -d {master0-ip} -j DROP

Wait for the 'server.cluster-node-timeout' period to elapse, then restore the 'getpeername' system call on the 'slave0' node
Recover from the network failure between the 'master0' and 'slave0' nodes

iptables -D INPUT -s {master0-ip} -j DROP
iptables -D OUTPUT -d {master0-ip} -j DROP

Expected behavior

The 'master' and 'slave' synchronization relationship should be restored once the 'getpeername' system call works again and the network has recovered.
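
The expected behavior can be pictured with a toy model. This is only a sketch under stated assumptions: the types toy_node and toy_replica, the functions toy_set_master and toy_reconcile_master, and the address 10.0.0.1 are all hypothetical. It is not Redis source code and not the patch proposed in this issue. It only shows the idea that a replication target which differs from the address recorded for the master (for example a stale "?") gets refreshed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_IP_LEN 46

/* Toy stand-in for a cluster node entry (NOT the Redis struct). */
typedef struct {
    char ip[TOY_IP_LEN];  /* master address as corrected by gossip */
    int port;
} toy_node;

/* Toy stand-in for the replica's replication settings (NOT the Redis struct). */
typedef struct {
    char masterhost[TOY_IP_LEN];  /* address the replica actually dials */
    int masterport;
} toy_replica;

/* Point the replica at a given master address. */
static void toy_set_master(toy_replica *r, const char *ip, int port) {
    snprintf(r->masterhost, sizeof(r->masterhost), "%s", ip);
    r->masterport = port;
}

/* If the replica's target differs from the recorded master address,
 * refresh it. Returns 1 when a refresh happened, 0 when already in sync. */
static int toy_reconcile_master(toy_replica *r, const toy_node *master) {
    assert(r != NULL && master != NULL);
    if (strcmp(r->masterhost, master->ip) == 0 &&
        r->masterport == master->port) {
        return 0;
    }
    toy_set_master(r, master->ip, master->port);
    return 1;
}
```

In this model, a replica whose masterhost is "?" is refreshed on the first call to toy_reconcile_master, and a second call is a no-op.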

Additional information

There is no direct way to simulate a 'getpeername' system call failure, so it was simulated by modifying the source code.
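
For illustration, the "?" placeholder can also be reproduced outside Redis. The following is a minimal sketch, assuming a hypothetical helper named peer_ip_or_placeholder (it is not a Redis function and not the modification used for this report). It writes the peer IP of a socket into a buffer and falls back to "?" when getpeername fails.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper (NOT a Redis function): store the peer IP of `fd`
 * in `ip`. On any failure, store the placeholder "?" and return -1. */
static int peer_ip_or_placeholder(int fd, char *ip, size_t ip_len) {
    struct sockaddr_storage sa;
    socklen_t salen = sizeof(sa);
    int ok = 0;

    assert(ip != NULL && ip_len >= 2);

    if (getpeername(fd, (struct sockaddr *)&sa, &salen) == 0) {
        if (sa.ss_family == AF_INET) {
            const struct sockaddr_in *s4 = (const struct sockaddr_in *)&sa;
            ok = inet_ntop(AF_INET, &s4->sin_addr, ip, (socklen_t)ip_len) != NULL;
        } else if (sa.ss_family == AF_INET6) {
            const struct sockaddr_in6 *s6 = (const struct sockaddr_in6 *)&sa;
            ok = inet_ntop(AF_INET6, &s6->sin6_addr, ip, (socklen_t)ip_len) != NULL;
        }
    }

    if (!ok) {
        /* getpeername failed or the address family is unsupported. */
        ip[0] = '?';
        ip[1] = '\0';
        return -1;
    }
    return 0;
}
```

Passing an invalid descriptor such as -1 makes getpeername fail, so the buffer ends up holding "?". If a caller stores that value as the master host without checking the return code, the result matches the "Connecting to MASTER ?:6379" symptom.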


This problem has occurred in our production environment.
Adding the code in the red box below should fix the problem
