Hi all, I tried to test failover on my Redis Sentinel setup. Initially the master was redis-0; after rebooting node1 the master changed to redis-1, but the sentinel on that node does not come back up and fails with the error: Duplicate master name.
Please provide your suggestions.
Pods:
NAME     READY  STATUS   RESTARTS  AGE   IP          NODE                 NOMINATED NODE  READINESS GATES
redis-0  2/2    Running  2         4m3s  10.244.6.5  192.x.x.167 (node1)
Sentinel 1 log:
1:X 19 Dec 2023 07:44:37.643 # -sdown slave redis-2.redis.redis-partner32-ns.svc.cluster.local:6379 redis-2.redis.redis-partner32-ns.svc.cluster.local 6379 @ mymaster redis-0.redis.redis-partner32-ns.svc.cluster.local 6379
1:X 19 Dec 2023 07:46:58.127 # +sdown master mymaster redis-0.redis.redis-partner32-ns.svc.cluster.local 6379
1:X 19 Dec 2023 07:46:58.313 * Sentinel new configuration saved on disk
1:X 19 Dec 2023 07:46:58.313 # +new-epoch 3
1:X 19 Dec 2023 07:46:58.318 * Sentinel new configuration saved on disk
1:X 19 Dec 2023 07:46:58.318 # +vote-for-leader 9e4aadf4982ea00c4fad84cf929fef1b596cd873 3
1:X 19 Dec 2023 07:46:58.617 # +sdown sentinel 7ca36ce7f713f6ec0ff80b61c44ed0f459d40e66 sentinel-0.sentinel 26379 @ mymaster redis-0.redis.redis-partner32-ns.svc.cluster.local 6379
1:X 19 Dec 2023 07:46:59.238 # +odown master mymaster redis-0.redis.redis-partner32-ns.svc.cluster.local 6379 #quorum 2/2
1:X 19 Dec 2023 07:46:59.238 # Next failover delay: I will not start a failover before Tue Dec 19 07:47:18 2023
1:X 19 Dec 2023 07:46:59.381 # +config-update-from sentinel 9e4aadf4982ea00c4fad84cf929fef1b596cd873 sentinel-2.sentinel 26379 @ mymaster redis-0.redis.redis-partner32-ns.svc.cluster.local 6379
1:X 19 Dec 2023 07:46:59.381 # +switch-master mymaster redis-0.redis.redis-partner32-ns.svc.cluster.local 6379 redis-1.redis.redis-partner32-ns.svc.cluster.local 6379
1:X 19 Dec 2023 07:46:59.384 * +slave slave redis-2.redis.redis-partner32-ns.svc.cluster.local:6379 redis-2.redis.redis-partner32-ns.svc.cluster.local 6379 @ mymaster redis-1.redis.redis-partner32-ns.svc.cluster.local 6379
1:X 19 Dec 2023 07:46:59.385 * +slave slave redis-0.redis.redis-partner32-ns.svc.cluster.local:6379 redis-0.redis.redis-partner32-ns.svc.cluster.local 6379 @ mymaster redis-1.redis.redis-partner32-ns.svc.cluster.local 6379
1:X 19 Dec 2023 07:46:59.390 * Sentinel new configuration saved on disk
1:X 19 Dec 2023 07:47:00.428 # +sdown slave redis-0.redis.redis-partner32-ns.svc.cluster.local:6379 redis-0.redis.redis-partner32-ns.svc.cluster.local 6379 @ mymaster redis-1.redis.redis-partner32-ns.svc.cluster.local 6379
1:X 19 Dec 2023 07:47:43.512 # Failed to resolve hostname 'sentinel-0.sentinel'
1:X 19 Dec 2023 07:47:44.826 # Failed to resolve hostname 'redis-0.redis.redis-partner32-ns.svc.cluster.local'
Sentinel 0 log:
*** FATAL CONFIG FILE ERROR (Redis 7.0.12) ***
Reading the configuration file, at line 36
>>> 'sentinel monitor mymaster redis-1.redis.redis-partner32-ns.svc.cluster.local 6379 2'
Duplicate master name.
TIA
Comment From: LUKIEYF
Does anyone have any ideas? T T
Comment From: jasonk
I wound up here while trying to track down a similar issue, and what ultimately ended up being the cause of my problem was using "include" to read one config file from another.
For local development purposes I'm running two redis instances and three sentinels locally on different ports. I have a base config file for redis-stack and one for sentinel that holds all the real configuration, then when starting them up I was doing something basically like this:
    if [ ! -f /tmp/sentinel0.conf ]; then
      {
        echo "include $(realpath sentinel.conf)"
        echo "port 26379"
      } > /tmp/sentinel0.conf
    fi
    exec redis-sentinel /tmp/sentinel0.conf
This worked fine the first time, but the problem was that the base sentinel.conf included the line "sentinel monitor cluster 127.0.0.1 6379 2". When sentinel does its rewriting of the config file, it copies that line into sentinel0.conf, which still doesn't cause a problem until the sentinel restarts. When it restarts and re-reads the config files, it fails with "Duplicate master name" because the same master entry is now present in both files.
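One way to spot this condition before restarting is to count "sentinel monitor" lines per master name across the generated file and the file it includes. This is only a rough sketch: find_duplicate_masters is a hypothetical helper name, not part of Redis, it assumes the directive is written in lowercase, and it does not follow include lines itself, so both files have to be passed explicitly.

```shell
#!/bin/sh
# Hypothetical helper: print every master name that is declared by more
# than one "sentinel monitor" line across the config files given as arguments.
find_duplicate_masters() {
  awk '
    # A monitor line looks like: sentinel monitor <name> <host> <port> <quorum>
    $1 == "sentinel" && $2 == "monitor" { count[$3]++ }
    END {
      for (name in count)
        if (count[name] > 1) print name
    }
  ' "$@"
}
```

For example, find_duplicate_masters /tmp/sentinel0.conf sentinel.conf would print "cluster" in the situation described above and print nothing when the entry exists in only one of the two files.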
In my case I fixed it by removing that line from the base file and adding it to the lines that get written to the temporary config when it doesn't exist. It seems that sentinel will do the right thing when rewriting the file if that line is already in the file it's rewriting, but not if it's from an included file.
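A minimal sketch of that fix, assuming the shared sentinel.conf no longer contains any "sentinel monitor" line. The function name write_sentinel_conf and its parameters are made up for illustration; the paths, port, and monitor line are the ones from my local setup.

```shell
#!/bin/sh
# Hypothetical helper: create the per-instance sentinel config only if it
# does not exist yet, so that later rewrites by sentinel are preserved.
write_sentinel_conf() {
  conf="$1"   # per-instance config file, e.g. /tmp/sentinel0.conf
  base="$2"   # shared base config (assumed to have no "sentinel monitor" line)
  port="$3"   # port this sentinel listens on
  if [ ! -f "$conf" ]; then
    {
      echo "include $(realpath "$base")"
      echo "port $port"
      # The monitor line lives in the generated file only, so it is
      # declared once even after sentinel rewrites this file.
      echo "sentinel monitor cluster 127.0.0.1 6379 2"
    } > "$conf"
  fi
}
```

It would be used as write_sentinel_conf /tmp/sentinel0.conf sentinel.conf 26379 followed by exec redis-sentinel /tmp/sentinel0.conf.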