Redis Version: 6.2.4 (Deployed using Bitnami Helm, uses default configuration)
The problem
Redis sometimes restarts on it's own, causing a OperationError: Error 111 connecting to redis-master:6379. Connection refused.
This generally happens during Master <-> Replica Sync. The connection to master is suddenly lost and when trying to re-establish the connection. I've attached logs for both replica that crashed and master.
From what I've seen the connection from replicaset to master is suddenly lost, and then they try to synchronize, the SIGTERM is issued during this synchronization.
I have been using redis from client application only and not so knowledgeable on how it works, so having a bit of trouble understanding these logs. Thanks for your help.
Replica Log that crashed
1:C 11 Jan 2023 00:43:33.226 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 11 Jan 2023 00:43:33.227 # Redis version=6.2.4, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 11 Jan 2023 00:43:33.227 # Configuration loaded
1:S 11 Jan 2023 00:43:33.227 * monotonic clock: POSIX clock_gettime
1:S 11 Jan 2023 00:43:33.228 * Running mode=standalone, port=6379.
1:S 11 Jan 2023 00:43:33.228 # Server initialized
1:S 11 Jan 2023 00:43:33.237 * Reading RDB preamble from AOF file...
1:S 11 Jan 2023 00:43:33.237 * Loading RDB produced by version 6.2.4
1:S 11 Jan 2023 00:43:33.237 * RDB age 191 seconds
1:S 11 Jan 2023 00:43:33.237 * RDB memory usage when created 3.81 Mb
1:S 11 Jan 2023 00:43:33.237 * RDB has an AOF tail
1:S 11 Jan 2023 00:43:33.505 * Reading the remaining AOF tail...
1:S 11 Jan 2023 00:43:33.505 * DB loaded from append only file: 0.277 seconds
1:S 11 Jan 2023 00:43:33.505 * Ready to accept connections
1:S 11 Jan 2023 00:43:34.210 * Connecting to MASTER redis-master-0.redis-headless.api.svc.cluster.local:6379
1:S 11 Jan 2023 00:43:34.212 * MASTER <-> REPLICA sync started
1:S 11 Jan 2023 00:43:34.213 * Non blocking connect for SYNC fired the event.
1:S 11 Jan 2023 00:43:34.213 * Master replied to PING, replication can continue...
1:S 11 Jan 2023 00:43:34.213 * Partial resynchronization not possible (no cached master)
1:S 11 Jan 2023 00:43:34.214 * Full resync from master: dcbb589050208c00f10992319aef9f46ca1fb9ed:4978898612
1:S 11 Jan 2023 00:43:34.326 * MASTER <-> REPLICA sync: receiving 1203529 bytes from master to disk
1:S 11 Jan 2023 00:43:34.331 * MASTER <-> REPLICA sync: Flushing old data
1:S 11 Jan 2023 00:43:34.410 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 11 Jan 2023 00:43:35.276 * Loading RDB produced by version 6.2.4
1:S 11 Jan 2023 00:43:35.276 * RDB age 1 seconds
1:S 11 Jan 2023 00:43:35.276 * RDB memory usage when created 5.01 Mb
1:S 11 Jan 2023 00:43:35.282 * MASTER <-> REPLICA sync: Finished with success
1:S 11 Jan 2023 00:43:35.282 * Background append only file rewriting started by pid 15
1:S 11 Jan 2023 00:43:35.756 * AOF rewrite child asks to stop sending diffs.
15:C 11 Jan 2023 00:43:35.756 * Parent agreed to stop sending diffs. Finalizing AOF...
15:C 11 Jan 2023 00:43:35.756 * Concatenating 0.00 MB of AOF diff received from parent.
15:C 11 Jan 2023 00:43:35.756 * SYNC append only file rewrite performed
15:C 11 Jan 2023 00:43:35.756 * AOF rewrite: 0 MB of memory used by copy-on-write
1:S 11 Jan 2023 00:43:35.785 * Background AOF rewrite terminated with success
1:S 11 Jan 2023 00:43:35.785 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 11 Jan 2023 00:43:35.785 * Background AOF rewrite finished successfully
1:S 11 Jan 2023 00:49:08.349 # Connection with master lost.
1:S 11 Jan 2023 00:49:08.349 * Caching the disconnected master state.
1:S 11 Jan 2023 00:49:08.349 * Reconnecting to MASTER redis-master-0.redis-headless.api.svc.cluster.local:6379
1:S 11 Jan 2023 00:49:08.352 * MASTER <-> REPLICA sync started
1:S 11 Jan 2023 00:49:08.352 # Error condition on socket for SYNC: Connection refused
1:S 11 Jan 2023 00:49:08.562 * Connecting to MASTER redis-master-0.redis-headless.api.svc.cluster.local:6379
1:S 11 Jan 2023 00:49:08.562 * MASTER <-> REPLICA sync started
1:S 11 Jan 2023 00:49:08.563 # Error condition on socket for SYNC: Connection refused
1:S 11 Jan 2023 00:49:09.565 * Connecting to MASTER redis-master-0.redis-headless.api.svc.cluster.local:6379
1:S 11 Jan 2023 00:49:09.566 * MASTER <-> REPLICA sync started
1:signal-handler (1673398171) Received SIGTERM scheduling shutdown...
1:S 11 Jan 2023 00:49:31.829 # User requested shutdown...
1:S 11 Jan 2023 00:49:31.829 * Calling fsync() on the AOF file.
1:S 11 Jan 2023 00:49:31.829 # Redis is now ready to exit, bye bye...
Next Replica that respwaned
1:C 11 Jan 2023 00:49:33.799 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 11 Jan 2023 00:49:33.799 # Redis version=6.2.4, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 11 Jan 2023 00:49:33.799 # Configuration loaded
1:S 11 Jan 2023 00:49:33.800 * monotonic clock: POSIX clock_gettime
1:S 11 Jan 2023 00:49:33.800 * Running mode=standalone, port=6379.
1:S 11 Jan 2023 00:49:33.800 # Server initialized
1:S 11 Jan 2023 00:49:33.800 * Reading RDB preamble from AOF file...
1:S 11 Jan 2023 00:49:33.800 * Loading RDB produced by version 6.2.4
1:S 11 Jan 2023 00:49:33.800 * RDB age 358 seconds
1:S 11 Jan 2023 00:49:33.800 * RDB memory usage when created 3.81 Mb
1:S 11 Jan 2023 00:49:33.800 * RDB has an AOF tail
1:S 11 Jan 2023 00:49:33.807 * Reading the remaining AOF tail...
1:S 11 Jan 2023 00:49:33.807 * DB loaded from append only file: 0.007 seconds
1:S 11 Jan 2023 00:49:33.807 * Ready to accept connections
1:S 11 Jan 2023 00:49:34.709 * Connecting to MASTER redis-master-0.redis-headless.api.svc.cluster.local:6379
1:S 11 Jan 2023 00:49:34.722 # Unable to connect to MASTER: Invalid argument
1:S 11 Jan 2023 00:49:35.724 * Connecting to MASTER redis-master-0.redis-headless.api.svc.cluster.local:6379
1:S 11 Jan 2023 00:49:35.739 # Unable to connect to MASTER: Invalid argument
1:S 11 Jan 2023 00:49:36.741 * Connecting to MASTER redis-master-0.redis-headless.api.svc.cluster.local:6379
1:S 11 Jan 2023 00:49:36.753 # Unable to connect to MASTER: Invalid argument
1:S 11 Jan 2023 00:49:37.755 * Connecting to MASTER redis-master-0.redis-headless.api.svc.cluster.local:6379
1:S 11 Jan 2023 00:49:37.769 # Unable to connect to MASTER: Invalid argument
1:S 11 Jan 2023 00:49:38.771 * Connecting to MASTER redis-master-0.redis-headless.api.svc.cluster.local:6379
1:S 11 Jan 2023 00:49:38.784 # Unable to connect to MASTER: Invalid argument
1:S 11 Jan 2023 00:49:39.786 * Connecting to MASTER redis-master-0.redis-headless.api.svc.cluster.local:6379
1:S 11 Jan 2023 00:49:39.786 * MASTER <-> REPLICA sync started
1:S 11 Jan 2023 00:49:39.788 * Non blocking connect for SYNC fired the event.
1:S 11 Jan 2023 00:49:39.788 * Master replied to PING, replication can continue...
1:S 11 Jan 2023 00:49:39.788 * Partial resynchronization not possible (no cached master)
1:S 11 Jan 2023 00:49:39.789 * Full resync from master: ac744f2fddf7b2617c689627592f95e4c0307923:0
1:S 11 Jan 2023 00:49:39.856 * MASTER <-> REPLICA sync: receiving 1203245 bytes from master to disk
1:S 11 Jan 2023 00:49:39.857 * MASTER <-> REPLICA sync: Flushing old data
1:S 11 Jan 2023 00:49:39.859 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 11 Jan 2023 00:49:39.878 * Loading RDB produced by version 6.2.4
1:S 11 Jan 2023 00:49:39.878 * RDB age 0 seconds
1:S 11 Jan 2023 00:49:39.878 * RDB memory usage when created 3.88 Mb
1:S 11 Jan 2023 00:49:39.884 * MASTER <-> REPLICA sync: Finished with success
1:S 11 Jan 2023 00:49:39.885 * Background append only file rewriting started by pid 12
1:S 11 Jan 2023 00:49:39.927 * AOF rewrite child asks to stop sending diffs.
12:C 11 Jan 2023 00:49:39.927 * Parent agreed to stop sending diffs. Finalizing AOF...
12:C 11 Jan 2023 00:49:39.927 * Concatenating 0.00 MB of AOF diff received from parent.
12:C 11 Jan 2023 00:49:39.927 * SYNC append only file rewrite performed
12:C 11 Jan 2023 00:49:39.928 * AOF rewrite: 0 MB of memory used by copy-on-write
1:S 11 Jan 2023 00:49:39.988 * Background AOF rewrite terminated with success
1:S 11 Jan 2023 00:49:39.988 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 11 Jan 2023 00:49:39.988 * Background AOF rewrite finished successfully
1:S 17 Jan 2023 07:00:43.442 * Starting automatic rewriting of AOF on 5477% growth
1:S 17 Jan 2023 07:00:43.443 * Background append only file rewriting started by pid 3049578
1:S 17 Jan 2023 07:00:43.490 * AOF rewrite child asks to stop sending diffs.
3049578:C 17 Jan 2023 07:00:43.490 * Parent agreed to stop sending diffs. Finalizing AOF...
3049578:C 17 Jan 2023 07:00:43.490 * Concatenating 0.00 MB of AOF diff received from parent.
3049578:C 17 Jan 2023 07:00:43.490 * SYNC append only file rewrite performed
3049578:C 17 Jan 2023 07:00:43.491 * AOF rewrite: 0 MB of memory used by copy-on-write
1:S 17 Jan 2023 07:00:43.543 * Background AOF rewrite terminated with success
1:S 17 Jan 2023 07:00:43.543 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 17 Jan 2023 07:00:43.543 * Background AOF rewrite finished successfully
Redis Master Log
1:M 10 Jan 2023 08:53:30.750 * Starting automatic rewriting of AOF on 2146% growth
1:M 10 Jan 2023 08:53:30.751 * Background append only file rewriting started by pid 802510
1:M 10 Jan 2023 08:53:30.790 * AOF rewrite child asks to stop sending diffs.
802510:C 10 Jan 2023 08:53:30.790 * Parent agreed to stop sending diffs. Finalizing AOF...
802510:C 10 Jan 2023 08:53:30.790 * Concatenating 0.00 MB of AOF diff received from parent.
802510:C 10 Jan 2023 08:53:30.790 * SYNC append only file rewrite performed
802510:C 10 Jan 2023 08:53:30.790 * AOF rewrite: 0 MB of memory used by copy-on-write
1:M 10 Jan 2023 08:53:30.852 * Background AOF rewrite terminated with success
1:M 10 Jan 2023 08:53:30.852 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:M 10 Jan 2023 08:53:30.852 * Background AOF rewrite finished successfully
1:M 11 Jan 2023 00:40:03.772 # Connection with replica 10.52.0.18:6379 lost.
1:M 11 Jan 2023 00:40:22.714 * Replica 10.52.2.144:6379 asks for synchronization
1:M 11 Jan 2023 00:40:22.714 * Full resync requested by replica 10.52.2.144:6379
1:M 11 Jan 2023 00:40:22.714 * Starting BGSAVE for SYNC with target: disk
1:M 11 Jan 2023 00:40:22.714 * Background saving started by pid 1008996
1008996:C 11 Jan 2023 00:40:22.736 * DB saved on disk
1008996:C 11 Jan 2023 00:40:22.736 * RDB: 0 MB of memory used by copy-on-write
1:M 11 Jan 2023 00:40:22.777 * Background saving terminated with success
1:M 11 Jan 2023 00:40:22.778 * Synchronization with replica 10.52.2.144:6379 succeeded
1:M 11 Jan 2023 00:43:13.518 # Connection with replica 10.52.2.144:6379 lost.
1:M 11 Jan 2023 00:43:34.213 * Replica 10.52.1.25:6379 asks for synchronization
1:M 11 Jan 2023 00:43:34.213 * Full resync requested by replica 10.52.1.25:6379
1:M 11 Jan 2023 00:43:34.213 * Starting BGSAVE for SYNC with target: disk
1:M 11 Jan 2023 00:43:34.214 * Background saving started by pid 1009685
1009685:C 11 Jan 2023 00:43:34.231 * DB saved on disk
1009685:C 11 Jan 2023 00:43:34.232 * RDB: 0 MB of memory used by copy-on-write
1:M 11 Jan 2023 00:43:34.325 * Background saving terminated with success
1:M 11 Jan 2023 00:43:34.327 * Synchronization with replica 10.52.1.25:6379 succeeded
1:signal-handler (1673398148) Received SIGTERM scheduling shutdown...
1:M 11 Jan 2023 00:49:08.347 # User requested shutdown...
1:M 11 Jan 2023 00:49:08.347 * Calling fsync() on the AOF file.
1:M 11 Jan 2023 00:49:08.347 # Redis is now ready to exit, bye bye...
1:C 11 Jan 2023 00:49:31.024 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 11 Jan 2023 00:49:31.024 # Redis version=6.2.4, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 11 Jan 2023 00:49:31.024 # Configuration loaded
1:M 11 Jan 2023 00:49:31.025 * monotonic clock: POSIX clock_gettime
1:M 11 Jan 2023 00:49:31.025 * Running mode=standalone, port=6379.
1:M 11 Jan 2023 00:49:31.025 # Server initialized
1:M 11 Jan 2023 00:49:31.044 * Reading RDB preamble from AOF file...
1:M 11 Jan 2023 00:49:31.044 * Loading RDB produced by version 6.2.4
1:M 11 Jan 2023 00:49:31.044 * RDB age 57361 seconds
1:M 11 Jan 2023 00:49:31.044 * RDB memory usage when created 5.61 Mb
1:M 11 Jan 2023 00:49:31.044 * RDB has an AOF tail
1:M 11 Jan 2023 00:49:31.093 * Reading the remaining AOF tail...
1:M 11 Jan 2023 00:49:31.824 * DB loaded from append only file: 0.799 seconds
1:M 11 Jan 2023 00:49:31.825 * Ready to accept connections
1:M 11 Jan 2023 00:49:39.788 * Replica 10.52.1.25:6379 asks for synchronization
1:M 11 Jan 2023 00:49:39.788 * Full resync requested by replica 10.52.1.25:6379
1:M 11 Jan 2023 00:49:39.788 * Replication backlog created, my new replication IDs are 'ac744f2fddf7b2617c689627592f95e4c0307923' and '0000000000000000000000000000000000000000'
1:M 11 Jan 2023 00:49:39.788 * Starting BGSAVE for SYNC with target: disk
1:M 11 Jan 2023 00:49:39.789 * Background saving started by pid 32
32:C 11 Jan 2023 00:49:39.818 * DB saved on disk
32:C 11 Jan 2023 00:49:39.818 * RDB: 0 MB of memory used by copy-on-write
1:M 11 Jan 2023 00:49:39.855 * Background saving terminated with success
1:M 11 Jan 2023 00:49:39.857 * Synchronization with replica 10.52.1.25:6379 succeeded
1:M 11 Jan 2023 00:49:40.242 * Replica 10.52.0.16:6379 asks for synchronization
1:M 11 Jan 2023 00:49:40.242 * Full resync requested by replica 10.52.0.16:6379
1:M 11 Jan 2023 00:49:40.242 * Starting BGSAVE for SYNC with target: disk
1:M 11 Jan 2023 00:49:40.242 * Background saving started by pid 33
33:C 11 Jan 2023 00:49:40.260 * DB saved on disk
33:C 11 Jan 2023 00:49:40.260 * RDB: 0 MB of memory used by copy-on-write
1:M 11 Jan 2023 00:49:40.358 * Background saving terminated with success
1:M 11 Jan 2023 00:49:40.359 * Synchronization with replica 10.52.0.16:6379 succeeded