Redis 2.8 psync

The slave log prints the following, repeating forever:

    [26451] 26 Nov 16:07:24.555 * Successful partial resynchronization with master.
    [26451] 26 Nov 16:07:24.555 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.
    [26451] 26 Nov 16:07:24.878 * Caching the disconnected master state.
    [26451] 26 Nov 16:07:25.555 * Connecting to MASTER 10.131.100.21:6666
    [26451] 26 Nov 16:07:25.555 * MASTER <-> SLAVE sync started
    [26451] 26 Nov 16:07:25.555 * Non blocking connect for SYNC fired the event.
    [26451] 26 Nov 16:07:25.555 * Master replied to PING, replication can continue...
    [26451] 26 Nov 16:07:25.555 * Trying a partial resynchronization (request c1a62846733a92943a6e415ec4471cd150be61e7:450000263).
    [26451] 26 Nov 16:07:25.556 * Successful partial resynchronization with master.
    [26451] 26 Nov 16:07:25.556 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.
    [26451] 26 Nov 16:07:25.878 * Caching the disconnected master state.
    [26451] 26 Nov 16:07:26.557 * Connecting to MASTER 10.131.100.21:6666
    [26451] 26 Nov 16:07:26.557 * MASTER <-> SLAVE sync started
    [26451] 26 Nov 16:07:26.557 * Non blocking connect for SYNC fired the event.
    [26451] 26 Nov 16:07:26.557 * Master replied to PING, replication can continue...
    [26451] 26 Nov 16:07:26.557 * Trying a partial resynchronization (request c1a62846733a92943a6e415ec4471cd150be61e7:450000263).

Can you help me? My steps were as follows:

1. Set up master and slave replication (repl-backlog-size 512mb, repl-backlog-ttl 0).
2. On the slave, run /sbin/iptables -A INPUT -s 10.131.100.21 -p tcp -j DROP
3. Wait 3 minutes.
4. Run /sbin/iptables -D INPUT 1
5. The errors start.

Comment From: antirez

Hello, could you please send the master logs as well? The answer is probably there.

Comment From: sqlcat

Master log file content (repeating forever):

    [20077] 26 Nov 14:44:01.304 # Client addr=10.16.15.165:33907 fd=27 name= age=0 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=1 omem=482345000 events=rw cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.
    [20077] 26 Nov 14:44:01.304 * Partial resynchronization request accepted. Sending 481685735 bytes of backlog starting from offset 53761426.
    [20077] 26 Nov 14:44:01.562 * Slave asks for synchronization

Comment From: antirez

Hello, the master log file clearly states the problem: scheduled to be closed ASAP for overcoming of output buffer limits.

You should check the master configuration and enlarge (or disable) the output buffer limit for slaves (see redis.conf). And the problem will go away.

Cheers, Salvatore
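
To illustrate why the master keeps dropping the slave, here is a minimal Python sketch of a hard/soft output buffer check. It is a simplified model, not Redis source code: the names `OutputBufferLimit` and `should_close` are hypothetical, and it assumes the master was running with the stock redis.conf value `client-output-buffer-limit slave 256mb 64mb 60`, which the thread does not confirm. Under that assumption, the `omem=482345000` reported in the master log is well above the 256 MB hard limit, so the connection is closed as soon as the backlog is queued, and the slave retries forever.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class OutputBufferLimit:
    """Hypothetical model of one client-output-buffer-limit entry."""
    hard_bytes: int    # 0 disables the hard limit
    soft_bytes: int    # 0 disables the soft limit
    soft_seconds: int  # how long the soft limit may be exceeded


def should_close(limit: OutputBufferLimit, used_bytes: int,
                 seconds_over_soft: int = 0) -> bool:
    """Return True if a client using used_bytes of output buffer would be closed."""
    # Hard limit: close immediately once the buffer reaches it.
    if limit.hard_bytes and used_bytes >= limit.hard_bytes:
        return True
    # Soft limit: close only if it has been exceeded for longer than soft_seconds.
    if limit.soft_bytes and used_bytes >= limit.soft_bytes:
        return seconds_over_soft > limit.soft_seconds
    return False


# Assumed stock setting: client-output-buffer-limit slave 256mb 64mb 60
ASSUMED_DEFAULT = OutputBufferLimit(256 * MB, 64 * MB, 60)
# All limits disabled: client-output-buffer-limit slave 0 0 0
DISABLED = OutputBufferLimit(0, 0, 0)

OMEM_FROM_MASTER_LOG = 482345000  # omem= value from the master log above

if __name__ == "__main__":
    print(should_close(ASSUMED_DEFAULT, OMEM_FROM_MASTER_LOG))
    print(should_close(DISABLED, OMEM_FROM_MASTER_LOG))
```

Setting all three values to 0 disables the check entirely, which avoids the loop but also removes the protection against a slow slave consuming master memory; raising the hard limit above repl-backlog-size is the more conservative choice.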

Comment From: sqlcat

I set "client-output-buffer-limit slave 0 0 0" and everything is OK now, thanks!

Comment From: antirez

Great! Closing the issue.

Comment From: hakkiyagiz

Same issue in my production environment. Thank you.

Comment From: watchpoints

Set the client output buffer limit for slaves: client-output-buffer-limit slave

Comment From: myysophia

I have a question: what is the difference between the parameters 'client-output-buffer-limit slave' and 'repl-backlog-size'? Thanks, @antirez

Comment From: myysophia

Does 'client-output-buffer-limit' affect the sync or the command propagation? I'd appreciate your help.

Comment From: oranagra

@myysophia I'm not sure I understand your question. The replica (slave) output buffers are allocated when the master executes write commands, and they keep growing if the replica doesn't read them. The typical case where they grow a lot is during a replication full sync, in which these buffers grow while the RDB portion of the full sync is being generated and processed. The setting controls the limit after which the connection is dropped by the master. The buffers are kept per replica, only while the replica is connected. The replication backlog, on the other hand, is a single, small pre-allocated buffer that holds the last bytes that were written to the replication stream, even if there are no replicas connected. It is used to allow a replica that disconnected for a short period of time to reconnect using partial sync (PSYNC) rather than doing a full sync.
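
The backlog side of this distinction can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Redis implementation: the class `ReplicationBacklog` and its methods are hypothetical, it tracks only offsets rather than the bytes themselves, and offsets start at 0 for simplicity. It shows the basic rule that a partial resync is possible only while the offset the replica asks for is still covered by the backlog.

```python
MB = 1024 * 1024


class ReplicationBacklog:
    """Hypothetical offset-only model of a fixed-size replication backlog."""

    def __init__(self, size: int) -> None:
        self.size = size      # capacity, as in repl-backlog-size
        self.end_offset = 0   # offset just past the last byte written
        self.histlen = 0      # how many bytes of history are retained

    def feed(self, nbytes: int) -> None:
        """Record nbytes of newly propagated write commands."""
        self.end_offset += nbytes
        # The buffer is circular: the oldest bytes are overwritten once it is full.
        self.histlen = min(self.size, self.histlen + nbytes)

    @property
    def start_offset(self) -> int:
        """Oldest offset that is still available in the backlog."""
        return self.end_offset - self.histlen

    def can_partial_resync(self, requested_offset: int) -> bool:
        """True if a replica asking for requested_offset can be served from the backlog."""
        return self.start_offset <= requested_offset <= self.end_offset


if __name__ == "__main__":
    backlog = ReplicationBacklog(size=1 * MB)
    backlog.feed(3 * MB)  # 3 MB of writes arrive while the replica is disconnected
    print(backlog.can_partial_resync(3 * MB - 512 * 1024))  # recent offset, still covered
    print(backlog.can_partial_resync(1 * MB))               # too old, overwritten
```

In this model a larger backlog lets a replica stay disconnected longer and still use PSYNC, but the whole missing range is then queued in that replica's output buffer at once, which is where the output buffer limit comes into play.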

Comment From: myysophia

For example, the slave log:

    26745:S 13 Jun 12:42:46.658 * The server is now ready to accept connections on port 6379
    26745:S 13 Jun 12:42:46.658 * Connecting to MASTER 10.50.10.8:6379
    26745:S 13 Jun 12:42:46.659 * MASTER <-> SLAVE sync started
    26745:S 13 Jun 12:42:46.678 * Non blocking connect for SYNC fired the event.
    26745:S 13 Jun 12:42:46.691 * Master replied to PING, replication can continue...
    26745:S 13 Jun 12:42:46.711 * Partial resynchronization not possible (no cached master)
    26745:S 13 Jun 12:42:46.872 * Full resync from master: d50a53d5f6e9e45e16a368e3f1a7f8429738379b:37683397
    26745:S 13 Jun 12:44:16.827 # Timeout receiving bulk data from MASTER... If the problem persists try to set the 'repl-timeout' parameter in redis.conf to a larger value.
    26745:S 13 Jun 12:44:16.827 * Connecting to MASTER 10.50.10.8:6379
    26745:S 13 Jun 12:44:16.827 * MASTER <-> SLAVE sync started
    26745:S 13 Jun 12:44:16.830 * Non blocking connect for SYNC fired the event.
    26745:S 13 Jun 12:45:14.171 * Master replied to PING, replication can continue...
    26745:S 13 Jun 12:45:14.186 * Partial resynchronization not possible (no cached master)
    26745:S 13 Jun 12:45:14.502 * Full resync from master: d50a53d5f6e9e45e16a368e3f1a7f8429738379b:37922331
    26745:S 13 Jun 12:47:33.222 * MASTER <-> SLAVE sync: receiving 7945675141 bytes from master
    26745:S 13 Jun 12:52:34.249 * MASTER <-> SLAVE sync: Flushing old data
    26745:S 13 Jun 12:52:34.250 * MASTER <-> SLAVE sync: Loading DB in memory
    26745:S 13 Jun 13:08:36.450 * MASTER <-> SLAVE sync: Finished with success
    26745:S 13 Jun 13:08:36.864 * Background append only file rewriting started by pid 29078
    26745:S 13 Jun 13:10:03.066 * AOF rewrite child asks to stop sending diffs.
    29078:C 13 Jun 13:10:03.066 * Parent agreed to stop sending diffs. Finalizing AOF...
    29078:C 13 Jun 13:10:03.066 * Concatenating 18.95 MB of AOF diff received from parent.
    29078:C 13 Jun 13:10:03.114 * SYNC append only file rewrite performed
    29078:C 13 Jun 13:10:03.333 * AOF rewrite: 4506 MB of memory used by copy-on-write
    26745:S 13 Jun 13:10:03.720 * Background AOF rewrite terminated with success
    26745:S 13 Jun 13:10:03.720 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
    26745:S 13 Jun 13:10:03.720 * Background AOF rewrite finished successfully

The master log:

    38017:M 13 Jun 12:39:17.755 * Slave 10.50.10.9:6379 asks for synchronization
    38017:M 13 Jun 12:39:17.755 * Full resync requested by slave 10.50.10.9:6379
    38017:M 13 Jun 12:39:17.755 * Starting BGSAVE for SYNC with target: disk
    38017:M 13 Jun 12:39:17.896 * Background saving started by pid 43154
    43154:C 13 Jun 12:41:44.643 * DB saved on disk
    43154:C 13 Jun 12:41:44.695 * RDB: 10148 MB of memory used by copy-on-write
    38017:M 13 Jun 12:41:45.181 * Background saving terminated with success
    38017:M 13 Jun 12:41:45.184 # Connection with slave 10.50.10.9:6379 lost.
    38017:M 13 Jun 12:41:45.227 * Slave 10.50.10.9:6379 asks for synchronization
    38017:M 13 Jun 12:41:45.227 * Full resync requested by slave 10.50.10.9:6379
    38017:M 13 Jun 12:41:45.228 * Starting BGSAVE for SYNC with target: disk
    38017:M 13 Jun 12:41:45.538 * Background saving started by pid 43380
    38017:M 13 Jun 12:42:22.078 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
    38017:M 13 Jun 12:43:29.079 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
    38017:M 13 Jun 12:44:00.041 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
    43380:C 13 Jun 12:44:03.731 * DB saved on disk
    43380:C 13 Jun 12:44:03.902 * RDB: 4107 MB of memory used by copy-on-write
    38017:M 13 Jun 12:44:04.254 * Background saving terminated with success
    38017:M 13 Jun 12:49:05.243 * Synchronization with slave 10.50.10.9:6379 succeeded

My server's config:

1. client-output-buffer-limit slave 1gb 256mb 120
2. repl-backlog-size 1mb

Could you give me some advice? Thanks, @oranagra

Comment From: oranagra

@myysophia I don't see any problem in your logs (the sync succeeded, after some advice about increasing repl-timeout). Anyway, assuming it's not really related to this very old issue, let's not bother all the followers; please open a new one with more details about what your problem is.

Comment From: myysophia

OK, thanks :)