I have a question. Suppose we are using AOF with `appendfsync no`, which means the OS decides when to flush the file's buffered writes to disk. For replication, the master maintains an in-memory backlog, and a slave issues a PSYNC from its last synced offset. Since Redis streams every write to the slave as it receives it, it is possible that the master streamed one write to a slave successfully and then the box failed (so all OS buffers for the file are lost as well). The master comes back up before the slave is promoted to master. The master then replays its AOF file, which contains fewer events than the slave has. How does Redis handle this scenario?
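To make the scenario concrete, here is a toy model in Python. It is only an illustration of the ordering problem, not Redis code: the class `ToyMaster` and all of its attributes are hypothetical names, and the "OS buffer" is just a list standing in for unflushed page cache.

```python
class ToyMaster:
    """Toy model: AOF writes sit in an OS buffer until the OS flushes them."""

    def __init__(self):
        self.os_buffer = []    # appended to the AOF, but not yet on disk
        self.aof_on_disk = []  # the part of the AOF that survives a crash
        self.replica = []      # commands already streamed to the slave

    def write(self, command):
        # The write reaches the OS buffer and the slave at about the same time.
        self.os_buffer.append(command)
        self.replica.append(command)

    def os_flush(self):
        # With appendfsync no, the OS picks the moment this happens.
        self.aof_on_disk.extend(self.os_buffer)
        self.os_buffer.clear()

    def crash_and_restart(self):
        # A machine failure drops whatever was never flushed.
        self.os_buffer.clear()
        return list(self.aof_on_disk)  # dataset rebuilt by replaying the AOF


master = ToyMaster()
master.write("SET a 1")
master.os_flush()
master.write("SET b 2")          # streamed to the slave, never flushed
recovered = master.crash_and_restart()

print(recovered)       # ['SET a 1']
print(master.replica)  # ['SET a 1', 'SET b 2']
```

After the restart the slave holds one more write than the master can recover from its AOF, which is exactly the situation asked about.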

Comment From: yossigo

This is correct, and is derived from the consistency model adopted by Redis that generally promotes performance and availability where a hard tradeoff is required.

Referring to your specific example: the only way not to lose those writes would be for the master to remain down (not accepting reads/writes) until the replica is up and some form of consensus can be reached about who is more up to date. This would of course reduce the availability of the system in this case, while still not strictly providing strong consistency, since replication is asynchronous anyway.

Applications that require strong consistency may consider the RedisRaft project which aims to provide a strongly consistent Redis deployment option (but is still under development).

Comment From: rajatgoyal247341

@yossigo so what does the current Redis implementation do when the slave is ahead of the master? Does the master tell the slave to remove a few events from its database, since we always keep slave <= master and never ahead of the master?

Comment From: yossigo

@rajatgoyal247341 As soon as the slave establishes a replication connection with the master, it will perform full sync and overwrite its previous dataset (including rewriting RDB or AOF file if configured).

Comment From: rajatgoyal247341

@yossigo but this time it will do only a partial sync and not a full sync. It will send a PSYNC for the given replication ID, which will remain the same since no failover happened.

Comment From: yossigo

@rajatgoyal247341 In this case it should not do PSYNC and should fall back to full synchronization. If that doesn't happen it's a bug...
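One way to picture that fallback is the check a master makes before accepting a partial resync. The sketch below is a simplified model in Python, not the actual Redis implementation: `MasterState` and `can_partial_resync` are hypothetical names, and it assumes that the restarted master either has a different replication ID or a backlog that does not cover the offset the slave asks for.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    replid: str         # current replication ID of the master
    backlog_start: int  # first replication offset still held in the backlog
    backlog_len: int    # number of bytes currently held in the backlog


def can_partial_resync(master: MasterState, replica_replid: str,
                       replica_offset: int) -> bool:
    """Return True if a PSYNC can continue; False means full sync."""
    # A different replication ID means a different history.
    if replica_replid != master.replid:
        return False
    # The requested offset must fall inside what the backlog still covers.
    backlog_end = master.backlog_start + master.backlog_len
    return master.backlog_start <= replica_offset <= backlog_end


# Assumed state after the restart: fresh replication ID, empty backlog.
restarted = MasterState(replid="new-id", backlog_start=1, backlog_len=0)
print(can_partial_resync(restarted, "old-id", 1500))  # False

# Even with a matching ID, an offset past the end of the backlog is refused.
same_id = MasterState(replid="old-id", backlog_start=1, backlog_len=1000)
print(can_partial_resync(same_id, "old-id", 1500))    # False
print(can_partial_resync(same_id, "old-id", 800))     # True
```

In both refused cases the slave would discard its dataset and load a fresh snapshot from the master, so the extra writes it held are dropped rather than merged.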