https://github.com/redis/redis/pull/11012 introduced a refactor of the Redis blocking infrastructure that enables reprocessing commands blocked on keys once those keys are signaled as ready. However, that refactor did not address several other potential improvements:

  1. Timeout handling when blocked multiple times. As part of the refactor introduced in https://github.com/redis/redis/pull/11012, a client that is unblocked after waiting for one or more keys to become ready will attempt to reprocess the command. However, the current infrastructure does not correctly handle cases in which the client is blocked again while reprocessing the command. For example, the timeout should be handled correctly in such cases: once a client is blocked again during reprocessing, it should re-register itself with the remaining timeout delta.

  2. We have many types of blocking commands, each requiring some specific handling:

     - Replication-based blocking (WAIT) requires keeping the current offset and the number of replicas to wait for. When the WAIT block starts, the client is also added to the list of clients pending acks, which is processed whenever a replica responds with its current offset. Also, whenever the WAIT command times out, it must respond with the number of replicas synced to the relevant offset.
     - Client postponing, which is used for module context commands or pause actions. When a client is blocked for reprocessing because it was postponed, it is also added to a list of postponed clients, and it must be removed from that list once the client is unblocked.
     - Key-based command blocking, which is used by blocking commands (list, zset and stream blocking commands). In this case the client is usually added to the db-specific blocking_keys dictionary, and all of its registrations must be removed once the client is unblocked because at least one key became ready.
     - Module blocking is different. Modules use callback registrations for unblock (reply) and timeout, which usually do not reprocess the command but rather call the specific callback when the module client is unblocked.

  3. Currently there is no way for a new implementation to run some logic whenever a key is ready, other than blocking the client on keys. However, there are cases in which a module or a newly proposed implementation would like to execute some logic whenever a key is ready without blocking the client's execution. Example cases are: this and this

Suggestion to refactor

I would like to propose a high-level change to the current blocking infrastructure to better handle the issues outlined above. Although this is a high-level plan and some details are missing, I would like to provide a general overview. The new design is proposed with a structure very similar to module blocking. This way we can better use the blocking infrastructure to implement the complicated module blocking infrastructure:

Separate the ready-key infrastructure from the blocking infrastructure. I suggest separating the key signaling infra from the blocking infra and providing a new API for registering callbacks to be called whenever a key is ready. We can add the following functionality:

void registerForReadyKeys(
      client *c,             // the relevant client
      int db,                // the ID of the database for which the signal is requested
      robj **keys,           // the array of keys to be notified about when ready
      int numkeys,           // the number of keys in the keys array
      mstime_t timeout,      // the unix time at which the timeout_callback should be called (and the client will be unregistered)
      int notify_on_nokey,   // indicates that the unblock_callback is to be called when a key is deleted
      Func unblock_callback, // callback to be called when the key is signaled as ready
      Func timeout_callback, // callback to be called when the timeout passes without any key becoming ready
      void *private_data     // specific data that will be passed to the callback functions
);
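To make the intended semantics more concrete, here is a minimal standalone sketch of such a registry. It is not Redis code and not a proposed implementation: keys are plain C strings instead of robj, the client is reduced to an integer id, notify_on_nokey is left out, and the fixed-size table, the `ReadyKeyFunc` callback type and the `signalKeyReady` / `expireReadyKeyRegistrations` helpers are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_REGS 16
#define MAX_KEYS 4

/* Hypothetical callback type standing in for the proposal's `Func`. */
typedef void (*ReadyKeyFunc)(int client_id, const char *key, void *private_data);

typedef struct {
    int in_use;
    int client_id;
    int db;
    const char *keys[MAX_KEYS];
    int numkeys;
    long long timeout;              /* absolute deadline in ms, 0 = never */
    ReadyKeyFunc unblock_callback;
    ReadyKeyFunc timeout_callback;
    void *private_data;
} readyKeyReg;

static readyKeyReg regs[MAX_REGS];

/* Store one registration. Returns 0 on success, -1 if the table is full
 * or the number of keys is out of range. */
int registerForReadyKeys(int client_id, int db, const char **keys, int numkeys,
                         long long timeout, ReadyKeyFunc unblock_callback,
                         ReadyKeyFunc timeout_callback, void *private_data) {
    if (numkeys <= 0 || numkeys > MAX_KEYS) return -1;
    for (int i = 0; i < MAX_REGS; i++) {
        if (regs[i].in_use) continue;
        readyKeyReg *r = &regs[i];
        r->in_use = 1;
        r->client_id = client_id;
        r->db = db;
        r->numkeys = numkeys;
        for (int k = 0; k < numkeys; k++) r->keys[k] = keys[k];
        r->timeout = timeout;
        r->unblock_callback = unblock_callback;
        r->timeout_callback = timeout_callback;
        r->private_data = private_data;
        return 0;
    }
    return -1;
}

/* A key became ready: drop every registration in `db` that watches it and
 * call its unblock callback once. Returns the number of callbacks fired. */
int signalKeyReady(int db, const char *key) {
    int fired = 0;
    for (int i = 0; i < MAX_REGS; i++) {
        if (!regs[i].in_use || regs[i].db != db) continue;
        for (int k = 0; k < regs[i].numkeys; k++) {
            if (strcmp(regs[i].keys[k], key) != 0) continue;
            readyKeyReg snapshot = regs[i];
            regs[i].in_use = 0;     /* the slot is released before the callback runs */
            snapshot.unblock_callback(snapshot.client_id, key, snapshot.private_data);
            fired++;
            break;
        }
    }
    return fired;
}

/* Fire the timeout callback of every registration whose deadline has passed. */
int expireReadyKeyRegistrations(long long now) {
    int fired = 0;
    for (int i = 0; i < MAX_REGS; i++) {
        if (!regs[i].in_use || regs[i].timeout == 0 || regs[i].timeout > now) continue;
        readyKeyReg snapshot = regs[i];
        regs[i].in_use = 0;
        snapshot.timeout_callback(snapshot.client_id, NULL, snapshot.private_data);
        fired++;
    }
    return fired;
}

/* Example callbacks: private_data points at two counters {ready, timed_out}. */
void onReady(int client_id, const char *key, void *private_data) {
    (void)client_id; (void)key;
    ((int *)private_data)[0]++;
}

void onTimeout(int client_id, const char *key, void *private_data) {
    (void)client_id; (void)key;
    ((int *)private_data)[1]++;
}
```

Note that nothing in this sketch blocks the caller: a registration only says "call me back", which is what would let a module or a new feature react to a ready key without suspending a client.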

The blocking infra can then be built on top of the ready-keys infrastructure to provide the same blocking service. In this case the timeout and per-key registration will work the same as they do today in blockForKeys, except that the blocking infra will use the key signaling infrastructure.

Both client blocking APIs will now accept unblock and timeout callbacks:

void blockForKeys(client *c, robj **keys, int numkeys, mstime_t timeout, int unblock_on_nokey, Func unblock_callback, Func timeout_callback, void *private_data);
void blockClient(client *c, mstime_t timeout, int unblock_on_nokey, Func unblock_callback, Func timeout_callback, void *private_data);

However, in order to manage the timeout correctly across potentially multiple blocking attempts, the original client timeout will be kept in the client blockingState and will be used on reattempts to process the command, until the client either completes processing the command or the timeout occurs.
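One way to picture this is the sketch below, which records an absolute deadline the first time a command blocks and derives the remaining delta on every later attempt. The `blockingStateSketch` struct and the `blockAgain` / `resetBlockingState` helpers are hypothetical names used only for illustration; they are not the existing Redis blockingState fields or functions.

```c
#include <assert.h>

typedef long long mstime_t;

typedef struct {
    int has_deadline;   /* set once, when the command blocks for the first time */
    mstime_t deadline;  /* absolute time in ms; 0 means "no timeout" */
} blockingStateSketch;

/* Called every time the command blocks, including re-blocks during
 * reprocessing. `timeout_ms` is the relative timeout requested by the
 * command and is only honored on the first call.
 * Returns the remaining time in ms to register with the timeout table,
 * 0 when there is no timeout, or -1 when the original deadline has
 * already passed and the client should be timed out instead. */
mstime_t blockAgain(blockingStateSketch *bs, mstime_t now, mstime_t timeout_ms) {
    if (!bs->has_deadline) {
        bs->has_deadline = 1;
        bs->deadline = timeout_ms > 0 ? now + timeout_ms : 0;
    }
    if (bs->deadline == 0) return 0;
    if (bs->deadline <= now) return -1;
    return bs->deadline - now;
}

/* Called when the command finally completes or times out. */
void resetBlockingState(blockingStateSketch *bs) {
    bs->has_deadline = 0;
    bs->deadline = 0;
}
```

For example, a command that blocks at t=1000 with a 5000 ms timeout and is re-blocked at t=3000 would re-register with the remaining 3000 ms rather than a fresh 5000 ms.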

We will also drop the concept of "blocking type", since there is no need to save the specific type of blocking when the block/unblock logic can be directed by the callback functions.
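As a rough illustration of how callbacks could replace a per-type switch, the sketch below keeps only two function pointers and a private data pointer on the client, and uses a WAIT-like case as the example. All names here (`clientSketch`, `BlockFunc`, `blockClientSketch`, `unblockClientSketch`, `waitStateSketch`, `waitReplyAcked`) are hypothetical and greatly simplified compared to the real client structure.

```c
#include <assert.h>
#include <stddef.h>

typedef struct clientSketch clientSketch;

/* Hypothetical callback type standing in for the proposal's `Func`. */
typedef void (*BlockFunc)(clientSketch *c, void *private_data);

struct clientSketch {
    BlockFunc unblock_callback;
    BlockFunc timeout_callback;
    void *private_data;
    long long reply;            /* stand-in for the client's reply buffer */
};

/* Block the client: no blocking type is stored, only the callbacks. */
void blockClientSketch(clientSketch *c, BlockFunc unblock_callback,
                       BlockFunc timeout_callback, void *private_data) {
    c->unblock_callback = unblock_callback;
    c->timeout_callback = timeout_callback;
    c->private_data = private_data;
}

/* Generic unblock path: pick the callback, clear the blocking state,
 * then let the callback do whatever is specific to this kind of block. */
void unblockClientSketch(clientSketch *c, int timed_out) {
    BlockFunc cb = timed_out ? c->timeout_callback : c->unblock_callback;
    void *private_data = c->private_data;
    c->unblock_callback = NULL;
    c->timeout_callback = NULL;
    c->private_data = NULL;
    if (cb != NULL) cb(c, private_data);
}

/* WAIT-like example state: how many replicas acked the target offset. */
typedef struct {
    int acked_replicas;
    int wanted_replicas;
} waitStateSketch;

/* Used for both unblock and timeout: reply with the replicas acked so far. */
void waitReplyAcked(clientSketch *c, void *private_data) {
    waitStateSketch *ws = private_data;
    c->reply = ws->acked_replicas;
}
```

In this shape, the WAIT-specific behavior on timeout (replying with the number of replicas synced so far) lives entirely in the callback, so the generic unblock path needs no knowledge of why the client was blocked.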