Redis Tracking invalidations order is not consistent across commands

This is apparent when invalidation messages are sent on the same connection, as they may be received before or after the mutating command's reply.

For example, HSET delivers the reply first:

hget h f
_
hset h f 1
:1
>2
$10
invalidate
*1
$1
h

On the other hand, HDEL will deliver the invalidation first:

hget h f
$1
1
hdel h f
>2
$10
invalidate
*1
$1
h
:1

It seems the order of producing the reply, calling signalModifiedKey and creating a keyspace notification (which is also affected) is arbitrary. I'm not sure if the order makes a difference here but it should probably be consistent. @redis/core-team any thoughts?

Thanks @michael-grunder for bringing this up (and fixing hiredis to handle that)!

Comment From: oranagra

i agree. I think signalKeyModified should be always called after the modification.

Comment From: yossigo

@oranagra just to clarify, the issue is not the modification itself but the reply.

Comment From: oranagra

sorry, my bad. it's probably right that signalModifiedKey will be after modification (together with keyspace notification which issues a callback for modules). and it's probably right to also do it after the reply (many times the reply is composed inside a loop and only at the end we know what we did).

Comment From: madolson

I agree, the signalModifiedKey should always be after the key is modified.

Comment From: itamarhaber

Agreed. And even if it doesn't make sense for some reason, we should at least be consistent (in regards to when we send the invalidation).

Comment From: soloestoy

I agree, BTW we have lots of commands send invalidation before real reply, even like SET command...

Comment From: michael-grunder

Just for reference, here's the fixed logic in hiredis.

Comment From: yossigo

@madolson signalModifiedKey always gets called after the key is modified, but not consistently before or after the reply.

I had a closer look and in some cases it's a bit more difficult to solve because signalModifiedKey and replies are interleaved. The only way I see we can solve it with reasonable amount of refactoring is to create some scheduleSignalModifiedKey which will hold a queue and process it after all replies have been handled.

I wonder if anyone thinks of other problems this can address, as going through this just for the invalidation consistency is somewhat questionable.

Comment From: huangzhw

@yossigo. So where should we handle the scheduled trackingInvalidateKey. In call or beforeSleep? If in beforeSleep, there is inconsistency. If client A calls SET key value, client B calls GET key in one eventloop, before this, B should track key again, after this, not track.

Comment From: yossigo

@huangzhw I think call is the right place.

Comment From: huangzhw

@yossigo I found the problem in call. In transaction or lua (or module, I'm not sure), call may be called multiple times. So the replies may be still interleaved. So I think we should do the real trackingInvalidateKey and cleanup in processCommand after call. I'm not sure whether processCommand covers everything.

Another question is the behavior of BCAST which is handled in beforeSleep.

Comment From: madolson

It should be okay to interleave invalidation messages in between calls, since it should be syntactically correct. I think we should do it during processCommand thought to maintain transactional boundaries of that type of command.

btw, if @huangzhw if you are working on this, would you mind assigning this to yourself?

Comment From: huangzhw

@madolson I was working on it. But I encounter some problems, I stopped it. Some places like expireIfNeeded may be called in commands or beforeSleep or serverCron. How should I handle this? Distinguishing between them is a little hard.

Comment From: oranagra

i generally think we need a few concepts similar to beforeSleep to handle similar concerns. i remember recently discussing either afterClient or afterCommand with @yoav-steinberg. i.e. it was some query buffer or output buffer related maintenance job that was needed after a command, or a pipeline of commands was executed.

in this case, i don't think this job should be in processCommand since currently the job of processCommand is to perform some checks that either reject the command, or call call (when we get to call we never abort, the command is always processed and put in slowlog / stats).

So in this case the invalidation should be at the end of call, but maybe also placed in a few other places, like after processing blocked clients. i.e. an afterCommand, that's called in call and also when processing blocked clients? an additionally maybe we need some explicit flush in serverCron?

Comment From: huangzhw

@oranagra I found some problems: * Some module methds like RM_RandomKey, RM_OpenKey, moduleCloseKey, RM_SignalModifiedKey if not called in commands, we should rely on beforeSleep and serverCron? * performEvictions should also handle this after function call? * Whether module has similiar time event which should be handled?

Comment From: oranagra

yes, i think that beforeSleep and afterCommand can handle all the module API concerns, including timers, and serverCron. and yes, we need another trigger between the evictions and the command (call).

Comment From: huangzhw

For transaction, output maybe interleaved.

multi
+OK
hset h c d
+QUEUED
get a
+QUEUED
exec
*2
:1
>2
$10
invalidate
*1
$1
h
_

But I think we may have no methods to do with it.

Comment From: oranagra

First, this seems broken to me. i'm not sure if RESP3 specifies that push notifications can come inside a multi-bulk, or just between responses. the spec says:

Push data may be interleaved with any protocol data, but always at the top level, so the client will never find push data in the middle of a Map reply for instance.

so this seems broken to me. @madolson can you confirm?

secondly, i don't think it's that hard to solve. we can use a mechanism similar to fixed_time_expire, that counts the nesting level of call().

Comment From: huangzhw

If you use a mechanism similar to fixed_time_expire, there are still problems. For example:

multi
set a b
get a
exec

key a will not be tracked. In old case, it will be tracked.

Comment From: oranagra

i'm sorry, maybe i'm not familiar with the implementation details enough. I assumed Yossi's comment suggested to just postpone the sending of the push messages (without also affecting what's getting tracked and what doesn't). If that's currently coming together, i suppose we have to split it, since the problem you showed with the output of multi and push messages seems like it breaks the protocol (more severe than the title of this issue that's only about inconsistency ordering).

Comment From: huangzhw

This bug you can see https://github.com/redis/redis/issues/8935. With this scheduled signalModifiedKey, we can fix this bug but not transaction. But if we fix transaction, i.e. handle scheduled invalidate message after total transaction finished. It cause another problem above as I said. We may miss track key.

Comment From: oranagra

@huangzhw i don't see why. depends on what we wanna schedule. i'm saying that we won't schedule a call to signalModifiedKey (like is implied from Yossi's message), but rather run signalModifiedKey right away as we normally do. this will decide which keys should be invalidated and so on, but it won't push the protocol directly into the client reply buffers right away, instead it'll log somewhere which push messages should be sent, and then the beforeSleep / afterCommand will be the one that actually puts the push protocol bytes into the client output buffers.