The upcoming CSC suggests that the invalidation message sent from the server to the client will contain a hash value of the invalidated key rather than the actual key. This should work, but it is not free and (I will try to prove) not worth it.

First, let's get out of the way the very obvious huge number of clients that have short keys by default. For such clients, any theoretical added value of hashing the key into something shorter greatly diminishes, and any effort spent doing so is mostly wasted overhead.

Now let's consider only the rarer clients that have many huge keys, and enumerate the possible benefits of hashed keys vs. raw (non-hashed) keys. I will try to make the point that even for the alleged benefits, raw keys will perform in the same order of magnitude as hashed ones, or better.

So for the remainder of this discussion, let's assume the harder case in which keys are large (as large as they can be) and therefore can presumably benefit greatly from hashing.

Network
Claim: the invalidation message will be shorter with hashed keys. Which is (sadly :) ) true. BUT it's worth remembering that clients that until today had a usage pattern of X GET requests, each with a key of size H (stands for huge), paid a network price of X*H. With CSC we can expect this to be significantly lower, perhaps even just a single GET (one true GET over the network and the rest served from the local cache). But let's make it harder (hardest). Consider the most extreme case imaginable, where hashed keys might yield the biggest impact. In this case the client gets an invalidation message after each GET, forcing it to GET the key from the server every time and to pay the network price for the invalidation message (every time). To make this argument more extreme, let's assume the hashed invalidation size is zero (free).
- Raw key (no hashing) - the client will pay the original X*H bytes in the GET requests, plus X*H for X invalidation messages each carrying a full key of size H. Total: 2*X*H on the network.
- Hashed key - the client will pay the original X*H bytes in the GET requests, plus X*0 for X invalidation messages of zero size. Total: X*H on the network.
- Without CSC at all - the client will pay the original X*H bytes in the GET requests.

... so in the worst case imaginable, communicating full keys in invalidation messages is just twice as bad as hashed keys, and hashed keys are themselves slightly worse than or equal to not having CSC at all. I.e. the order of magnitude paid on the wire, even in this fictitious worst case which we deviously engineered, is O(X*H) (i.e. the same). Again, it's worth remembering that in the usual case both hashed keys and raw keys will have a much lower network cost due to cache hits, and the delta between the hashed cost and the raw cost will be much smaller than X*H.
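To put rough numbers on this, here is a tiny sketch of the bytes on the wire in the engineered worst case; X and H are made-up values purely for illustration:

```python
# Back-of-the-envelope bytes on the wire for the engineered worst case above,
# where every GET is immediately followed by an invalidation of that key.
X = 10_000          # number of GET requests (assumed)
H = 512             # key size in bytes (assumed "huge" keys)

raw_keys = X * H + X * H    # GETs + X invalidations carrying the full key
hashed   = X * H + X * 0    # GETs + X invalidations assumed to be free
no_csc   = X * H            # plain GETs, no invalidation traffic at all

print(raw_keys, hashed, no_csc)   # 10240000 5120000 5120000 -> all O(X*H)
```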

Memory usage on the server
Claim: saving just the hash will save a lot of space on the server.
The current CSC implementation suggests having a tracking table, which is an array of 2^24 "slots" indexed by the hashed value of the key. In each slot, the clients that track a key mapped to this hash are kept, and are notified when the key is invalidated.
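As a rough mental model of that tracking table, here is a Python sketch (not the actual Redis C code; the hash function and helper names are made up for illustration):

```python
# Illustrative sketch of the hashed tracking table: 2^24 slots, each holding
# the IDs of clients that track any key whose hash lands in that slot.
NUM_SLOTS = 1 << 24

def send_invalidation(client_id: int, payload) -> None:
    pass   # placeholder: the real server pushes an invalidation message

def slot_of(key: bytes) -> int:
    return hash(key) & (NUM_SLOTS - 1)     # stand-in for the server's hash function

tracking_table = [None] * NUM_SLOTS         # preallocated: slot -> set of client IDs

def track(key: bytes, client_id: int) -> None:
    s = slot_of(key)
    if tracking_table[s] is None:
        tracking_table[s] = set()
    tracking_table[s].add(client_id)

def invalidate(key: bytes) -> None:
    s = slot_of(key)
    for client_id in tracking_table[s] or ():
        # under the hashed-key design the client is told the slot (hash),
        # not the key name itself
        send_invalidation(client_id, s)
```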

Let's distinguish between two cases and show that hashed keys are not a great idea in either of them.

Case 1: (almost) no collisions - each slot contains only one client. This is the expected case, as CSC made the tracking table big explicitly for this goal. Reminder: we already have a place to hang information per key... it's the main key->value map. Saving the clients tracking a key there is just as good and more intuitive. Added bonus: in the very likely case where there are fewer than 2^24 keys, this will actually take less memory space (as we don't reserve space for the unused slots).
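For contrast, a matching sketch of the raw-key alternative suggested here: the set of tracking clients hangs off the key's entry in the main key->value map and is allocated only when some client actually tracks the key (again illustrative names, not the real Redis structures):

```python
# Illustrative sketch of the raw-key alternative: tracking clients are stored
# on the key's entry in the main dict, and invalidation sends the key itself.
def send_invalidation(client_id: int, key: str) -> None:
    pass   # placeholder for the real push message

class Entry:
    def __init__(self, value):
        self.value = value
        self.tracking_clients = None        # no per-key cost until tracking is used

keyspace = {}                                # key -> Entry (the "main dict")

def track(key: str, client_id: int) -> None:
    entry = keyspace[key]
    if entry.tracking_clients is None:
        entry.tracking_clients = set()
    entry.tracking_clients.add(client_id)

def invalidate(key: str) -> None:
    entry = keyspace.get(key)
    if entry and entry.tracking_clients:
        for client_id in entry.tracking_clients:
            send_invalidation(client_id, key)   # the raw key goes on the wire
        entry.tracking_clients = None
```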

Case 2: either due to a huge number of keys or some very unlucky hash mapping (aka a bug), there are many collisions and every slot has many keys mapped to it. From a pure memory usage standpoint, this option actually uses a lot less memory with hashed keys, because the client list is saved per hashed key value. If we craft a worst-case scenario where each slot has K keys mapped to it, each key being tracked by C clients:
- The hashed tracking table would consume only O(C) for all K keys.
- Without hashing and a tracking table, O(C*K) is consumed.
Which sounds bad, BUT...
1. In such an extreme, the whole CSC system backfires in other ways... mainly, upon each invalidation of one of the K keys, all of them are invalidated, and a major network penalty is incurred by future cache misses.
2. If we soften the scenario to: K keys are still mapped to a single hash, but only one client tracks each key, then raw keys with no tracking table drop to O(C) memory (the same as the tracking table's price), with better performance upon invalidation.
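For concreteness, a tiny sketch of the memory comparison in this engineered collision scenario (K and C are assumed values, and the same C clients are assumed to track every one of the K colliding keys):

```python
# Back-of-the-envelope entry counts for the engineered collision scenario.
K = 1_000     # keys that map to a single slot (assumed)
C = 50        # clients tracking each key (assumed)

hashed_table_entries = C        # one slot holding C client IDs      -> O(C)
raw_key_entries      = K * C    # C client IDs stored under each key -> O(C*K)

print(hashed_table_entries, raw_key_entries)   # 50 50000
```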

But let's do it anyway. Why not? Because this feature costs a price in other aspects:
1. Communicating hashed key values with the client forces a more complicated client implementation, and at the very least memory space for mapping the hash value back to a raw key.
2. The server-side tracking table code is more complex and harder to maintain compared to just hanging the tracking-client information on existing keys.
3. The tracking table preallocates a 2^24-entry array of pointers upon the first client that tracks anything. The raw-key solution uses more memory only as demand increases. (Consider a smallish server with relatively few keys preallocating a 2^24 array...)
4. Communicating hashed key values de facto makes the hash function itself part of the API. It introduces unneeded rigidity into the API and unhealthy coupling between the client and the server implementation.
5. (As shown) for a key-hash-based tracking table to have any benefit at all, collisions must occur (or you gain nothing)... BUT when a slot with more than one tracking client is invalidated, another (more painful) price is paid by the overall system in the form of cache misses.
6. The server must look up the key in the main key->val dict anyway, so the tracking table forces an extra lookup; even a ~O(1) lookup in a huge array is overhead.

Conclusion (at least mine): having a tracking table of hashed keys on the server side and communicating hash values with the client is not a real improvement, neither in networking nor in memory. It introduces some non-trivial complexity and rigidity into the system, and forces a less intuitive API on the client. OTOH, even in artificially constructed worst-case scenarios, the more naive raw-key implementation performs in the same order of magnitude as the hashed version. Clients whose applications are "close" to these worst-case scenarios are already paying for their poor usage pattern and are either OK with it, or need to improve anyway (regardless of CSC as a whole, with or without hashing). For these reasons I suggest implementing a simpler, more intuitive raw-keys API, with no tracking table on the server side (and no translation dictionary on the client side).

Comment From: yossigo

Hi @eliblight, just to see that I understand correctly. Are you suggesting to add another pointer to all keys, which will point to a list (?) of all tracked clients for this key?

Comment From: antirez

Please make sure to read this thread with the following note: the BCAST mode communicates the whole key, since in that case it is for free. I'll comment more in the next days; for now I'm implementing BCAST inside Redis so that we have the full CSC design in place as of version 1.

Comment From: eliblight

> Please make sure to read this thread with the following note: the BCAST mode communicates the whole key, since in that case it is for free. I'll comment more in the next days; for now I'm implementing BCAST inside Redis so that we have the full CSC design in place as of version 1.

I will need to think more about BCAST... my initial feeling is that it is completely tangential to the non-BCAST discussion, i.e. it does not make any of my arguments stronger or weaker for the non-BCAST design.

Comment From: madolson

Another deficiency that I see in having both the normal tracking mode and the BCAST mode is that clients will have to implement two separate protocols for understanding evictions, instead of just evicting keys. This will hurt adoption.

I think this implementation can supplement the BCAST mode, while providing a more memory-conservative implementation for the full-key mode.

Comment From: antirez

@yossigo yep, good question. I want @eliblight to make sure to consider that any potential design will have to be evaluated similarly. That is, I didn't touch the main keyspace for reasons (including decoupling and the memory usage of the key objects when nothing like that is used), so memory usage must be evaluated, to start, imagining that we don't have an additional pointer in the keyspace. Otherwise it is not an apples-to-apples comparison at all. You can surely remodel your argument, saying that instead of having a linear array -> radix tree of client IDs, we could have a radix tree of keys -> radix tree of client IDs. Then, when there is a key explosion problem, we do what we do now, that is, have a capped collection of keys.
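A rough sketch of that shape, with plain Python dicts and sets standing in for the radix trees the server would actually use; the cap value and helper names are made up for illustration:

```python
# Illustrative sketch: tracked keys map to the clients tracking them, with a
# cap on the number of remembered keys enforced by evicting one key at random.
import random

MAX_TRACKED_KEYS = 1_000_000             # assumed cap on remembered keys

tracked = {}                              # key name -> set of client IDs

def send_invalidation(client_id: int, key: str) -> None:
    pass   # placeholder for the real push message

def evict_one_tracked_key() -> None:
    # cap the table by forgetting a key and telling its clients to drop it
    key = random.choice(list(tracked))
    for client_id in tracked.pop(key):
        send_invalidation(client_id, key)

def track(key: str, client_id: int) -> None:
    if key not in tracked and len(tracked) >= MAX_TRACKED_KEYS:
        evict_one_tracked_key()
    tracked.setdefault(key, set()).add(client_id)
```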

A good point of this design is that we can store access information sooner or later, and when evicting keys from the CSC table, we can evict the keys that are less requested. But it will use significantly more memory; we could evaluate it and understand if it's a better design anyway. I'm starting to think that this design has some validity, but not for most of the reasons exposed above, like the hash function being part of the API or client complexity, but more for one of the reasons that Eran mentioned, that is, having as a goal a limited amount of collisions.

Another thing that we can do in order to bring the memory back to the current level, while STILL having the system that Eran described, is to use a linear sorted array of IDs instead of the radix tree, up to a given number of clients.

Comment From: antirez

@madolson that's not too bad, because clients may just hash the key when received and use the other mechanism if they want a single implementation; btw your comment makes sense indeed. The second part does not IMHO: BCAST is still needed, it is a completely different system with other tradeoffs, see: https://www.youtube.com/watch?v=TW9uFIJ9xkc&t=915s

Comment From: antirez

So, even if we switch to sending the whole key, that may be a reasonable path, especially because of what I said above: Eran's point that once we reach collisions we no longer have much to gain, and the fact that I envisioned random eviction (and sending of the notifications) only later, when I started implementing this design, which can be used in order to cap the number of keys that we remember. The BCAST mode would still be super useful, because it is a different mode where we tell Redis: don't remember anything, just send me all the notifications for all the keys with such prefixes.

Comment From: eliblight

> Hi @eliblight, just to see that I understand correctly. Are you suggesting to add another pointer to all keys, which will point to a list (?) of all tracked clients for this key?

Worse :)! I suggest keeping a pointer in each key to a set of all tracking clients (a dict with an irrelevant val field)... but a list conveys the idea...

Comment From: antirez

@eliblight we can't do what you suggested, but we can modify the current implementation to use just a radix tree (of key names) of radix trees (of client IDs).

Comment From: madolson

@antirez My comment might have been miscommunicated. I think BCAST is still useful; I think it solves a different problem in the same space. I believe what Eran is proposing is more about replacing the large tracking table.

This is turning into slack :D

Comment From: antirez

And to compensate for the used memory, we can use a more efficient representation of the list of IDs.

Comment From: antirez

> @antirez My comment might have been miscommunicated. I think BCAST is still useful; I think it solves a different problem in the same space. I believe what Eran is proposing is more about replacing the large tracking table.

> This is turning into slack :D

Oh ok, I didn't understand your comment. So we are on the same page on BCAST being useful as a low-memory, high-messages mode.

Comment From: antirez

I'm now intrigued by @eliblight's main proposal of trying to send the keys. I'll evaluate the impacts, considering we can't implement stuff in a much different way (I don't want to mix CSC with the main dictionary implementation in any way, for many reasons). Probably it could be a good idea indeed. Let me reason about that a bit.

Comment From: antirez

However, note that we have to resort to random key eviction so we don't go over a maximum configured limit of keys that we store server side. But this should not hurt the feature much.

Comment From: eliblight

As far as design goes... I see no difference (for the sake of discussion) between adding to the current radix key->val+invalidated clients, or creating a duplicate of just key->invalidated clients.

I totally agree that for the sake of software design... keeping them decoupled might make a lot of sense... and I have no strong opinion either way...

Comment From: antirez

> As far as design goes... I see no difference (for the sake of discussion) between adding to the current radix key->val+invalidated clients, or creating a duplicate of just key->invalidated clients.

> I totally agree that for the sake of software design... keeping them decoupled might make a lot of sense... and I have no strong opinion either way...

The big difference is that with decoupling you need a way to limit the number of keys stored on the CSC side. But deleting a random one every time we reach the limit is not a problem AFAIK.

Comment From: yossigo

I don't think the "right" design has surfaced yet, but I think this discussion should continue, trying to reach it.

Things not to compromise:
- Zero overhead when CSC is not used

Things we can compromise:
- False positives mapping a key -> tracked clients
- Lazy update of disconnected clients
- What else?

Comment From: antirez

@yossigo well, we may not be so far from the final design, because the suggestion of @eliblight can just be taken as, TL;DR: replace indexing by hash with indexing by key name, still with a capped collection (via eviction, as we had to do anyway even with caching slots), and send the key instead of the ID. Here the main tradeoffs are: less efficiency (hashing + indexing the slot is faster than looking up) and more memory usage (but well, we have a way to decide the number of keys), VS a big simplification of the concepts to expose to the user (not my main concern) and better scalability (if you have memory, you can let clients cache as many keys as you want without compromising the precision of the caching because of collisions). So far I think that Eran may have a good point here and that such sacrifices could be in the right spirit. Initially, the reason I avoided doing this is that I missed the fact (realized later and implemented) that I could evict stuff by sending non-real eviction messages to clients.

Comment From: yossigo

Using key names on the wire makes a lot of sense as it simplifies and mainly decouples the client and server implementations. We do lose the theoretical benefit of multi-key invalidation which still bothers me a bit.

On the server side, one thing that still bothers me is that we can cap the overhead derived from the number of tracked keys, but not from the number of clients.

We could add another level of indirection between tracked keys and clients, so a key is not mapped to a specific tracking client but to a slot of clients. That would of course result in false-positive invalidations (i.e. a client receives an invalidation for a key it never tracked).
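A possible shape for that extra indirection, as a hedged sketch (the slot count, slot assignment, and helper names are all made up for illustration):

```python
# Illustrative sketch: keys map to a small fixed number of client "slots",
# so per-key overhead no longer grows with the number of tracking clients,
# at the cost of false-positive invalidations.
NUM_CLIENT_SLOTS = 128                                     # assumed

client_slots = [set() for _ in range(NUM_CLIENT_SLOTS)]    # slot -> client IDs
tracked = {}                                               # key -> set of slot indexes

def send_invalidation(client_id: int, key: str) -> None:
    pass   # placeholder for the real push message

def slot_for(client_id: int) -> int:
    return client_id % NUM_CLIENT_SLOTS

def track(key: str, client_id: int) -> None:
    s = slot_for(client_id)
    client_slots[s].add(client_id)
    tracked.setdefault(key, set()).add(s)

def invalidate(key: str) -> None:
    for s in tracked.pop(key, ()):
        # every client in the slot is notified, including clients that never
        # tracked this key: the false positives mentioned above
        for client_id in client_slots[s]:
            send_invalidation(client_id, key)
```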

Comment From: antirez

> On the server side, one thing that still bothers me is that we can cap the overhead derived from the number of tracked keys, but not from the number of clients.

I think we are already able to have a maximum cap on the number of objects, even if I tend to believe we don't need that. Right now we'll initially cap the number of keys we track, but as we touch the keys, an estimate of the average number of client IDs per key is not hard to calculate. Based on that, we can later (if really needed) cap the number of objects, either by expiring more keys depending on the amount of data they hold, or alternatively by expiring IDs from the keys and sending invalidations only to a selected amount of clients.

The reason I believe this is not going to be useful is that CSC is actually a distributed system that involves clients and whose efficiency has a lot to do with the access pattern over the whole data space: users will have to tune the tradeoff of memory they want to spend on the server side VS the latency they want to obtain, based on their exact workload. For instance, in heavy-write workloads the expires will keep coming and the server will have the opposite problem: not being able to track anything, continuously flushing the key->IDs map and sending invalidation messages.

So people who know they have N clients doing caching will probably be able to select decent initial configurations just by choosing the maximum number of keys, and from there they can adjust the value to tune for their use case. However, it is a good idea for CSC to get some observability ASAP; that is, it would be great to expose, for instance, the exact total count of tracked IDs, and that is probably trivial to do.

Comment From: eliblight

A note about capping memory by randomly (or with some more clever strategy) throwing things out and invalidating them: I can imagine scenarios where this will aggravate the client into immediately demanding the key again (i.e. GET), which will trigger another eviction on the server side, which will trigger... (turtles all the way down). I have no good suggestion for this client<->server thrashing behavior, just a heads-up warning.

The only setup in which I see a solution is a new, specialized GET_TRACKED_KEY command (suggested in the TTL discussion)... if we had such a command, it could fail when the server is crossing some redline threshold, forcing the client to do a non-tracked GET (and be aware it should not cache the result).

As said, I realize this is less clean/nice than using the current GET command, but I have nothing better to offer at the moment.

Comment From: antirez

@eliblight one way to alleviate the problem you describe, for which there is basically no solution, is the following: detect keys that have a very short client-side cached lifetime, and don't cache them for some time. This is simple but requires client smartness; however, here the client drives the whole protocol, so it makes total sense that it is able to avoid caching the wrong values based on simple statistics taken on the client side.
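One hypothetical way a client could implement that heuristic, sketched with made-up thresholds and helper names (not part of any actual client library):

```python
# Illustrative client-side heuristic: remember how quickly each key tends to
# be invalidated after caching, and skip caching keys whose values die fast.
import time

MIN_USEFUL_LIFETIME = 0.5     # seconds a value must survive to be worth caching (assumed)
PENALTY_WINDOW = 60.0         # how long to avoid caching a "hot-invalidated" key (assumed)

cached_at = {}                # key -> time the value was cached locally
do_not_cache_until = {}       # key -> time before which we skip caching

def should_cache(key: str) -> bool:
    return time.monotonic() >= do_not_cache_until.get(key, 0.0)

def on_cached(key: str) -> None:
    cached_at[key] = time.monotonic()

def on_invalidation(key: str) -> None:
    born = cached_at.pop(key, None)
    if born is not None and time.monotonic() - born < MIN_USEFUL_LIFETIME:
        # the value died almost immediately: stop caching this key for a while
        do_not_cache_until[key] = time.monotonic() + PENALTY_WINDOW
```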

Comment From: antirez

There is an implementation using keys instead of hashes in the crc2 branch here.

Comment From: madolson

This was eventually implemented.