Redis Improved memory efficiency for hashmap encoded sets

Redis sets that are encoded as a hashmap will have a value that always points to NULL, since we only use the keys. Ideally we would implement a custom data structure or encoding method so that we don't need to pay those extra 8 bytes for each entry within the set. This currently introduces about a 25% additional metadata overhead per key. (Dict entries are 24 bytes + ~8 bytes of overhead for the hash buckets).

Comment From: zuiderkwast

Dicts in general are very wasteful. They have 4 pointers (32B) overhead per entry if there are no collisions. Add to this the sds header for each key and value.

When a hash encoded as listpack is converted to dict, say keys and values of size 2 (overhead 2B per listpack element), the total memory consumption per entry grows from 8B to 38B per entry, which is a factor of 4.75. If the keys and values are of size 8 (more realistic) we get a factor of 2.5.

What if we can improve the memory efficiency also for medium-large dicts (hashes/sets/zsets) by replacing the chain of dictEntry with a small listpack or something similar? If there are no collisions, this listpack will only have two elements (key and value) and if there are collisions, these are added in the same listpack. We can compact the dict further by allowing more entries per bucket and use a smaller hashtable...

Comment From: oranagra

i don't like too many mutations, but i do like this idea since it means that the dict features like SCAN and re-hashing are unchanged. Maybe the simple form of that is enough, which is to store cases of a single key per bucket as a listpack, and anything with more than a single element as a linked list of dictEntry. We can even encode the boolean flag that decides between a listpack and a dictEntry list, in the LSB of the pointer.

The main complication in all of this IMHO is gonna be the fact that many places in redis work with dictEntry directly. but maybe if we apply this feature at first just to sets and later hashes (without applying it to the main key dictionary), it would be easier to integrate

Comment From: zuiderkwast

Another (simpler) idea: We can have a 'no-values' flag in the dictType. If dictEntry is made opaque, the dict functions can handle entries without values.

Dict with only keys (no values) can also be useful in the main DB dict if we decide to move the key to robj (along with expire). One benefit is that the expire dict (or rax) can point directly to the robj.

Comment From: oranagra

i still like the listpack idea better. i don't know if we'll move the value and expiry to the robj anytime soon, or that we'll move to rax (too many issues with SCAN etc), so although simpler, if we do that trick only for set type keys, it may be a wasted effort if we also end up going the listpack direction.

for small field names and values, this listpack idea is seems great, and for bigger ones the dictEntry overhead isn't an issue.

Comment From: zuiderkwast

Right, for sets and hash we can probably do listpack. Nice that you believe in this idea. Perhaps I should prioritize it then.

But for the maib db dict we need value be a pointer to robj. I want to get rid of the next pointer and possibly group the entries in the same bucket in one allocation (same cache line). Maybe I'll try this first since it seems easier. :)

Rax for expire seems great IMO, grouped in segments of similar expire values. Not for main db dict. See Madelyn's diagrams in #10802 regarding this.