While working with the Redis 4.0 source code, I looked into how sdshdr5 and sdshdr8 are used for storage and ran into something that confused me. Here are the steps to reproduce the scenario:

1. Open redis-cli.
2. Type SET key value.

From my reading and observation, the value is stored with sdshdr8 and the key is stored with sdshdr5 (through dbAdd->sdsdup). I expected MEMORY USAGE to measure the key as sdshdr5. However, the key is passed in via c->argv[2]->ptr, which was first wrapped by createEmbeddedStringObject, and that function uses sdshdr8 as the struct type. So the key is measured as sdshdr8 rather than sdshdr5.

Could anyone explain the actual reason for this, and will it affect the accuracy of the MEMORY USAGE command?

Also, another question that might sound silly: why is 1<<5 (32) used as one of the thresholds for switching between struct types? Thanks.

Comment From: oranagra

@houximing You're looking at old code; it was fixed in d56c631343. The reason for 32 is that we had unused bits in the sds type byte, so for short strings the length can be stored there, which saves 2 extra bytes in the sds header.
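To make the 32 threshold concrete, here is a minimal sketch of how a length below 32 fits into the spare bits of the type byte. The SDS_* constants mirror those in sds.h; the helper names (pack_type5, unpack_type5_len, header_bytes_small) are hypothetical and exist only for this illustration, not in Redis.

```c
#include <assert.h>
#include <stddef.h>

/* Constants mirroring sds.h: the low 3 bits of the flags byte hold the type. */
#define SDS_TYPE_5    0
#define SDS_TYPE_8    1
#define SDS_TYPE_MASK 7
#define SDS_TYPE_BITS 3

/* Hypothetical helper: pack a length below 32 into a type-5 flags byte.
 * The remaining 5 bits can hold values 0..31, hence the 1<<5 threshold. */
static unsigned char pack_type5(size_t len) {
    assert(len < (1u << 5));
    return (unsigned char)(SDS_TYPE_5 | (len << SDS_TYPE_BITS));
}

/* Hypothetical helper: recover the length from a type-5 flags byte. */
static size_t unpack_type5_len(unsigned char flags) {
    return (size_t)(flags >> SDS_TYPE_BITS);
}

/* Hypothetical helper: header bytes for the two smallest header types.
 * Returns 0 for lengths this sketch does not cover (256 and above). */
static size_t header_bytes_small(size_t len) {
    if (len < (1u << 5)) return 1; /* sdshdr5: flags byte only */
    if (len < (1u << 8)) return 3; /* sdshdr8: len + alloc + flags */
    return 0;
}
```

Calling header_bytes_small(31) and header_bytes_small(32) shows the 2-byte difference between the two header layouts mentioned above.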

Comment From: houximing

@oranagra Thanks for that. I found another place that is not quite clear to me, in the latest tagged version. In objectComputeSize in object.c, the size of an embedded string object is calculated with the following statement:

    else if(o->encoding == OBJ_ENCODING_EMBSTR) {
        asize = sdslen(o->ptr)+2+sizeof(*o);
    }

As I understand it, the threshold between embedded and raw string objects is 44 bytes, so an embedded string should use sdshdr8 as its sds struct type. That means 3 bytes of header plus 1 byte for the terminating \0, which is 4. So why is it +2 here? Thank you, I look forward to hearing from you.
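The arithmetic in this question can be checked with a small sketch. It assumes the allocation request is sizeof(robj)+sizeof(struct sdshdr8)+len+1, as in createEmbeddedStringObject, and compares it with the len+2+sizeof(*o) expression quoted above. The struct and function names ending in _sketch are hypothetical stand-ins that only approximate the Redis definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for robj; about 16 bytes on a typical 64-bit build (assumption). */
typedef struct {
    unsigned type : 4;
    unsigned encoding : 4;
    unsigned lru : 24;
    int refcount;
    void *ptr;
} robj_sketch;

/* Stand-in for struct sdshdr8. Redis declares it packed; with byte-sized
 * members the header is 3 bytes either way. */
struct sdshdr8_sketch {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

/* Bytes requested at allocation time: object + sds header + string + '\0'. */
static size_t embstr_requested_sketch(size_t len) {
    assert(len <= 44); /* embstr limit discussed in the thread */
    return sizeof(robj_sketch) + sizeof(struct sdshdr8_sketch) + len + 1;
}

/* Bytes counted by the quoted expression: sdslen + 2 + sizeof(*o). */
static size_t embstr_counted_sketch(size_t len) {
    return len + 2 + sizeof(robj_sketch);
}
```

Under these assumptions the two functions differ by exactly 2 bytes for every length, independent of the size of the object struct.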

Comment From: oranagra

@houximing looks like you're right.

robj *createEmbeddedStringObject(const char *ptr, size_t len) {
    robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr8)+len+1);

But in fact, fixing the calculation to include the missing 2 bytes is not enough. What we want to do is call getStringObjectSdsUsedMemory, which uses zmalloc_size. The thing is that INFO used_memory shows memory usage that includes internal fragmentation, so MEMORY USAGE should do the same. Currently some of the calculations in that function include internal fragmentation (i.e. use zmalloc_size) and others don't. This needs to be fixed, as was recently agreed in #6198.
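As a rough illustration of internal fragmentation, the sketch below rounds a requested size up to the next allocator size class and reports the difference. The size class table and the function names are hypothetical: a real allocator's classes may differ, and usable_size_sketch is only a stand-in for what a call such as zmalloc_size would report.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative small size classes; an assumption, not a real allocator table. */
static const size_t size_classes_sketch[] = {8, 16, 32, 48, 64, 80, 96, 112, 128};
#define NUM_CLASSES_SKETCH \
    (sizeof(size_classes_sketch) / sizeof(size_classes_sketch[0]))

/* Hypothetical stand-in for the allocator's usable size: the smallest class
 * that fits the request, or the request itself above the largest class. */
static size_t usable_size_sketch(size_t requested) {
    assert(requested > 0);
    for (size_t i = 0; i < NUM_CLASSES_SKETCH; i++) {
        if (size_classes_sketch[i] >= requested) return size_classes_sketch[i];
    }
    return requested;
}

/* Internal fragmentation: bytes handed out beyond what was asked for. */
static size_t internal_fragmentation_sketch(size_t requested) {
    return usable_size_sketch(requested) - requested;
}
```

For example, assuming a 16-byte object struct, a 5-byte embedded string requests 16+3+5+1 = 25 bytes; with the table above that lands in the 32-byte class, so 7 bytes are internal fragmentation that a purely arithmetic size estimate would miss.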

Comment From: houximing

@oranagra Thank you. As far as I can see in #6198, the skiplist calculation was fixed to use zmalloc_size. For the String type (e.g. SET KEY VALUE), createEmbeddedStringObject allocates memory with zmalloc, which follows the memory alignment principle, but the MEMORY USAGE calculation does not seem to take this into account. Is this something that is already known? Thank you.

Comment From: oranagra

@houximing Yes, it is known. I just told you about it in my last message, and it was discussed in the PR I referred you to. (This "memory alignment principle" is called internal fragmentation.)

Comment From: houximing

@oranagra Thank you :)