Most usage of Redis streams is append-only, and the current listpack implementation is not efficient for this. My profile shows that the zrealloc call in lpInsert is a clear hotspot. As far as I understand, every value of every XADD command triggers a zrealloc call.
Maybe we could preallocate some memory for the listpack and store the number of used bytes in the header? The unused trailing bytes could then be ignored when serializing. Does this have any negative effects I'm not aware of?
945.00 ms 20.2% 53.00 ms streamAppendItem
689.00 ms 14.7% 90.00 ms lpInsert
499.00 ms 10.7% 59.00 ms zrealloc
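
To make the preallocation idea concrete, here is a minimal sketch in C. It is not the real listpack layout or API: the `appendbuf` struct, its fields, and the `appendbuf_*` functions are hypothetical names made up for illustration, and plain `realloc` stands in for `zrealloc`. It keeps the used byte count separately from the allocated capacity and grows the allocation geometrically, so most appends are a plain `memcpy`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical append-only buffer, NOT the actual listpack format.
 * "used" plays the role of a used-bytes field stored in the header. */
typedef struct {
    unsigned char *data;
    size_t used;      /* bytes holding valid content */
    size_t capacity;  /* bytes actually allocated (>= used) */
    size_t reallocs;  /* number of realloc calls, for illustration only */
} appendbuf;

/* Append len bytes. Returns 0 on success, -1 on overflow or
 * allocation failure (the buffer is left unchanged in that case). */
static int appendbuf_append(appendbuf *b, const void *src, size_t len) {
    if (len > SIZE_MAX - b->used) return -1;
    size_t need = b->used + len;

    if (need > b->capacity) {
        /* Grow geometrically so the number of reallocations is
         * logarithmic in the final size, not linear in the number
         * of appended values. The initial size 64 is arbitrary. */
        size_t newcap = b->capacity ? b->capacity : 64;
        while (newcap < need) {
            if (newcap > SIZE_MAX / 2) { newcap = need; break; }
            newcap *= 2;
        }
        unsigned char *p = realloc(b->data, newcap);
        if (p == NULL) return -1;
        b->data = p;
        b->capacity = newcap;
        b->reallocs++;
    }

    if (len > 0) memcpy(b->data + b->used, src, len);
    b->used = need;
    assert(b->used <= b->capacity);
    return 0;
}

/* When serializing, only the first "used" bytes are written;
 * the unused trailing capacity is ignored. */
static size_t appendbuf_serialized_len(const appendbuf *b) {
    return b->used;
}

static void appendbuf_free(appendbuf *b) {
    free(b->data);
    memset(b, 0, sizeof(*b));
}
```

With this sketch, appending 1000 values of 8 bytes each results in 8 reallocations instead of 1000. The obvious cost is the unused trailing memory in each buffer, which could be significant if there are many small listpacks. One possible mitigation, again only an assumption, would be to shrink the allocation to the used size once a node is full and no longer receives appends.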