Hi guys, I have an annoying problem. I use Redis as a KV database, and we have really large data. The result shown by the INFO command is as follows:

used_memory:72373182192
used_memory_human:67.40G
used_memory_rss:120811458560
used_memory_peak:73495421496
used_memory_peak_human:68.45G
mem_fragmentation_ratio:1.67
mem_allocator:jemalloc-2.2.5

The Redis version is Redis server version 2.4.14 (00000000:0). The mem_fragmentation_ratio is too high for the data to fit in memory, since the machine has nearly 100G of memory. I have tried rebooting Redis, and it helps, but I think it is not a good idea to restart Redis periodically. Is the problem in the mem_allocator, or something else? Has this problem been fixed in a newer version?

Thanks, Jing
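For reference, the reported fragmentation ratio can be reproduced from the figures above: it is simply the resident set size divided by the memory jemalloc handed to Redis (a quick sketch, not part of the original report):

```python
# Reproduce mem_fragmentation_ratio from the INFO output above:
# RSS as seen by the OS, divided by memory allocated by Redis.
used_memory = 72373182192       # used_memory (bytes)
used_memory_rss = 120811458560  # used_memory_rss (bytes)

ratio = used_memory_rss / used_memory
print(round(ratio, 2))  # 1.67, matching the reported mem_fragmentation_ratio
```

So roughly 45 GB of resident memory is overhead on top of what Redis itself allocated.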

Comment From: antirez

Hello, it is the first time after a long time I see what appears to be an actual fragmentation issue. What is the average size of data you store? Probably you store objects that are bigger than 4k, and the allocation classes of jemalloc are not granular enough.

Another problem could be that the Redis version you are using is too old; the new 2.6.x series includes an updated version of jemalloc that may (or may not) perform better under these conditions.

The first step is definitely to upgrade Redis to latest version and see if this makes a difference.

Cheers, Salvatore

Comment From: antirez

p.s. some info about the commands you use most would be helpful as well... Btw, now I'm realizing that 2.6 has other changes as well, not just jemalloc, that could improve your experience, especially if you use the APPEND command.

Comment From: zhangjing

Thanks, antirez. Since it's a production environment, last night I just restarted Redis, and the amount of used memory shrank to 70G. I think I will move to 2.6.x after some tests.
"Probably you store objects that are bigger than 4k": the object sizes can be divided into two groups; half of them are very small, nearly 100B, and the other half may be a little bigger than 4k. I am not very sure. The commands I use are limited to SET, GET, and EXPIRE.

There is another problem I sometimes come across. When I trigger the SLAVEOF command, as we know, the master will dump an RDB snapshot and then transfer it to the slave, but sometimes I find the master dumps not just once, but twice, or even more.

Looking into the log of the slave, I see: "Timeout receiving bulk data from MASTER..." I think the slave checks whether the master is alive, and if not, the slave sends the SYNC command again. I tweaked repl-timeout to 600, but I haven't yet verified that it works.
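For what it's worth, the tweak described above is a single line in the slave's redis.conf (600 is the value from this comment, not a recommended default; it should exceed the time the master needs to dump and transfer the RDB file):

```
# Raise the replication timeout so a slow RDB dump/transfer
# does not make the slave abort and re-send SYNC.
repl-timeout 600
```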

Thanks Jing

Comment From: antirez

Hey Jing!

About fragmentation, it is possible that the problem is the objects over 4k; it is known that jemalloc will waste a considerable amount of memory in that case because, I think, it allocates 8k for, for instance, a 5k allocation.
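A rough sketch of that rounding behaviour (this assumes old jemalloc rounds large allocations up to multiples of its 4 KiB page, as the comment above suggests; it is an illustration, not a statement about any specific jemalloc version):

```python
import math

PAGE = 4096  # assumed jemalloc page size for large allocations

def allocated_size(request: int) -> int:
    """Round a large allocation request up to a whole number of pages."""
    return math.ceil(request / PAGE) * PAGE

request = 5 * 1024                        # a 5k object, as in the example above
print(allocated_size(request))            # 8192: the 5k object occupies 8k
print(allocated_size(request) - request)  # 3072 bytes wasted per such object
```

At tens of millions of such objects, that per-object waste alone would account for a sizeable share of the reported RSS overhead.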

About the replication timeout thing, yep, the repl-timeout tweak will fix it I believe, but I want to look into it more closely because this is a recurring issue with very large data sets. However, with 2.8 and partial resynchronization, a lot of these issues will go away automatically.

Please let me know what happens after the 2.6 upgrade, I'll keep this issue open for a while. Thank you.

Comment From: zhangjing

Hi antirez

I will be on vacation for Chinese New Year, and I will try version 2.6 after the vacation. The vacation will last up to 15 days. Please wait for me.

Thanks Jing

Comment From: antirez

Have a wonderful vacation @zhangjing, we'll wait for sure :-) Thanks again for the help.

Comment From: enjoy-binbin

This issue is old; I think we are good with everything now?