Hi,

We had an issue with high memory usage: Redis used_memory exceeds maxmemory by about 200M.

Here is the INFO memory output:

Memory

```
used_memory:643380952
used_memory_human:613.58M
used_memory_rss:668508160
used_memory_rss_human:637.54M
used_memory_peak:643437832
used_memory_peak_human:613.63M
used_memory_peak_perc:99.99%
used_memory_overhead:2109178
used_memory_startup:791800
used_memory_dataset:641271774
used_memory_dataset_perc:99.79%
allocator_allocated:643423408
allocator_active:643690496
allocator_resident:669794304
total_system_memory:3974373376
total_system_memory_human:3.70G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:429496730
maxmemory_human:409.60M
maxmemory_policy:noeviction
allocator_frag_ratio:1.00
allocator_frag_bytes:267088
allocator_rss_ratio:1.04
allocator_rss_bytes:26103808
rss_overhead_ratio:1.00
rss_overhead_bytes:-1286144
mem_fragmentation_ratio:1.04
mem_fragmentation_bytes:25190408
mem_not_counted_for_evict:0
mem_replication_backlog:1048576
mem_clients_slaves:0
mem_clients_normal:268802
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
```
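To make the overshoot concrete, the gap can be computed directly from two fields in the output above; a quick sketch with the reported values:

```python
# Values copied from the INFO memory output above.
used_memory = 643380952   # bytes actually allocated
maxmemory = 429496730     # configured limit (409.60M)

overshoot = used_memory - maxmemory
# Roughly 204 MiB over the limit -- the "about 200M" reported above.
print(f"over maxmemory by {overshoot / 1048576:.0f}M")
```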

99% of the memory is counted as dataset, but there are no keys and no expired keys in any db. The affected Redis is a slave in an HA setup, and the master's memory is normal. After a restart, the memory goes back to normal. We have encountered this issue about 3 times, with Redis 4.0.11 and 5.0.5. Is this a memory leak, or a memory mechanism for performance reasons? And why can used memory exceed the configured maxmemory?

Comment From: oranagra

So you're saying that the db is empty (DBSIZE is 0)? Can you attach INFO ALL and MEMORY STATS?

After that, please try CLIENT KILL TYPE NORMAL to see if that helps. If not, proceed to the other client types one by one to see if any of them releases the memory. Note this will obviously drop client connections, and cause a re-sync (hopefully partial) with the master.

Comment From: oranagra

@masteroogway123 still no hint. I suspect this memory is somehow occupied in the clients' argv buffers, which are currently not accounted for in any of the metrics (https://github.com/redis/redis/pull/5159). One way to find out is to use CLIENT KILL and see if that releases the memory. Please try: CLIENT KILL TYPE NORMAL. If that doesn't help, proceed with the other types one by one: REPLICA, PUBSUB, MASTER.

Comment From: trevor211

What's the value of the replica-ignore-maxmemory config?

Comment From: oranagra

@trevor211 why does it matter? The keyspace is empty and lazyfree_pending_objects is 0 too.

Comment From: trevor211

Because I noticed that the reported dataset.bytes is very big, close to used memory. You are right, replica-ignore-maxmemory does not matter @oranagra. I agree that something is missing when we account for overhead memory.

Comment From: masteroogway123

Hi @oranagra, we encountered the issue again after the two HA servers restarted. DBSIZE is also 0, but the memory is about 600M and the RDB file is about 220K.

Here is the server info:

```
1.1.1.46:6379> client list
id=92090 addr=1.1.1.47:40411 fd=8 name= age=23279 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=replconf
id=12249 addr=1.1.1.48:47551 fd=9 name=sentinel-39b93990-cmd age=23344 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
id=43159 addr=1.1.1.49:46971 fd=11 name=sentinel-c0fa0fec-cmd age=23319 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=publish
id=7233261 addr=1.1.7.36:54650 fd=16 name= age=15209 idle=15209 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=auth
id=13645608 addr=1.1.7.36:42162 fd=18 name= age=8009 idle=8009 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=auth
id=43160 addr=1.1.0.49:40955 fd=12 name=sentinel-c0fa0fec-pubsub age=23319 idle=0 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
id=12250 addr=1.1.1.48:37109 fd=10 name=sentinel-39b93990-pubsub age=23344 idle=0 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
id=80120 addr=1.1.1.50:56732 fd=13 name=sentinel-0a391159-cmd age=23289 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
id=80121 addr=1.1.1.50:60751 fd=14 name=sentinel-0a391159-pubsub age=23289 idle=0 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
id=20729957 addr=1.1.0.47:45986 fd=19 name= age=75 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=32742 obl=0 oll=0 omem=0 events=r cmd=client
id=20795104 addr=1.1.7.123:60160 fd=15 name= age=0 idle=0 flags=c db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=5 oll=0 omem=0 events=r cmd=client
id=20795105 addr=1.1.7.123:60162 fd=17 name= age=0 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=NULL
```
```
1.1.1.46:6379> memory stats
 1) "peak.allocated"
 2) (integer) 501668864
 3) "total.allocated"
 4) (integer) 501575672
 5) "startup.allocated"
 6) (integer) 791784
 7) "replication.backlog"
 8) (integer) 1048576
 9) "clients.slaves"
10) (integer) 49710
11) "clients.normal"
12) (integer) 300468
13) "aof.buffer"
14) (integer) 0
15) "lua.caches"
16) (integer) 0
17) "overhead.total"
18) (integer) 2190538
19) "keys.count"
20) (integer) 0
21) "keys.bytes-per-key"
22) (integer) 0
23) "dataset.bytes"
24) (integer) 499385134
25) "dataset.percentage"
26) "99.720687866210938"
27) "peak.percentage"
28) "99.981422424316406"
29) "allocator.allocated"
30) (integer) 501567192
31) "allocator.active"
32) (integer) 501788672
33) "allocator.resident"
34) (integer) 524017664
35) "allocator-fragmentation.ratio"
36) "1.0004415512084961"
37) "allocator-fragmentation.bytes"
38) (integer) 221480
39) "allocator-rss.ratio"
40) "1.0442994832992554"
41) "allocator-rss.bytes"
42) (integer) 22228992
43) "rss-overhead.ratio"
44) "0.99543511867523193"
45) "rss-overhead.bytes"
46) (integer) -2392064
47) "fragmentation"
48) "1.0401060581207275"
49) "fragmentation.bytes"
50) (integer) 20113656
1.1.1.46:6379> info
```
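As a sanity check on the MEMORY STATS output above: dataset.bytes is total.allocated minus overhead.total, and here (with an empty keyspace) overhead.total is just the sum of the listed overhead components. That is why an empty instance can still show a huge "dataset" — any allocation not explicitly tracked lands in that bucket. A quick verification with the numbers above:

```python
# Values copied from the MEMORY STATS output above.
total_allocated = 501575672
startup = 791784
repl_backlog = 1048576
clients_slaves = 49710
clients_normal = 300468
aof_buffer = 0
lua_caches = 0

overhead_total = (startup + repl_backlog + clients_slaves
                  + clients_normal + aof_buffer + lua_caches)
assert overhead_total == 2190538        # matches "overhead.total"

# dataset.bytes is "everything else": total minus tracked overhead,
# so untracked allocations (e.g. client argv buffers) inflate it.
dataset_bytes = total_allocated - overhead_total
assert dataset_bytes == 499385134       # matches "dataset.bytes"
```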

Server

```
redis_version:5.0.5
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:f382302517001576
redis_mode:standalone
os:Linux 3.10.0-693.21.1.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:8.3.0
process_id:188
run_id:f50cc089a55b45ddf7267d4724f6ba383cdd3699
tcp_port:6379
uptime_in_seconds:31319
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:6246841
executable:/home/redis/bin/redis-server
config_file:/home/redis/volume/data/conf/9ac03b3e-262c-4b16-bd7e-9b8f73f36809-haredis-0/redis.conf
version:3.05.04
config_timestamp:1600051101506
redis-type:0
```

Clients

```
connected_clients:10
client_recent_max_input_buffer:4
client_recent_max_output_buffer:0
blocked_clients:0
```

Memory

```
used_memory:672846984
used_memory_human:641.68M
used_memory_rss:699424768
used_memory_rss_human:667.02M
used_memory_peak:673133304
used_memory_peak_human:641.95M
used_memory_peak_perc:99.96%
used_memory_overhead:2092222
used_memory_startup:791784
used_memory_dataset:670754762
used_memory_dataset_perc:99.81%
allocator_allocated:672909424
allocator_active:673161216
allocator_resident:701505536
total_system_memory:3974373376
total_system_memory_human:3.70G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:429496730
maxmemory_human:409.60M
maxmemory_policy:noeviction
allocator_frag_ratio:1.00
allocator_frag_bytes:251792
allocator_rss_ratio:1.04
allocator_rss_bytes:28344320
rss_overhead_ratio:1.00
rss_overhead_bytes:-2080768
mem_fragmentation_ratio:1.04
mem_fragmentation_bytes:26683776
mem_not_counted_for_evict:0
mem_replication_backlog:1048576
mem_clients_slaves:16938
mem_clients_normal:234924
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
```

Persistence

```
loading:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1600051117
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:475136
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
```

Stats

```
total_connections_received:27944534
total_commands_processed:28143014
instantaneous_ops_per_sec:523
total_net_input_bytes:3229244772
total_net_output_bytes:1878911527
instantaneous_input_kbps:58.04
instantaneous_output_kbps:38.74
rejected_connections:0
sync_full:1
sync_partial_ok:0
sync_partial_err:1
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:1
pubsub_patterns:0
latest_fork_usec:281
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
```

Replication

```
role:master
connected_slaves:1
slave0:ip=1.1.1.47,port=6379,state=online,offset=6341305,lag=1
master_replid:2795782601ac4abcd872b81dc4a1c0d5f394f5ee
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:6341579
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:5293004
repl_backlog_histlen:1048576
```

CPU

```
used_cpu_sys:3728.443389
used_cpu_user:1249.485039
used_cpu_sys_children:0.001422
used_cpu_user_children:0.000000
```

Cluster

```
cluster_enabled:0
```

Keyspace

```
1.1.1.46:6379>
```

There is one clue: after the restart, a client sends AUTH frequently due to auth failures. We have tried the CLIENT KILL suggestion, but memory is still high. What shall we do next for troubleshooting?

Comment From: oranagra

@masteroogway123 is the info you provided above from right after the restart? I see quite a lot of commands and connections already handled:

```
total_connections_received:27944534
total_commands_processed:28143014
```

You say that killing all the connected clients didn't release memory?

It's probably a silly test, but let's do it anyway: copy the RDB file and start another Redis instance from it (with no clients). I expect the memory usage to be very small, but maybe we'll be surprised.

Comment From: masteroogway123

Sorry, this problem has not recurred recently, so we'll close this for now and provide more information if it appears again.