Redis setting and twemproxy: We have six instances of Redis(32 bit) version 3.2.3 running on the same server (AWS EC2 instance r3.xlarge) along with Twemproxy. It was running smoothly until few days. Individual instances have started giving OOM issues. Not able to identify the exact issue of Out of memory.

Redis conf:

BG save is triggered every half an hour through a script for each instance in sequence. It is turned off in conf file. Kept maxmemory at 3.5 GB for all six Redis instances. Eviction policy to volatile TTL Till now the peak usage of each Redis instance is around 2.2GB so it has still not touched the 3.5 maxmemory limit. Things I searched and fixed:

Increased max connections to 65535 vm.overcommit_memory to 1 (Can this be the reason for OOM?) THP disabled Increased ulimit From what is understand if Redis requires memory it should use it from buffer memory. Output of free command:

                total       used       free     shared    buffers    cached
   Mem:         30679      27430       3249          0        288      11481
   -/+ buffers/cache:      15660      15019
  Swap:            0          0          0

Output of top command:

Error log of Redis:

1491:M 05 Jul 21:01:55.090 * Background saving terminated with success 1491:M 05 Jul 21:19:32.121 # Out Of Memory allocating 32774 bytes!

=== REDIS BUG REPORT START: Cut & paste starting from here === 1491:M 05 Jul 21:19:32.121 # ------------------------------------------------ 1491:M 05 Jul 21:19:32.121 # !!! Software Failure. Press left mouse button to continue 1491:M 05 Jul 21:19:32.122 # Guru Meditation: "Redis aborting for OUT OF MEMORY" #server.c:3817 1491:M 05 Jul 21:19:32.122 # (forcing SIGSEGV in order to print the stack trace) 1491:M 05 Jul 21:19:32.122 # ------------------------------------------------ 1491:M 05 Jul 21:19:32.123 # Redis 3.2.3 crashed by signal: 11 1491:M 05 Jul 21:19:32.123 # Crashed running the instuction at: 0x80a5c4d 1491:M 05 Jul 21:19:32.123 # Accessing address: 0xffffffff 1491:M 05 Jul 21:19:32.123 # Failed assertion: (:0)

------ STACK TRACE ------ EIP: /usr/local/bin/redis-server *:6381(_serverPanic+0x8d)[0x80a5c4d]

Backtrace: /usr/local/bin/redis-server :6381(logStackTrace+0x35)[0x80a7675] /usr/local/bin/redis-server :6381(sigsegvHandler+0xc2)[0x80a7bf2] [0xf77dc420] /usr/local/bin/redis-server :6381(_serverPanic+0x8d)[0x80a5c4d] /usr/local/bin/redis-server :6381(redisOutOfMemoryHandler+0x3b)[0x806807b] /usr/local/bin/redis-server :6381(zmalloc+0x61)[0x8072621] /usr/local/bin/redis-server :6381(sdsMakeRoomFor+0x185)[0x8070245] /usr/local/bin/redis-server :6381(readQueryFromClient+0x87)[0x807b447] /usr/local/bin/redis-server :6381(aeProcessEvents+0x291)[0x80644f1] /usr/local/bin/redis-server :6381(aeMain+0x2c)[0x806477c] /usr/local/bin/redis-server :6381(main+0x454)[0x80614a4] /lib32/libc.so.6(__libc_start_main+0xf3)[0xf75ddad3] /usr/local/bin/redis-server *:6381[0x806175c]

------ INFO OUTPUT ------

Server

redis_version:3.2.3 redis_git_sha1:00000000 redis_git_dirty:0 redis_build_id:bcabd22adb39b278 redis_mode:standalone os:Linux 3.13.0-74-generic x86_64 arch_bits:32 multiplexing_api:epoll gcc_version:4.8.4 process_id:1491 run_id:653209594a744f112fb3c7915d3c349a2c65b912 tcp_port:6381 uptime_in_seconds:26753226 uptime_in_days:309 hz:10 lru_clock:6117348 executable:/usr/local/bin/redis-server config_file:/etc/redis/redis_6381.conf

Clients

connected_clients:20 client_longest_output_list:0 client_biggest_input_buf:0 blocked_clients:0

Memory

used_memory:2326410216 used_memory_human:2.17G used_memory_rss:2691825664 used_memory_rss_human:2.51G used_memory_peak:2340860240 used_memory_peak_human:2.18G total_system_memory:2105458688 total_system_memory_human:1.96G used_memory_lua:25600 used_memory_lua_human:25.00K maxmemory:3758096384 maxmemory_human:3.50G maxmemory_policy:volatile-ttl mem_fragmentation_ratio:1.16 mem_allocator:jemalloc-4.0.3

Persistence

loading:0 rdb_changes_since_last_save:3132 rdb_bgsave_in_progress:0 rdb_last_save_time:1499288515 rdb_last_bgsave_status:ok rdb_last_bgsave_time_sec:34 rdb_current_bgsave_time_sec:-1 aof_enabled:0 aof_rewrite_in_progress:0 aof_rewrite_scheduled:0 aof_last_rewrite_time_sec:-1 aof_current_rewrite_time_sec:-1 aof_last_bgrewrite_status:ok aof_last_write_status:ok

Stats

total_connections_received:171113 total_commands_processed:82044062 instantaneous_ops_per_sec:47 total_net_input_bytes:127272847135 total_net_output_bytes:565881208424 instantaneous_input_kbps:20.80 instantaneous_output_kbps:1176.51 rejected_connections:0 sync_full:0 sync_partial_ok:0 sync_partial_err:0 expired_keys:7334558 evicted_keys:0 keyspace_hits:35375621 keyspace_misses:8547031 pubsub_channels:0 pubsub_patterns:0 latest_fork_usec:44027 migrate_cached_sockets:0

Replication

role:master connected_slaves:0 master_repl_offset:0 repl_backlog_active:0 repl_backlog_size:1048576 repl_backlog_first_byte_offset:0 repl_backlog_histlen:0

CPU

used_cpu_sys:11970.49 used_cpu_user:8139.96 used_cpu_sys_children:12063.56 used_cpu_user_children:106498.20

Commandstats

cmdstat_get:calls=41565702,usec=241468654,usec_per_call=5.81 cmdstat_set:calls=8389482,usec=67020049,usec_per_call=7.99 cmdstat_del:calls=591225,usec=2263300,usec_per_call=3.83 cmdstat_exists:calls=287486,usec=828495,usec_per_call=2.88 cmdstat_sadd:calls=10618626,usec=25885617,usec_per_call=2.44 cmdstat_sismember:calls=771569,usec=897763,usec_per_call=1.16 cmdstat_smembers:calls=174,usec=5667,usec_per_call=32.57 cmdstat_expire:calls=16306131,usec=37164332,usec_per_call=2.28 cmdstat_scan:calls=243377,usec=3670788,usec_per_call=15.08 cmdstat_auth:calls=14946,usec=32970,usec_per_call=2.21 cmdstat_bgsave:calls=14805,usec=295936307,usec_per_call=19988.94 cmdstat_lastsave:calls=75581,usec=188956,usec_per_call=2.50 cmdstat_flushall:calls=6,usec=406220,usec_per_call=67703.33 cmdstat_info:calls=118318,usec=5922315,usec_per_call=50.05 cmdstat_monitor:calls=17,usec=19,usec_per_call=1.12 cmdstat_ttl:calls=3046578,usec=7189430,usec_per_call=2.36 cmdstat_command:calls=39,usec=19330,usec_per_call=495.64

Cluster

cluster_enabled:0

Keyspace

db0:keys=394418,expires=353679,avg_ttl=5553166456 hash_init_value: 1473181388

------ CLIENT LIST OUTPUT ------ id=12 addr=127.0.0.1:44210 fd=16 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=13 addr=127.0.0.1:44211 fd=17 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=14 addr=127.0.0.1:44212 fd=18 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=15 addr=127.0.0.1:44213 fd=19 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=16 addr=127.0.0.1:44214 fd=20 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=17 addr=127.0.0.1:44215 fd=21 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=18 addr=127.0.0.1:44219 fd=22 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=set id=19 addr=127.0.0.1:44220 fd=23 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=20 addr=127.0.0.1:44221 fd=24 name= age=26752669 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=get id=21 addr=127.0.0.1:44222 fd=25 name= age=26752669 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=2 addr=127.0.0.1:44142 fd=6 name= age=26752852 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=3 addr=127.0.0.1:44199 fd=7 name= age=26752669 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=4 addr=127.0.0.1:44201 fd=8 name= age=26752669 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=5 addr=127.0.0.1:44202 fd=9 name= age=26752669 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=6 addr=127.0.0.1:44203 fd=10 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=7 addr=127.0.0.1:44204 fd=11 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=8 addr=127.0.0.1:44205 fd=12 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=9 addr=127.0.0.1:44207 fd=13 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=10 addr=127.0.0.1:44208 fd=14 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get id=11 addr=127.0.0.1:44209 fd=15 name= age=26752669 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=get

------ REGISTERS ------ 1491:M 05 Jul 21:19:32.131 # EAX:00000000 EBX:00000000 ECX:f776b420 EDX:00000000 EDI:00000002 ESI:00008006 EBP:00000005 ESP:ff881300 SS :0000002b EFL:ff881300 EIP:080a5c4d CS:00000023 DS :0000002b ES :0000002b FS :00000000 GS:00000063 1491:M 05 Jul 21:19:32.131 # (ff88130f) -> f7010c09 1491:M 05 Jul 21:19:32.131 # (ff88130e) -> 00000ee9 1491:M 05 Jul 21:19:32.131 # (ff88130d) -> 0812a6a0 1491:M 05 Jul 21:19:32.131 # (ff88130c) -> 081280e0 1491:M 05 Jul 21:19:32.131 # (ff88130b) -> 0806807b 1491:M 05 Jul 21:19:32.131 # (ff88130a) -> 00000000 1491:M 05 Jul 21:19:32.131 # (ff881309) -> 00000002 1491:M 05 Jul 21:19:32.131 # (ff881308) -> f7010c20 1491:M 05 Jul 21:19:32.131 # (ff881307) -> e3769a00 1491:M 05 Jul 21:19:32.131 # (ff881306) -> 00000000 1491:M 05 Jul 21:19:32.131 # (ff881305) -> 00008006 1491:M 05 Jul 21:19:32.131 # (ff881304) -> 00000ee9 1491:M 05 Jul 21:19:32.131 # (ff881303) -> 0812a6a0 1491:M 05 Jul 21:19:32.131 # (ff881302) -> 081280e0 1491:M 05 Jul 21:19:32.131 # (ff881301) -> 08131cac 1491:M 05 Jul 21:19:32.131 # (ff881300) -> 00000003

------ FAST MEMORY TEST ------ 1491:M 05 Jul 21:19:32.133 # Bio thread for job type #0 terminated 1491:M 05 Jul 21:19:32.134 # Bio thread for job type #1 terminated *** Preparing to test memory region 200000 (132120576 bytes) *** Preparing to test memory region 816b000 (49152 bytes) *** Preparing to test memory region 8200000 (14680064 bytes) *** Preparing to test memory region 91a5000 (135168 bytes) *** Preparing to test memory region 9200000 (3972005888 bytes) *** Preparing to test memory region f5fff000 (8388608 bytes) *** Preparing to test memory region f6800000 (8388608 bytes) *** Preparing to test memory region f7000000 (4194304 bytes) *** Preparing to test memory region f75c3000 (4096 bytes) *** Preparing to test memory region f776c000 (12288 bytes) *** Preparing to test memory region f7789000 (8192 bytes) *** Preparing to test memory region f77d1000 (4096 bytes) *** Preparing to test memory region f77da000 (4096 bytes) *** Preparing to test memory region f77db000 (4096 bytes) *** Preparing to test memory region f7800000 (134217728 bytes) *** Preparing to test memory region ffa00000 (4194304 bytes)

Also, posted the question on stackoverflow: https://stackoverflow.com/questions/44965789/why-redis-instance-gave-out-of-memory-issue-if-server-had-plenty-of-memory-avail

Comment From: stefantalpalaru

It's a memory fragmentation issue: https://github.com/antirez/redis/issues/3079#issuecomment-187491307

Comment From: oranagra

The confusion here is that this OOM error was not a result of reaching the maxmemory setting (using 2.1 out of 3gb limit). The was a panic caused by the fact the allocator returned NULL for an request to allocate 32kb. It appears that you didn't have enough physical RAM to run so many redis instances each with a high maxmemory limit.. Closing this issue, feel free to reopen if you feel i'm wrong.