Crash report Redis 5.0.7 start to crush 2 times a day.
No clue what is the problem.
Real dedicated server with Ubuntu 20.04.3 LTS and redis_version:5.0.7
Do any one have any solution for this problem?
Out Of Memory allocating 16400 bytes!
=== REDIS BUG REPORT START: Cut & paste starting from here ===
735991:M 04 Feb 2022 12:05:24.303 # ------------------------------------------------
735991:M 04 Feb 2022 12:05:24.303 # !!! Software Failure. Press left mouse button to continue
735991:M 04 Feb 2022 12:05:24.303 # Guru Meditation: Redis aborting for OUT OF MEMORY #server.c:3925
735991:M 04 Feb 2022 12:05:24.303 # (forcing SIGSEGV in order to print the stack trace)
735991:M 04 Feb 2022 12:05:24.303 # ------------------------------------------------
735991:M 04 Feb 2022 12:05:24.303 # Redis 5.0.7 crashed by signal: 11
735991:M 04 Feb 2022 12:05:24.303 # Crashed running the instruction at: 0x5573e0abf1d8
735991:M 04 Feb 2022 12:05:24.303 # Accessing address: 0xffffffffffffffff
735991:M 04 Feb 2022 12:05:24.303 # Failed assertion: <no assertion failed> (<no file>:0)
------ STACK TRACE ------
EIP:
/usr/bin/redis-server 127.0.0.1:6379(_serverPanic+0x148)[0x5573e0abf1d8]
Backtrace:
/usr/bin/redis-server 127.0.0.1:6379(logStackTrace+0x4f)[0x5573e0ac13af]
/usr/bin/redis-server 127.0.0.1:6379(sigsegvHandler+0xb5)[0x5573e0ac1bb5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fd878a593c0]
/usr/bin/redis-server 127.0.0.1:6379(_serverPanic+0x148)[0x5573e0abf1d8]
/usr/bin/redis-server 127.0.0.1:6379(redisOutOfMemoryHandler+0x3a)[0x5573e0a7733a]
/usr/bin/redis-server 127.0.0.1:6379(zmalloc+0x47)[0x5573e0a81227]
/usr/bin/redis-server 127.0.0.1:6379(+0x406a1)[0x5573e0a886a1]
/usr/bin/redis-server 127.0.0.1:6379(addReplyDouble+0xe4)[0x5573e0a89ef4]
/usr/bin/redis-server 127.0.0.1:6379(zrangeGenericCommand+0x51e)[0x5573e0aaaefe]
/usr/bin/redis-server 127.0.0.1:6379(call+0xaf)[0x5573e0a79b2f]
/usr/bin/redis-server 127.0.0.1:6379(execCommand+0xa0)[0x5573e0abe4c0]
/usr/bin/redis-server 127.0.0.1:6379(call+0xaf)[0x5573e0a79b2f]
/usr/bin/redis-server 127.0.0.1:6379(processCommand+0x556)[0x5573e0a7a436]
/usr/bin/redis-server 127.0.0.1:6379(processInputBuffer+0x171)[0x5573e0a8b0a1]
/usr/bin/redis-server 127.0.0.1:6379(aeProcessEvents+0x112)[0x5573e0a732d2]
/usr/bin/redis-server 127.0.0.1:6379(aeMain+0x2b)[0x5573e0a736fb]
/usr/bin/redis-server 127.0.0.1:6379(main+0x4de)[0x5573e0a6dade]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fd8788790b3]
/usr/bin/redis-server 127.0.0.1:6379(_start+0x2e)[0x5573e0a6dd3e]
------ INFO OUTPUT ------
# Server
redis_version:5.0.7
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:636cde3b5c7a3923
redis_mode:standalone
os:Linux 5.4.0-91-generic x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:9.2.1
process_id:735991
run_id:2d1c52d190d781a088b5c5f2fffc11bddfbfec14
tcp_port:6379
uptime_in_seconds:387322
uptime_in_days:4
hz:10
configured_hz:10
lru_clock:16586372
executable:/usr/bin/redis-server
config_file:/etc/redis/redis.conf
# Clients
connected_clients:12
client_recent_max_input_buffer:2
client_recent_max_output_buffer:57739264
blocked_clients:0
# Memory
used_memory:914899080
used_memory_human:872.52M
used_memory_rss:988934144
used_memory_rss_human:943.12M
used_memory_peak:1018331200
used_memory_peak_human:971.16M
used_memory_peak_perc:89.84%
used_memory_overhead:4830596
used_memory_startup:796240
used_memory_dataset:910068484
used_memory_dataset_perc:99.56%
allocator_allocated:911523528
allocator_active:921133056
allocator_resident:989069312
total_system_memory:67335008256
total_system_memory_human:62.71G
used_memory_lua:41984
used_memory_lua_human:41.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.01
allocator_frag_bytes:9609528
allocator_rss_ratio:1.07
allocator_rss_bytes:67936256
rss_overhead_ratio:1.00
rss_overhead_bytes:-135168
mem_fragmentation_ratio:1.09
mem_fragmentation_bytes:77643448
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:3967564
mem_aof_buffer:0
mem_allocator:jemalloc-5.2.1
active_defrag_running:0
lazyfree_pending_objects:0
# Persistence
loading:0
rdb_changes_since_last_save:11618
rdb_bgsave_in_progress:0
rdb_last_save_time:1643976309
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:4
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:67022848
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
# Stats
total_connections_received:28
total_commands_processed:243415647
instantaneous_ops_per_sec:606
total_net_input_bytes:21289486937
total_net_output_bytes:7955222927584
instantaneous_input_kbps:51.94
instantaneous_output_kbps:7.75
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:13
expired_stale_perc:0.00
expired_time_cap_reached_count:0
evicted_keys:0
keyspace_hits:2308838
keyspace_misses:562996
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:51304
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
# Replication
role:master
connected_slaves:0
master_replid:00fec0c5f2f1d8b03259a55e70f9d116be31d893
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:5862.124498
used_cpu_user:25935.504398
used_cpu_sys_children:1969.233247
used_cpu_user_children:21740.043277
# Commandstats
cmdstat_expire:calls=18216855,usec=12486181,usec_per_call=0.69
cmdstat_scan:calls=5183,usec=1698273,usec_per_call=327.66
cmdstat_zremrangebyrank:calls=66,usec=20,usec_per_call=0.30
cmdstat_zrem:calls=1229,usec=868,usec_per_call=0.71
cmdstat_del:calls=63,usec=82,usec_per_call=1.30
cmdstat_zremrangebyscore:calls=18727742,usec=261476333,usec_per_call=13.96
cmdstat_lpush:calls=18216845,usec=71587895,usec_per_call=3.93
cmdstat_smembers:calls=12896,usec=24079,usec_per_call=1.87
cmdstat_ltrim:calls=18216845,usec=44053712,usec_per_call=2.42
cmdstat_unwatch:calls=19061039,usec=4475704,usec_per_call=0.23
cmdstat_zcard:calls=224529,usec=72024,usec_per_call=0.32
cmdstat_setnx:calls=33,usec=119,usec_per_call=3.61
cmdstat_llen:calls=74843,usec=91836,usec_per_call=1.23
cmdstat_zrangebyscore:calls=648,usec=2175,usec_per_call=3.36
cmdstat_zrevrange:calls=1419118,usec=87614795,usec_per_call=61.74
cmdstat_watch:calls=10,usec=64,usec_per_call=6.40
cmdstat_exec:calls=19061029,usec=24857149394,usec_per_call=1304.08
cmdstat_command:calls=1,usec=546,usec_per_call=546.00
cmdstat_config:calls=4,usec=42,usec_per_call=10.50
cmdstat_zrange:calls=311881,usec=17758756211,usec_per_call=56940.81
cmdstat_lrange:calls=84238,usec=6204684013,usec_per_call=73656.59
cmdstat_info:calls=3,usec=618,usec_per_call=206.00
cmdstat_bgsave:calls=1,usec=48814,usec_per_call=48814.00
cmdstat_multi:calls=19061030,usec=5393238,usec_per_call=0.28
cmdstat_hset:calls=18731519,usec=40855589,usec_per_call=2.18
cmdstat_hsetnx:calls=1176,usec=1936,usec_per_call=1.65
cmdstat_zincrby:calls=10,usec=109,usec_per_call=10.90
cmdstat_exists:calls=84228,usec=186379,usec_per_call=2.21
cmdstat_hincrby:calls=36436402,usec=116822029,usec_per_call=3.21
cmdstat_ping:calls=6,usec=2,usec_per_call=0.33
cmdstat_select:calls=24,usec=5,usec_per_call=0.21
cmdstat_hdel:calls=10,usec=11,usec_per_call=1.10
cmdstat_hgetall:calls=416034,usec=2142105,usec_per_call=5.15
cmdstat_get:calls=1,usec=1,usec_per_call=1.00
cmdstat_client:calls=3,usec=8,usec_per_call=2.67
cmdstat_zadd:calls=54806686,usec=559082513,usec_per_call=10.20
cmdstat_hget:calls=243417,usec=350278,usec_per_call=1.44
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=657,expires=0,avg_ttl=0
db1:keys=533,expires=73,avg_ttl=138730199
Additional information
- OS distribution and version - Ubuntu 20.04.3 LTS CPU: Intel Xeon-E 2388G - 8c/16t - 3.2 GHz/4.6 GHz Memory: DDR4 64Gb SSD: 1.92 TB SSD NVMe
Comment From: oranagra
Looks like redis attempted to allocate some 16kb of memory and got rejected. I see you have 64gb on the machine, and that redis takes roughly 1gb. Any chance the rest is used? Is over-commit / swap enabled?
Comment From: AlexSpain
Looks like redis attempted to allocate some 16kb of memory and got rejected. I see you have 64gb on the machine, and that redis takes roughly 1gb. Any chance the rest is used? Is over-commit / swap enabled?
System use 10 GB memory / Total 64Gb Swap partition use 10Mb / Total 512Mb
Comment From: oranagra
@AlexSpain i can't think of anything else, other than maybe memory corruption. does it reproduce? if it does, maybe you can try a more recent version of Redis?
Comment From: AlexSpain
@AlexSpain i can't think of anything else, other than maybe memory corruption. does it reproduce? if it does, maybe you can try a more recent version of Redis?
It crash 2 times now. I not have option at Ubuntu 20.04.3 LTS to update redis-server it with apt-get upgrade
Comment From: AlexSpain
I update to latest redis_version: 6.2.6 and it crush again
=== REDIS BUG REPORT START: Cut & paste starting from here ===
863380:M 13 Feb 2022 07:08:23.339 # ------------------------------------------------
863380:M 13 Feb 2022 07:08:23.339 # !!! Software Failure. Press left mouse button to continue
863380:M 13 Feb 2022 07:08:23.339 # Guru Meditation: Redis aborting for OUT OF MEMORY. Allocating 16400 bytes! #server.c:5967
------ STACK TRACE ------
Backtrace:
------ INFO OUTPUT ------
# Server
redis_version:6.2.6
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:557672d61c1e18ba
redis_mode:standalone
os:Linux 5.4.0-91-generic x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:c11-builtin
gcc_version:9.3.0
process_id:863380
process_supervised:systemd
run_id:d6446aef57022c6765878fc5b733dc973193d668
tcp_port:6379
server_time_usec:1644736103307240
uptime_in_seconds:235334
uptime_in_days:2
hz:10
configured_hz:10
lru_clock:568935
executable:/usr/bin/redis-server
config_file:/etc/redis/redis.conf
io_threads_active:0
# Clients
connected_clients:12
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:48
client_recent_max_output_buffer:57765936
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
# Memory
used_memory:849379576
used_memory_human:810.03M
used_memory_rss:888287232
used_memory_rss_human:847.14M
used_memory_peak:957961024
used_memory_peak_human:913.58M
used_memory_peak_perc:88.67%
used_memory_overhead:1125744
used_memory_startup:810016
used_memory_dataset:848253832
used_memory_dataset_perc:99.96%
allocator_allocated:812953864
allocator_active:818737152
allocator_resident:885293056
total_system_memory:67335008256
total_system_memory_human:62.71G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.01
allocator_frag_bytes:5783288
allocator_rss_ratio:1.08
allocator_rss_bytes:66555904
rss_overhead_ratio:1.00
rss_overhead_bytes:2994176
mem_fragmentation_ratio:1.09
mem_fragmentation_bytes:75780576
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:246008
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:24149
rdb_bgsave_in_progress:0
rdb_last_save_time:1644736069
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:4
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:51036160
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
# Stats
total_connections_received:16
total_commands_processed:143569964
instantaneous_ops_per_sec:649
total_net_input_bytes:12521880376
total_net_output_bytes:8634990129194
instantaneous_input_kbps:55.48
instantaneous_output_kbps:13472.71
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:9
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:7201
evicted_keys:0
keyspace_hits:2269668
keyspace_misses:406717
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:52570
total_forks:3656
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:360
dump_payload_sanitizations:0
total_reads_processed:44499240
total_writes_processed:149819145
io_threaded_reads_processed:0
io_threaded_writes_processed:0
# Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:571c29093e08c4f23519105cc5db0bd396cef106
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:3302.327321
used_cpu_user:22073.959274
used_cpu_sys_children:1355.464370
used_cpu_user_children:11653.150833
used_cpu_sys_main_thread:3037.410686
used_cpu_user_main_thread:22052.439782
# Modules
# Commandstats
cmdstat_unwatch:calls=11336287,usec=2801652,usec_per_call=0.25,rejected_calls=0,failed_calls=0
cmdstat_zadd:calls=31912233,usec=318208277,usec_per_call=9.97,rejected_calls=194,failed_calls=0
cmdstat_zcard:calls=136916,usec=46330,usec_per_call=0.34,rejected_calls=3,failed_calls=0
cmdstat_hget:calls=254326,usec=344090,usec_per_call=1.35,rejected_calls=1,failed_calls=0
cmdstat_client:calls=2,usec=3,usec_per_call=1.50,rejected_calls=0,failed_calls=0
cmdstat_ping:calls=3,usec=0,usec_per_call=0.00,rejected_calls=0,failed_calls=0
cmdstat_smembers:calls=7844,usec=12939,usec_per_call=1.65,rejected_calls=0,failed_calls=0
cmdstat_llen:calls=45638,usec=52129,usec_per_call=1.14,rejected_calls=1,failed_calls=0
cmdstat_del:calls=37,usec=53,usec_per_call=1.43,rejected_calls=0,failed_calls=0
cmdstat_zrangebyscore:calls=394,usec=1370,usec_per_call=3.48,rejected_calls=0,failed_calls=0
cmdstat_scan:calls=3168,usec=471799,usec_per_call=148.93,rejected_calls=0,failed_calls=0
cmdstat_setnx:calls=25,usec=98,usec_per_call=3.92,rejected_calls=0,failed_calls=0
cmdstat_exists:calls=104281,usec=219506,usec_per_call=2.10,rejected_calls=5,failed_calls=0
cmdstat_lpush:calls=10603571,usec=39891705,usec_per_call=3.76,rejected_calls=0,failed_calls=0
cmdstat_bgsave:calls=1,usec=35544,usec_per_call=35544.00,rejected_calls=0,failed_calls=0
cmdstat_hincrby:calls=21208536,usec=45546797,usec_per_call=2.15,rejected_calls=0,failed_calls=0
cmdstat_multi:calls=11336284,usec=2930938,usec_per_call=0.26,rejected_calls=0,failed_calls=0
cmdstat_info:calls=3,usec=115,usec_per_call=38.33,rejected_calls=0,failed_calls=0
cmdstat_ttl:calls=2,usec=5,usec_per_call=2.50,rejected_calls=0,failed_calls=0
cmdstat_zincrby:calls=4,usec=33,usec_per_call=8.25,rejected_calls=0,failed_calls=0
cmdstat_hset:calls=10916587,usec=15430418,usec_per_call=1.41,rejected_calls=0,failed_calls=0
cmdstat_zrem:calls=599,usec=520,usec_per_call=0.87,rejected_calls=0,failed_calls=0
cmdstat_zrevrange:calls=1249027,usec=50775591,usec_per_call=40.65,rejected_calls=4,failed_calls=0
cmdstat_zremrangebyscore:calls=11032163,usec=118243706,usec_per_call=10.72,rejected_calls=143,failed_calls=0
cmdstat_expire:calls=10603575,usec=7864793,usec_per_call=0.74,rejected_calls=0,failed_calls=0
cmdstat_hdel:calls=4,usec=3,usec_per_call=0.75,rejected_calls=0,failed_calls=0
cmdstat_hsetnx:calls=566,usec=529,usec_per_call=0.93,rejected_calls=0,failed_calls=0
cmdstat_zrange:calls=249716,usec=14365210893,usec_per_call=57526.19,rejected_calls=1,failed_calls=0
cmdstat_ltrim:calls=10603571,usec=20368767,usec_per_call=1.92,rejected_calls=0,failed_calls=0
cmdstat_watch:calls=4,usec=17,usec_per_call=4.25,rejected_calls=0,failed_calls=0
cmdstat_hgetall:calls=523952,usec=2923632,usec_per_call=5.58,rejected_calls=7,failed_calls=0
cmdstat_config:calls=9,usec=83,usec_per_call=9.22,rejected_calls=0,failed_calls=0
cmdstat_get:calls=2,usec=1,usec_per_call=0.50,rejected_calls=0,failed_calls=0
cmdstat_exec:calls=11336283,usec=20586679680,usec_per_call=1816.00,rejected_calls=0,failed_calls=1
cmdstat_select:calls=15,usec=10,usec_per_call=0.67,rejected_calls=0,failed_calls=0
cmdstat_type:calls=2,usec=2,usec_per_call=1.00,rejected_calls=0,failed_calls=0
cmdstat_zremrangebyrank:calls=50,usec=19,usec_per_call=0.38,rejected_calls=0,failed_calls=0
cmdstat_lrange:calls=104284,usec=5762524756,usec_per_call=55258.00,rejected_calls=0,failed_calls=0
# Errorstats
errorstat_EXECABORT:count=1
errorstat_LOADING:count=359
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=657,expires=0,avg_ttl=0
db1:keys=596,expires=90,avg_ttl=111234375
------ CLIENT LIST OUTPUT ------
id=7 addr=127.0.0.1:25350 laddr=127.0.0.1:6379 fd=13 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=8 addr=127.0.0.1:25354 laddr=127.0.0.1:6379 fd=14 name= age=235334 idle=0 flags=x db=1 sub=0 psub=0 multi=7 qbuf=681 qbuf-free=40273 argv-mem=4 obl=205 oll=1800 omem=36907200 tot-mem=36968676 events=r cmd=exec user=default redir=-1
id=9 addr=127.0.0.1:25356 laddr=127.0.0.1:6379 fd=15 name= age=235334 idle=1 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=10 addr=127.0.0.1:25370 laddr=127.0.0.1:6379 fd=9 name= age=235333 idle=0 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=11 addr=127.0.0.1:25372 laddr=127.0.0.1:6379 fd=16 name= age=235333 idle=1 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=12 addr=127.0.0.1:25374 laddr=127.0.0.1:6379 fd=17 name= age=235333 idle=1 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=13 addr=127.0.0.1:25376 laddr=127.0.0.1:6379 fd=18 name= age=235333 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=17 addr=127.0.0.1:41736 laddr=127.0.0.1:6379 fd=20 name= age=194064 idle=193974 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=bgsave user=default redir=-1
id=14 addr=127.0.0.1:25518 laddr=127.0.0.1:6379 fd=8 name= age=235309 idle=105 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20528 events=r cmd=zrangebyscore user=default redir=-1
id=4 addr=127.0.0.1:25344 laddr=127.0.0.1:6379 fd=10 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20520 events=r cmd=zremrangebyscore user=default redir=-1
id=5 addr=127.0.0.1:25346 laddr=127.0.0.1:6379 fd=11 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=6 addr=127.0.0.1:25348 laddr=127.0.0.1:6379 fd=12 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
------ CURRENT CLIENT INFO ------
id=8 addr=127.0.0.1:25354 laddr=127.0.0.1:6379 fd=14 name= age=235334 idle=0 flags=x db=1 sub=0 psub=0 multi=7 qbuf=681 qbuf-free=40273 argv-mem=4 obl=205 oll=1800 omem=36907200 tot-mem=36968676 events=r cmd=exec user=default redir=-1
------ MODULES INFO OUTPUT ------
------ FAST MEMORY TEST ------
863380:M 13 Feb 2022 07:08:23.340 # Redis 6.2.6 crashed by signal: 6, si_code: -6
863380:M 13 Feb 2022 07:08:23.340 # Killed by PID: 863380, UID: 112
863380:M 13 Feb 2022 07:08:23.340 # Crashed running the instruction at: 0x7f791714918b
------ STACK TRACE ------
EIP:
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f791714918b]
Backtrace:
------ REGISTERS ------
863380:M 13 Feb 2022 07:08:23.340 #
RAX:0000000000000000 RBX:00007f7916f67000
RCX:00007f791714918b RDX:0000000000000000
RDI:0000000000000002 RSI:00007ffc69265470
RBP:00007ffc69265770 RSP:00007ffc69265470
R8 :0000000000000000 R9 :00007ffc69265470
R10:0000000000000008 R11:0000000000000246
R12:0000000000001000 R13:ffffffffffffffff
R14:00007f78cc38ce1d R15:00007f78f2e3bec0
RIP:00007f791714918b EFL:0000000000000246
CSGSFS:002b000000000033
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc6926547f) -> 00007f79172667c0
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc6926547e) -> 0000000000000001
Comment From: AlexSpain
Feb 13 07:08:23 ns5007838 redis-server[863380]: libgcc_s.so.1 must be installed for pthread_cancel to work Feb 13 07:08:23 ns5007838 redis-server[863380]: libgcc_s.so.1 must be installed for pthread_cancel to work Feb 13 07:08:23 ns5007838 systemd[1]: redis-server.service: Main process exited, code=dumped, status=6/ABRT Feb 13 07:08:23 ns5007838 systemd[1]: redis-server.service: Failed with result 'core-dump'. Feb 13 07:08:23 ns5007838 systemd[1]: redis-server.service: Scheduled restart job, restart counter is at 1. Feb 13 07:08:23 ns5007838 systemd[1]: Stopped Advanced key-value store. Feb 13 07:08:23 ns5007838 systemd[1]: Starting Advanced key-value store...
Comment From: TomeC
You can download the source code and compile it yourself.
Comment From: oranagra
@yossigo @yoav-steinberg maybe you can think of something i'm missing here. it fails allocating some 16k, and we see redis is consuming around 1gb and the system has some 64gb.
Comment From: yossigo
@AlexSpain Can you share the systemd unit file used to start Redis? Is it possible that you specify some systemd-level memory limit?
Comment From: yoav-steinberg
@AlexSpain Can you provide more details about the use case? It seems like the crash happens when trying to allocate a new buffer for the output buffer linked list which consists of 16400 byte buffers. This seems to be happening because of a multi-exec transaction which includes lots of commands in it, including a zrangebyscore which is attempting to write a score beyond the 16k buffer boundary prompting an allocation of a new buffer. If I could better understand what's happening in that multi-exec I might be able to try to recreate this.
Comment From: oranagra
@AlexSpain the crash log from redis 6.2.6 is also missing the stack trace, is it a copy paste error? or was it really missing?
Comment From: AlexSpain
I report crash log again
874543:C 13 Feb 2022 07:05:39.068 * DB saved on disk
874543:C 13 Feb 2022 07:05:39.177 * RDB: 19 MB of memory used by copy-on-write
863380:M 13 Feb 2022 07:05:39.306 * Background saving terminated with success
863380:M 13 Feb 2022 07:06:40.066 * 10000 changes in 60 seconds. Saving...
863380:M 13 Feb 2022 07:06:40.125 * Background saving started by pid 874544
874544:C 13 Feb 2022 07:06:44.146 * DB saved on disk
874544:C 13 Feb 2022 07:06:44.264 * RDB: 68 MB of memory used by copy-on-write
863380:M 13 Feb 2022 07:06:44.405 * Background saving terminated with success
863380:M 13 Feb 2022 07:07:45.000 * 10000 changes in 60 seconds. Saving...
863380:M 13 Feb 2022 07:07:45.053 * Background saving started by pid 874546
874546:C 13 Feb 2022 07:07:49.055 * DB saved on disk
874546:C 13 Feb 2022 07:07:49.163 * RDB: 48 MB of memory used by copy-on-write
863380:M 13 Feb 2022 07:07:49.256 * Background saving terminated with success
863380:M 13 Feb 2022 07:08:23.339 # Out Of Memory allocating 16400 bytes!
=== REDIS BUG REPORT START: Cut & paste starting from here ===
863380:M 13 Feb 2022 07:08:23.339 # ------------------------------------------------
863380:M 13 Feb 2022 07:08:23.339 # !!! Software Failure. Press left mouse button to continue
863380:M 13 Feb 2022 07:08:23.339 # Guru Meditation: Redis aborting for OUT OF MEMORY. Allocating 16400 bytes! #server.c:5967
------ STACK TRACE ------
Backtrace:
------ INFO OUTPUT ------
# Server
redis_version:6.2.6
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:557672d61c1e18ba
redis_mode:standalone
os:Linux 5.4.0-91-generic x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:c11-builtin
gcc_version:9.3.0
process_id:863380
process_supervised:systemd
run_id:d6446aef57022c6765878fc5b733dc973193d668
tcp_port:6379
server_time_usec:1644736103307240
uptime_in_seconds:235334
uptime_in_days:2
hz:10
configured_hz:10
lru_clock:568935
executable:/usr/bin/redis-server
config_file:/etc/redis/redis.conf
io_threads_active:0
# Clients
connected_clients:12
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:48
client_recent_max_output_buffer:57765936
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
# Memory
used_memory:849379576
used_memory_human:810.03M
used_memory_rss:888287232
used_memory_rss_human:847.14M
used_memory_peak:957961024
used_memory_peak_human:913.58M
used_memory_peak_perc:88.67%
used_memory_overhead:1125744
used_memory_startup:810016
used_memory_dataset:848253832
used_memory_dataset_perc:99.96%
allocator_allocated:812953864
allocator_active:818737152
allocator_resident:885293056
total_system_memory:67335008256
total_system_memory_human:62.71G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.01
allocator_frag_bytes:5783288
allocator_rss_ratio:1.08
allocator_rss_bytes:66555904
rss_overhead_ratio:1.00
rss_overhead_bytes:2994176
mem_fragmentation_ratio:1.09
mem_fragmentation_bytes:75780576
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:246008
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:24149
rdb_bgsave_in_progress:0
rdb_last_save_time:1644736069
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:4
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:51036160
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
# Stats
total_connections_received:16
total_commands_processed:143569964
instantaneous_ops_per_sec:649
total_net_input_bytes:12521880376
total_net_output_bytes:8634990129194
instantaneous_input_kbps:55.48
instantaneous_output_kbps:13472.71
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:9
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:7201
evicted_keys:0
keyspace_hits:2269668
keyspace_misses:406717
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:52570
total_forks:3656
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:360
dump_payload_sanitizations:0
total_reads_processed:44499240
total_writes_processed:149819145
io_threaded_reads_processed:0
io_threaded_writes_processed:0
# Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:571c29093e08c4f23519105cc5db0bd396cef106
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:3302.327321
used_cpu_user:22073.959274
used_cpu_sys_children:1355.464370
used_cpu_user_children:11653.150833
used_cpu_sys_main_thread:3037.410686
used_cpu_user_main_thread:22052.439782
# Modules
# Commandstats
cmdstat_unwatch:calls=11336287,usec=2801652,usec_per_call=0.25,rejected_calls=0,failed_calls=0
cmdstat_zadd:calls=31912233,usec=318208277,usec_per_call=9.97,rejected_calls=194,failed_calls=0
cmdstat_zcard:calls=136916,usec=46330,usec_per_call=0.34,rejected_calls=3,failed_calls=0
cmdstat_hget:calls=254326,usec=344090,usec_per_call=1.35,rejected_calls=1,failed_calls=0
cmdstat_client:calls=2,usec=3,usec_per_call=1.50,rejected_calls=0,failed_calls=0
cmdstat_ping:calls=3,usec=0,usec_per_call=0.00,rejected_calls=0,failed_calls=0
cmdstat_smembers:calls=7844,usec=12939,usec_per_call=1.65,rejected_calls=0,failed_calls=0
cmdstat_llen:calls=45638,usec=52129,usec_per_call=1.14,rejected_calls=1,failed_calls=0
cmdstat_del:calls=37,usec=53,usec_per_call=1.43,rejected_calls=0,failed_calls=0
cmdstat_zrangebyscore:calls=394,usec=1370,usec_per_call=3.48,rejected_calls=0,failed_calls=0
cmdstat_scan:calls=3168,usec=471799,usec_per_call=148.93,rejected_calls=0,failed_calls=0
cmdstat_setnx:calls=25,usec=98,usec_per_call=3.92,rejected_calls=0,failed_calls=0
cmdstat_exists:calls=104281,usec=219506,usec_per_call=2.10,rejected_calls=5,failed_calls=0
cmdstat_lpush:calls=10603571,usec=39891705,usec_per_call=3.76,rejected_calls=0,failed_calls=0
cmdstat_bgsave:calls=1,usec=35544,usec_per_call=35544.00,rejected_calls=0,failed_calls=0
cmdstat_hincrby:calls=21208536,usec=45546797,usec_per_call=2.15,rejected_calls=0,failed_calls=0
cmdstat_multi:calls=11336284,usec=2930938,usec_per_call=0.26,rejected_calls=0,failed_calls=0
cmdstat_info:calls=3,usec=115,usec_per_call=38.33,rejected_calls=0,failed_calls=0
cmdstat_ttl:calls=2,usec=5,usec_per_call=2.50,rejected_calls=0,failed_calls=0
cmdstat_zincrby:calls=4,usec=33,usec_per_call=8.25,rejected_calls=0,failed_calls=0
cmdstat_hset:calls=10916587,usec=15430418,usec_per_call=1.41,rejected_calls=0,failed_calls=0
cmdstat_zrem:calls=599,usec=520,usec_per_call=0.87,rejected_calls=0,failed_calls=0
cmdstat_zrevrange:calls=1249027,usec=50775591,usec_per_call=40.65,rejected_calls=4,failed_calls=0
cmdstat_zremrangebyscore:calls=11032163,usec=118243706,usec_per_call=10.72,rejected_calls=143,failed_calls=0
cmdstat_expire:calls=10603575,usec=7864793,usec_per_call=0.74,rejected_calls=0,failed_calls=0
cmdstat_hdel:calls=4,usec=3,usec_per_call=0.75,rejected_calls=0,failed_calls=0
cmdstat_hsetnx:calls=566,usec=529,usec_per_call=0.93,rejected_calls=0,failed_calls=0
cmdstat_zrange:calls=249716,usec=14365210893,usec_per_call=57526.19,rejected_calls=1,failed_calls=0
cmdstat_ltrim:calls=10603571,usec=20368767,usec_per_call=1.92,rejected_calls=0,failed_calls=0
cmdstat_watch:calls=4,usec=17,usec_per_call=4.25,rejected_calls=0,failed_calls=0
cmdstat_hgetall:calls=523952,usec=2923632,usec_per_call=5.58,rejected_calls=7,failed_calls=0
cmdstat_config:calls=9,usec=83,usec_per_call=9.22,rejected_calls=0,failed_calls=0
cmdstat_get:calls=2,usec=1,usec_per_call=0.50,rejected_calls=0,failed_calls=0
cmdstat_exec:calls=11336283,usec=20586679680,usec_per_call=1816.00,rejected_calls=0,failed_calls=1
cmdstat_select:calls=15,usec=10,usec_per_call=0.67,rejected_calls=0,failed_calls=0
cmdstat_type:calls=2,usec=2,usec_per_call=1.00,rejected_calls=0,failed_calls=0
cmdstat_zremrangebyrank:calls=50,usec=19,usec_per_call=0.38,rejected_calls=0,failed_calls=0
cmdstat_lrange:calls=104284,usec=5762524756,usec_per_call=55258.00,rejected_calls=0,failed_calls=0
# Errorstats
errorstat_EXECABORT:count=1
errorstat_LOADING:count=359
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=657,expires=0,avg_ttl=0
db1:keys=596,expires=90,avg_ttl=111234375
------ CLIENT LIST OUTPUT ------
id=7 addr=127.0.0.1:25350 laddr=127.0.0.1:6379 fd=13 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=8 addr=127.0.0.1:25354 laddr=127.0.0.1:6379 fd=14 name= age=235334 idle=0 flags=x db=1 sub=0 psub=0 multi=7 qbuf=681 qbuf-free=40273 argv-mem=4 obl=205 oll=1800 omem=36907200 tot-mem=36968676 events=r cmd=exec user=default redir=-1
id=9 addr=127.0.0.1:25356 laddr=127.0.0.1:6379 fd=15 name= age=235334 idle=1 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=10 addr=127.0.0.1:25370 laddr=127.0.0.1:6379 fd=9 name= age=235333 idle=0 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=11 addr=127.0.0.1:25372 laddr=127.0.0.1:6379 fd=16 name= age=235333 idle=1 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=12 addr=127.0.0.1:25374 laddr=127.0.0.1:6379 fd=17 name= age=235333 idle=1 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=13 addr=127.0.0.1:25376 laddr=127.0.0.1:6379 fd=18 name= age=235333 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=17 addr=127.0.0.1:41736 laddr=127.0.0.1:6379 fd=20 name= age=194064 idle=193974 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=bgsave user=default redir=-1
id=14 addr=127.0.0.1:25518 laddr=127.0.0.1:6379 fd=8 name= age=235309 idle=105 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20528 events=r cmd=zrangebyscore user=default redir=-1
id=4 addr=127.0.0.1:25344 laddr=127.0.0.1:6379 fd=10 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20520 events=r cmd=zremrangebyscore user=default redir=-1
id=5 addr=127.0.0.1:25346 laddr=127.0.0.1:6379 fd=11 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=6 addr=127.0.0.1:25348 laddr=127.0.0.1:6379 fd=12 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
------ CURRENT CLIENT INFO ------
id=8 addr=127.0.0.1:25354 laddr=127.0.0.1:6379 fd=14 name= age=235334 idle=0 flags=x db=1 sub=0 psub=0 multi=7 qbuf=681 qbuf-free=40273 argv-mem=4 obl=205 oll=1800 omem=36907200 tot-mem=36968676 events=r cmd=exec user=default redir=-1
argv[0]: 'LRANGE'
argv[1]: 'users:data'
argv[2]: '0'
argv[3]: '1176000'
863380:M 13 Feb 2022 07:08:23.340 # key 'users:data' found in DB containing the following object:
863380:M 13 Feb 2022 07:08:23.340 # Object type: 1
863380:M 13 Feb 2022 07:08:23.340 # Object encoding: 9
863380:M 13 Feb 2022 07:08:23.340 # Object refcount: 1
------ MODULES INFO OUTPUT ------
------ FAST MEMORY TEST ------
863380:M 13 Feb 2022 07:08:23.340 # Redis 6.2.6 crashed by signal: 6, si_code: -6
863380:M 13 Feb 2022 07:08:23.340 # Killed by PID: 863380, UID: 112
863380:M 13 Feb 2022 07:08:23.340 # Crashed running the instruction at: 0x7f791714918b
------ STACK TRACE ------
EIP:
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f791714918b]
Backtrace:
------ REGISTERS ------
863380:M 13 Feb 2022 07:08:23.340 #
RAX:0000000000000000 RBX:00007f7916f67000
RCX:00007f791714918b RDX:0000000000000000
RDI:0000000000000002 RSI:00007ffc69265470
RBP:00007ffc69265770 RSP:00007ffc69265470
R8 :0000000000000000 R9 :00007ffc69265470
R10:0000000000000008 R11:0000000000000246
R12:0000000000001000 R13:ffffffffffffffff
R14:00007f78cc38ce1d R15:00007f78f2e3bec0
RIP:00007f791714918b EFL:0000000000000246
CSGSFS:002b000000000033
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc6926547f) -> 00007f79172667c0
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc6926547e) -> 0000000000000001
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc6926547d) -> 00007f78d5452005
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc6926547c) -> 00007f7916a09c5b
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc6926547b) -> 0000000000000005
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc6926547a) -> 0000008000000010
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc69265479) -> 0000562291a8755f
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc69265478) -> 00007f78d5452000
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc69265477) -> 00007ffc69265510
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc69265476) -> 0000000000000001
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc69265475) -> e8a273d1fd93c500
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc69265474) -> 00000000000002f2
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc69265473) -> 00007ffc69265510
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc69265472) -> 00007ffc6926577e
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc69265471) -> e8a273d1fd93c500
863380:M 13 Feb 2022 07:08:23.340 # (00007ffc69265470) -> 0000000000000000
------ INFO OUTPUT ------
# Server
redis_version:6.2.6
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:557672d61c1e18ba
redis_mode:standalone
os:Linux 5.4.0-91-generic x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:c11-builtin
gcc_version:9.3.0
process_id:863380
process_supervised:systemd
run_id:d6446aef57022c6765878fc5b733dc973193d668
tcp_port:6379
server_time_usec:1644736103307240
uptime_in_seconds:235334
uptime_in_days:2
hz:10
configured_hz:10
lru_clock:568935
executable:/usr/bin/redis-server
config_file:/etc/redis/redis.conf
io_threads_active:0
# Clients
connected_clients:12
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:48
client_recent_max_output_buffer:57765936
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
# Memory
used_memory:849379792
used_memory_human:810.03M
used_memory_rss:888287232
used_memory_rss_human:847.14M
used_memory_peak:957961024
used_memory_peak_human:913.58M
used_memory_peak_perc:88.67%
used_memory_overhead:1125744
used_memory_startup:810016
used_memory_dataset:848254048
used_memory_dataset_perc:99.96%
allocator_allocated:812953864
allocator_active:818737152
allocator_resident:885293056
total_system_memory:67335008256
total_system_memory_human:62.71G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.01
allocator_frag_bytes:5783288
allocator_rss_ratio:1.08
allocator_rss_bytes:66555904
rss_overhead_ratio:1.00
rss_overhead_bytes:2994176
mem_fragmentation_ratio:1.09
mem_fragmentation_bytes:75780576
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:246008
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:24149
rdb_bgsave_in_progress:0
rdb_last_save_time:1644736069
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:4
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:51036160
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
# Stats
total_connections_received:16
total_commands_processed:143569964
instantaneous_ops_per_sec:649
total_net_input_bytes:12521880376
total_net_output_bytes:8634990129194
instantaneous_input_kbps:55.48
instantaneous_output_kbps:13472.71
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:9
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:7201
evicted_keys:0
keyspace_hits:2269668
keyspace_misses:406717
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:52570
total_forks:3656
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:360
dump_payload_sanitizations:0
total_reads_processed:44499240
total_writes_processed:149819145
io_threaded_reads_processed:0
io_threaded_writes_processed:0
# Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:571c29093e08c4f23519105cc5db0bd396cef106
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:3302.327353
used_cpu_user:22073.959488
used_cpu_sys_children:1355.464370
used_cpu_user_children:11653.150833
used_cpu_sys_main_thread:3037.410716
used_cpu_user_main_thread:22052.439998
# Modules
# Commandstats
cmdstat_unwatch:calls=11336287,usec=2801652,usec_per_call=0.25,rejected_calls=0,failed_calls=0
cmdstat_zadd:calls=31912233,usec=318208277,usec_per_call=9.97,rejected_calls=194,failed_calls=0
cmdstat_zcard:calls=136916,usec=46330,usec_per_call=0.34,rejected_calls=3,failed_calls=0
cmdstat_hget:calls=254326,usec=344090,usec_per_call=1.35,rejected_calls=1,failed_calls=0
cmdstat_client:calls=2,usec=3,usec_per_call=1.50,rejected_calls=0,failed_calls=0
cmdstat_ping:calls=3,usec=0,usec_per_call=0.00,rejected_calls=0,failed_calls=0
cmdstat_smembers:calls=7844,usec=12939,usec_per_call=1.65,rejected_calls=0,failed_calls=0
cmdstat_llen:calls=45638,usec=52129,usec_per_call=1.14,rejected_calls=1,failed_calls=0
cmdstat_del:calls=37,usec=53,usec_per_call=1.43,rejected_calls=0,failed_calls=0
cmdstat_zrangebyscore:calls=394,usec=1370,usec_per_call=3.48,rejected_calls=0,failed_calls=0
cmdstat_scan:calls=3168,usec=471799,usec_per_call=148.93,rejected_calls=0,failed_calls=0
cmdstat_setnx:calls=25,usec=98,usec_per_call=3.92,rejected_calls=0,failed_calls=0
cmdstat_exists:calls=104281,usec=219506,usec_per_call=2.10,rejected_calls=5,failed_calls=0
cmdstat_lpush:calls=10603571,usec=39891705,usec_per_call=3.76,rejected_calls=0,failed_calls=0
cmdstat_bgsave:calls=1,usec=35544,usec_per_call=35544.00,rejected_calls=0,failed_calls=0
cmdstat_hincrby:calls=21208536,usec=45546797,usec_per_call=2.15,rejected_calls=0,failed_calls=0
cmdstat_multi:calls=11336284,usec=2930938,usec_per_call=0.26,rejected_calls=0,failed_calls=0
cmdstat_info:calls=3,usec=115,usec_per_call=38.33,rejected_calls=0,failed_calls=0
cmdstat_ttl:calls=2,usec=5,usec_per_call=2.50,rejected_calls=0,failed_calls=0
cmdstat_zincrby:calls=4,usec=33,usec_per_call=8.25,rejected_calls=0,failed_calls=0
cmdstat_hset:calls=10916587,usec=15430418,usec_per_call=1.41,rejected_calls=0,failed_calls=0
cmdstat_zrem:calls=599,usec=520,usec_per_call=0.87,rejected_calls=0,failed_calls=0
cmdstat_zrevrange:calls=1249027,usec=50775591,usec_per_call=40.65,rejected_calls=4,failed_calls=0
cmdstat_zremrangebyscore:calls=11032163,usec=118243706,usec_per_call=10.72,rejected_calls=143,failed_calls=0
cmdstat_expire:calls=10603575,usec=7864793,usec_per_call=0.74,rejected_calls=0,failed_calls=0
cmdstat_hdel:calls=4,usec=3,usec_per_call=0.75,rejected_calls=0,failed_calls=0
cmdstat_hsetnx:calls=566,usec=529,usec_per_call=0.93,rejected_calls=0,failed_calls=0
cmdstat_zrange:calls=249716,usec=14365210893,usec_per_call=57526.19,rejected_calls=1,failed_calls=0
cmdstat_ltrim:calls=10603571,usec=20368767,usec_per_call=1.92,rejected_calls=0,failed_calls=0
cmdstat_watch:calls=4,usec=17,usec_per_call=4.25,rejected_calls=0,failed_calls=0
cmdstat_hgetall:calls=523952,usec=2923632,usec_per_call=5.58,rejected_calls=7,failed_calls=0
cmdstat_config:calls=9,usec=83,usec_per_call=9.22,rejected_calls=0,failed_calls=0
cmdstat_get:calls=2,usec=1,usec_per_call=0.50,rejected_calls=0,failed_calls=0
cmdstat_exec:calls=11336283,usec=20586679680,usec_per_call=1816.00,rejected_calls=0,failed_calls=1
cmdstat_select:calls=15,usec=10,usec_per_call=0.67,rejected_calls=0,failed_calls=0
cmdstat_type:calls=2,usec=2,usec_per_call=1.00,rejected_calls=0,failed_calls=0
cmdstat_zremrangebyrank:calls=50,usec=19,usec_per_call=0.38,rejected_calls=0,failed_calls=0
cmdstat_lrange:calls=104284,usec=5762524756,usec_per_call=55258.00,rejected_calls=0,failed_calls=0
# Errorstats
errorstat_EXECABORT:count=1
errorstat_LOADING:count=359
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=657,expires=0,avg_ttl=0
db1:keys=596,expires=90,avg_ttl=111234375
------ CLIENT LIST OUTPUT ------
id=7 addr=127.0.0.1:25350 laddr=127.0.0.1:6379 fd=13 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=8 addr=127.0.0.1:25354 laddr=127.0.0.1:6379 fd=14 name= age=235334 idle=0 flags=x db=1 sub=0 psub=0 multi=7 qbuf=681 qbuf-free=40273 argv-mem=4 obl=205 oll=1800 omem=36907200 tot-mem=36968676 events=r cmd=exec user=default redir=-1
id=9 addr=127.0.0.1:25356 laddr=127.0.0.1:6379 fd=15 name= age=235334 idle=1 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=10 addr=127.0.0.1:25370 laddr=127.0.0.1:6379 fd=9 name= age=235333 idle=0 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=11 addr=127.0.0.1:25372 laddr=127.0.0.1:6379 fd=16 name= age=235333 idle=1 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=12 addr=127.0.0.1:25374 laddr=127.0.0.1:6379 fd=17 name= age=235333 idle=1 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=13 addr=127.0.0.1:25376 laddr=127.0.0.1:6379 fd=18 name= age=235333 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=17 addr=127.0.0.1:41736 laddr=127.0.0.1:6379 fd=20 name= age=194064 idle=193974 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=bgsave user=default redir=-1
id=14 addr=127.0.0.1:25518 laddr=127.0.0.1:6379 fd=8 name= age=235309 idle=105 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20528 events=r cmd=zrangebyscore user=default redir=-1
id=4 addr=127.0.0.1:25344 laddr=127.0.0.1:6379 fd=10 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20520 events=r cmd=zremrangebyscore user=default redir=-1
id=5 addr=127.0.0.1:25346 laddr=127.0.0.1:6379 fd=11 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
id=6 addr=127.0.0.1:25348 laddr=127.0.0.1:6379 fd=12 name= age=235334 idle=12 flags=N db=1 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=unwatch user=default redir=-1
------ CURRENT CLIENT INFO ------
id=8 addr=127.0.0.1:25354 laddr=127.0.0.1:6379 fd=14 name= age=235334 idle=0 flags=x db=1 sub=0 psub=0 multi=7 qbuf=681 qbuf-free=40273 argv-mem=4 obl=205 oll=1800 omem=36907200 tot-mem=36968676 events=r cmd=exec user=default redir=-1
argv[0]: 'LRANGE'
argv[1]: 'users:data'
argv[2]: '0'
argv[3]: '1176000'
863380:M 13 Feb 2022 07:08:23.340 # key 'users:data' found in DB containing the following object:
863380:M 13 Feb 2022 07:08:23.340 # Object type: 1
863380:M 13 Feb 2022 07:08:23.340 # Object encoding: 9
863380:M 13 Feb 2022 07:08:23.340 # Object refcount: 1
------ MODULES INFO OUTPUT ------
------ FAST MEMORY TEST ------
874559:C 13 Feb 2022 07:08:23.684 * Supervised by systemd. Please make sure you set appropriate values for TimeoutStartSec and TimeoutStopSec in your service unit.
Comment From: AlexSpain
@yossigo
Can you share the systemd unit file used to start Redis? Is it possible that you specify some systemd-level memory limit?
[Unit]
Description=Advanced key-value store
After=network.target
Documentation=http://redis.io/documentation, man:redis-server(1)
[Service]
Type=notify
ExecStart=/usr/bin/redis-server /etc/redis/redis.conf
ExecStop=/bin/kill -s TERM $MAINPID
PIDFile=/run/redis/redis-server.pid
TimeoutStopSec=0
Restart=always
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=2755
UMask=007
PrivateTmp=yes
LimitNOFILE=65535
PrivateDevices=yes
ProtectHome=yes
ReadOnlyDirectories=/
ReadWriteDirectories=-/var/lib/redis
ReadWriteDirectories=-/var/log/redis
ReadWriteDirectories=-/run/redis
NoNewPrivileges=true
CapabilityBoundingSet=CAP_SETGID CAP_SETUID CAP_SYS_RESOURCE
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
MemoryDenyWriteExecute=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectControlGroups=true
RestrictRealtime=true
RestrictNamespaces=true
# redis-server can write to its own config file when in cluster mode so we
# permit writing there by default. If you are not using this feature, it is
# recommended that you replace the following lines with "ProtectSystem=full".
ProtectSystem=true
ReadWriteDirectories=-/etc/redis
[Install]
WantedBy=multi-user.target
Alias=redis.service`
Comment From: sundb
The last crash was caused by lrange users:data 0 1176000, what is the size of this key?
It is odd, why the average request time of lrange is 55ms.
Is it possible that the OOM is caused by a corrupted ziplist? ~~used_memory is not updated during the execution of commands.~~
cmdstat_lrange:calls=104284,usec=5762524756,usec_per_call=55258.00,rejected_calls=0,failed_calls=0
@AlexSpain Can you change config sanitize-dump-payload to yes and restart redis again?
Comment From: oranagra
@sundb what do you mean by
used_memoryis not updated during the execution of commands.
The crash report should be up to date with the latest allocations.
Looking at the recent crash reports, they don't contain stack trace, which is very odd.
Comment From: sundb
@oranagra Silly me, take back what I said.
Comment From: AlexSpain
The last crash was caused by
lrange users:data 0 1176000, what is the size of this key? It is odd, why the average request time oflrangeis 55ms.
DEBUG OBJECT users:data Value at:0x7f9292a490a0 refcount:1 encoding:quicklist serializedlength:15001524 lru:748983 lru_seconds_idle:0 ql_nodes:6358 ql_avg_node:184.96 ql_ziplist_max:-2 ql_compressed:0 ql_uncompressed_size:51813982
MEMORY USAGE users:data 42627942
Comment From: sundb
@AlexSpain I wonder why the encoding of users.data is not quicklist, but skiplist, which is shown as list in the crash log.
Comment From: AlexSpain
@AlexSpain I wonder why the encoding of
users.datais notquicklist, butskiplist, which is shown aslistin the crash log.
Sorry, I copy stats from my home redis what is not production one, I edit post with info from production server
Comment From: oranagra
to clarify, it seems the key was a list type and is now a sorted set
Comment From: yoav-steinberg
Some things to note:
The crash always seems to happen while executing an EXEC. Also I noticed there are always 7 commands in the multi-exec transaction. Also the crash always happens when there are 1800 elements in the reply buffer list and we're trying to allocate another one.
@AlexSpain Can you try to pinpoint where in your app you have a multi-exec transaction with 7 commands in it? What are the other commands other than lrange?
Comment From: AlexSpain
tx.LPush(r.formatKey("user:data"), login) tx.LTrim(r.formatKey("user:data"), 0, r.sized) tx.HIncrBy(r.formatKey("info", "rCurrent"), login, dinf) tx.ZAdd(r.formatKey("data2"), redis.Z{Score: float64(ts), Member: join(dinf, login, id, ms, dinf, hostname)}) tx.ZAdd(r.formatKey("data2", login), redis.Z{Score: float64(ts), Member: join(dinf, id, ms, dinf, hostname)}) tx.Expire(r.formatKey("data2", login), expire) tx.HSet(r.formatKey("peoples", login), "lastTime", strconv.FormatInt(ts, 10))
Same code work at Ubuntu 16.04 (2 years without any restart) with old redis (redis_version:5.0.7) without any crash.
Comment From: yoav-steinberg
Hmm.. that's strange, no transaction with access to the user:data key and/or using the lrange or zrange commands? Because it seems that those were being used during a 7 command transaction during the crash.
Comment From: AlexSpain
@yoav-steinberg first two commands tx.LPush(r.formatKey("user:data"), login) tx.LTrim(r.formatKey("user:data"), 0, r.sized)
Comment From: yoav-steinberg
@oranagra Looking at this, from the crash log it seems like we're inside a transaction (exec) but the argv array shows an lrange command. From what I see in the code this debug print should have printed the argv of the exec command.
I suspect there's some corruption here.
Also looking at the output size omem it seems like we weren't processing the transaction but actually the lrange command. @AlexSpain provided the transaction code above and there's nothing there that I'd expect to produce a ~37MB reply, while lrange for over a million items can easily produce this.
------ CURRENT CLIENT INFO ------
id=8 addr=127.0.0.1:25354 laddr=127.0.0.1:6379 fd=14 name= age=235334 idle=0 flags=x db=1 sub=0 psub=0 multi=7 qbuf=681 qbuf-free=40273 argv-mem=4 obl=205 oll=1800 omem=36907200 tot-mem=36968676 events=r cmd=exec user=default redir=-1
argv[0]: 'LRANGE'
argv[1]: 'users:data'
argv[2]: '0'
argv[3]: '1176000'
863380:M 13 Feb 2022 07:08:23.340 # key 'users:data' found in DB containing the following object:
863380:M 13 Feb 2022 07:08:23.340 # Object type: 1
863380:M 13 Feb 2022 07:08:23.340 # Object encoding: 9
863380:M 13 Feb 2022 07:08:23.340 # Object refcount: 1
Comment From: AlexSpain
@yoav-steinberg what I need to do for stop redis to crash? It crash today again, same error same log
Comment From: oranagra
@AlexSpain please try to enable sanitize-dump-payload and restart as suggested here.
if there's some corruption of the data that could lead to corruption of memory, maybe this can smoke it out...
Comment From: yoav-steinberg
@AlexSpain Are you sure there's no call to lrange on user:data inside some transaction? Can you take a look and if you find it paste the transaction code here like you did before?
Also how did you build your redis 5.0.7 and 6.2.6? Are you using your own build or Ubuntu's deb? Are you using a deb from the distro's standard repository or some other ppa? If you use a custom build can you sent your build command line?
Comment From: oranagra
I see that the Redis 5.0 crash you posted was using jemalloc-5.2.1 (latest jemalloc, that's not part of redis 5.0). and the Redis 6.2 crash you posted was using jemalloc-5.1.0 (which was indeed bundled in redis 6.2). the only explanation i can think of here, is that for some reason the allocator returned NULL even though the system has a lot of memory. this could maybe happen when the allocator's memory got corrupted.
Comment From: yossigo
@oranagra I assume this is because Redis 5.0.7 comes bundled with Ubuntu, which builds Redis to use the system jemalloc (following official Debian policy), and Redis 6.2.6 was built locally.
Comment From: AlexSpain
@yoav-steinberg
@AlexSpain Are you sure there's no call to
lrangeonuser:datainside some transaction? Can you take a look and if you find it paste the transaction code here like you did before?this code run around 7000 times per minute
tx.LPush(r.formatKey("user:data"), login)
tx.LTrim(r.formatKey("user:data"), 0, r.sized)
tx.HIncrBy(r.formatKey("info", "rCurrent"), login, dinf)
tx.ZAdd(r.formatKey("data2"), redis.Z{Score: float64(ts), Member: join(dinf, login, id, ms, dinf, hostname)})
tx.ZAdd(r.formatKey("data2", login), redis.Z{Score: float64(ts), Member: join(dinf, id, ms, dinf, hostname)})
tx.Expire(r.formatKey("data2", login), expire)
tx.HSet(r.formatKey("peoples", login), "lastTime", strconv.FormatInt(ts, 10))
code run every 5 seconds
tx := r.client.Multi()
defer tx.Close()
cmds, err := tx.Exec(func() error {
tx.HGetAllMap(r.formatKey("peoples", login))
tx.ZRevRangeWithScores(r.formatKey("payments", login), 0, maxPayments-1)
tx.HGet(r.formatKey("paymentsTotal"), login)
tx.HGet(r.formatKey("info", "currentS"), login)
tx.LRange(r.formatKey("user:data"), 0, r.sized)
tx.ZRevRangeWithScores(r.formatKey("rewards", login), 0, 39)
tx.ZRevRangeWithScores(r.formatKey("rewards", login), 0, -1)
})
code run 2-5 times a day
ms := util.MakeTimestamp()
cmds, err := tx.Exec(func() error {
r.writeS(tx, ms, ts, login, id, dinf, window, hostname)
tx.HSet(r.formatKey("stats"), "lastBF", strconv.FormatInt(ts, 10))
tx.HDel(r.formatKey("stats"), "roundS")
tx.ZIncrBy(r.formatKey("finders"), 1, login)
tx.HIncrBy(r.formatKey("peoples", login), "bFound", 1)
tx.HGetAllMap(r.formatKey("info", "rCurrent"))
tx.Del(r.formatKey("info", "rCurrent"))
tx.LRange(r.formatKey("user:data"), 0, r.sized)
return nil
})
Also how did you build your redis 5.0.7 and 6.2.6? Are you using your own build or Ubuntu's deb? Are you using a deb from the distro's standard repository or some other ppa? If you use a custom build can you sent your build command line? Is how I install Redis 6.2.6 to Ubuntu 20.04 what have before default Redis 5.0.7 comes bundled with Ubuntu
$ curl https://packages.redis.io/gpg | sudo apt-key add -
$ echo "deb https://packages.redis.io/deb $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/redis.list
$ sudo apt-get update
$ sudo apt-get install redis
Comment From: AlexSpain
Is can be what it crash because tx.LTrim(r.formatKey("user:data"), 0, r.sized) and tx.LRange(r.formatKey("user:data"), 0, r.sized)
run same time?
Comment From: oranagra
they're not running at the same time, just as part of the same transaction, which is very similar to just running them in a pipeline one after the other. i don't see how this could cause any issue.
Comment From: AlexSpain
they're not running at the same time, just as part of the same transaction, which is very similar to just running them in a pipeline one after the other. i don't see how this could cause any issue.
@oranagra they run in parallel in two different applications with same Redis DB app one run around 7000 times per minute tx.LTrim(r.formatKey("user:data"), 0, r.sized)
app two run every 5 seconds tx.LRange(r.formatKey("user:data"), 0, r.sized)
Comment From: oranagra
ok, but Redis is single threaded, and processes a single command (or transaction) at a time. so whichever comes first will be served first, not at the same time. anyway, that's not the the trigger to this crash.
since i'm out of ideas, if this is non-production, or you have a way to reproduce this in a staging environment, maybe you can give redis 7.0 RC1 a try (maybe we'll be lucky and in some way it'll be manifested differently there and we'll be able to figure it out)
Comment From: AlexSpain
I make some research I change timing to run scripts to make less call to redis per min this code run around 5000 times per minute
tx.LPush(r.formatKey("user:data"), login) tx.LTrim(r.formatKey("user:data"), 0, r.sized) tx.HIncrBy(r.formatKey("info", "rCurrent"), login, dinf) tx.ZAdd(r.formatKey("data2"), redis.Z{Score: float64(ts), Member: join(dinf, login, id, ms, dinf, hostname)}) tx.ZAdd(r.formatKey("data2", login), redis.Z{Score: float64(ts), Member: join(dinf, id, ms, dinf, hostname)}) tx.Expire(r.formatKey("data2", login), expire) tx.HSet(r.formatKey("peoples", login), "lastTime", strconv.FormatInt(ts, 10))
redis stop to crash every 24 hours, now it crash every 2 days
Comment From: AlexSpain
sanitize-dump-payload
26 at 17:00 I change sanitize-dump-payload to yes and restart redis . 24 hours without crash. Waiting more days
Comment From: oranagra
@AlexSpain please follow the discussion in https://github.com/redis/redis/issues/10319, this could be due to vm.max_map_count, please try to increase it and see if the problem is gone.
Comment From: AlexSpain
@AlexSpain please follow the discussion in #10319, this could be due to
vm.max_map_count, please try to increase it and see if the problem is gone.
cat /proc/sys/vm/max_map_count 65530
What number I can write? sysctl -w vm.max_map_count=262144 is ok to test?
Comment From: oranagra
i suppose so... don't have any experience with what.
Comment From: AlexSpain
After I apply sysctl -w vm.max_map_count=262144 and restart redis, redis server not crash 5 days.
Comment From: yossigo
Interesting! So assuming this is really the issue, I think we still don't have an answer to why this has started to happen all of a sudden and how come it does not affect many systems. I'd expect this to either be directly affected by jemalloc, or by anything else that loads into the process (other shared libs, maybe LD_PRELOAD, etc.).
Comment From: kletord
I encounter this error on a Redis 5.0.14 with jemalloc 5.1.0. Redis running out of memory due to a virtual memory map exhaustion.
I found the root cause to be https://github.com/jemalloc/jemalloc/issues/1328, after changing the vm.overcommit to 0 and restart the redis process, everything was stable again.
Comment From: oranagra
very interesting... but Redis cannot usually afford to run without overcommit (forks will fail if you're close to your system resources). i wonder what causes some (relatively small) deployments to hit this problem, while others (bigger ones) don't.
Comment From: oranagra
ohh, i now realize the problem with jemalloc causing fragmented mapping is when overcommit is disabled (not when enabled).
@kletord you said that setting it to 0 (heuristic) solved it, can you confirm it previously set to 2 (disabled, which isn't really suitable for redis)?
can everyone else who reported this problem please report the value of /proc/sys/vm/overcommit_memory and try to set it to 1 to solve the problem.