Question: redis server(on kubernetes) found 300ms+ slowlog(get command),two case as follow: what is wrong with redis, latency generated by expires?
redis command(slow log):
get xxxx:xxxx:xxx
case1 on slave redis server:
# Server
redis_version:3.2.13
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:2c98985416827093
redis_mode:standalone
os:Linux 4.19.84-300.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:45
run_id:db7397542e9079c1725b155e4e2783bcd4853ff3
tcp_port:52020
uptime_in_seconds:12022710
uptime_in_days:139
hz:10
lru_clock:7687748
executable:/usr/bin/redis-server
config_file:xxxxxxxxx
# Clients
connected_clients:1270
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
# Memory
used_memory:811357704
used_memory_human:773.77M
used_memory_rss:984678400
used_memory_rss_human:939.06M
used_memory_peak:1625789480
used_memory_peak_human:1.51G
total_system_memory:269302108160
total_system_memory_human:250.81G
used_memory_lua:37888
used_memory_lua_human:37.00K
maxmemory:0
maxmemory_human:0B
maxmemory_policy:volatile-lru
mem_fragmentation_ratio:1.21
mem_allocator:jemalloc-3.6.0
# Persistence
loading:0
rdb_changes_since_last_save:196588914
rdb_bgsave_in_progress:0
rdb_last_save_time:1611645801
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:16
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
# Stats
total_connections_received:125370079
total_commands_processed:21356960912
instantaneous_ops_per_sec:3097
total_net_input_bytes:1072009167579
total_net_output_bytes:4161366391467
instantaneous_input_kbps:150.53
instantaneous_output_kbps:586.93
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
evicted_keys:0
keyspace_hits:20798666869
keyspace_misses:10557638
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:36028
migrate_cached_sockets:0
# Replication
role:slave
master_host:xxxxxxx
master_port:52020
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:69070038018
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:104857600
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:212543.03
used_cpu_user:86169.91
used_cpu_sys_children:862.01
used_cpu_user_children:9076.89
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=2290796,expires=2290796,avg_ttl=0
watch dog backtrace:
45:signal-handler (1618283114)
--- WATCHDOG TIMER EXPIRED ---
EIP:
/usr/bin/redis-server 0.0.0.0:52020(dictFind+0x8a)[0x5564af67477a]
Backtrace:
/usr/bin/redis-server 0.0.0.0:52020(logStackTrace+0x41)[0x5564af6b37b1]
/usr/bin/redis-server 0.0.0.0:52020(watchdogSignalHandler+0x1d)[0x5564af6b385d]
/lib64/libpthread.so.0(+0xf370)[0x7f8cc1388370]
/usr/bin/redis-server 0.0.0.0:52020(dictFind+0x8a)[0x5564af67477a]
/usr/bin/redis-server 0.0.0.0:52020(getExpire+0x24)[0x5564af68c324]
/usr/bin/redis-server 0.0.0.0:52020(expireIfNeeded+0x15)[0x5564af68cc35]
/usr/bin/redis-server 0.0.0.0:52020(lookupKeyReadWithFlags+0x18)[0x5564af68cd18]
/usr/bin/redis-server 0.0.0.0:52020(lookupKeyReadOrReply+0x15)[0x5564af68cdc5]
/usr/bin/redis-server 0.0.0.0:52020(getGenericCommand+0x21)[0x5564af697cc1]
/usr/bin/redis-server 0.0.0.0:52020(call+0x71)[0x5564af6785e1]
/usr/bin/redis-server 0.0.0.0:52020(processCommand+0x3a8)[0x5564af67b978]
/usr/bin/redis-server 0.0.0.0:52020(processInputBuffer+0x12d)[0x5564af688e0d]
/usr/bin/redis-server 0.0.0.0:52020(aeProcessEvents+0x2af)[0x5564af6723cf]
/usr/bin/redis-server 0.0.0.0:52020(aeMain+0x2b)[0x5564af6726ab]
/usr/bin/redis-server 0.0.0.0:52020(main+0x355)[0x5564af66f375]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f8cc0fd9b35]
/usr/bin/redis-server 0.0.0.0:52020(+0x20623)[0x5564af66f623]
45:signal-handler (1618283114) --------
case2 on master redis server:
# Server
redis_version:3.2.13
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:2c98985416827093
redis_mode:standalone
os:Linux 4.19.84-300.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:42
run_id:ccfc94fc9fd04b2b4d4e1d93c48ca767b85c4647
tcp_port:52075
uptime_in_seconds:6668837
uptime_in_days:77
hz:10
lru_clock:7687695
executable:/usr/bin/redis-server
config_file:xxxxxx
# Clients
connected_clients:1268
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
# Memory
used_memory:869575072
used_memory_human:829.29M
used_memory_rss:1029312512
used_memory_rss_human:981.63M
used_memory_peak:1016891472
used_memory_peak_human:969.78M
total_system_memory:269302210560
total_system_memory_human:250.81G
used_memory_lua:37888
used_memory_lua_human:37.00K
maxmemory:0
maxmemory_human:0B
maxmemory_policy:volatile-lru
mem_fragmentation_ratio:1.18
mem_allocator:jemalloc-3.6.0
# Persistence
loading:0
rdb_changes_since_last_save:138219831
rdb_bgsave_in_progress:0
rdb_last_save_time:1611646051
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
# Stats
total_connections_received:84082013
total_commands_processed:16835795550
instantaneous_ops_per_sec:3111
total_net_input_bytes:831095723763
total_net_output_bytes:3331395752103
instantaneous_input_kbps:150.26
instantaneous_output_kbps:600.94
rejected_connections:0
sync_full:1
sync_partial_ok:0
sync_partial_err:0
expired_keys:20338756
evicted_keys:0
keyspace_hits:16476615815
keyspace_misses:8375157
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:335
migrate_cached_sockets:0
# Replication
role:master
connected_slaves:1
slave0:ip=xxxxxx,port=52075,state=online,offset=37421583744,lag=1
master_repl_offset:37421588800
repl_backlog_active:1
repl_backlog_size:104857600
repl_backlog_first_byte_offset:37316731201
repl_backlog_histlen:104857600
# CPU
used_cpu_sys:192021.72
used_cpu_user:85787.21
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=2135808,expires=2135808,avg_ttl=326600160
watch dog backtrace :
42:signal-handler (1618285704)
--- WATCHDOG TIMER EXPIRED ---
EIP:
/usr/bin/redis-server 0.0.0.0:52075(lookupKey+0x41)[0x55f534090d91]
Backtrace:
/usr/bin/redis-server 0.0.0.0:52075(logStackTrace+0x41)[0x55f5340b87b1]
/usr/bin/redis-server 0.0.0.0:52075(watchdogSignalHandler+0x1d)[0x55f5340b885d]
/lib64/libpthread.so.0(+0xf370)[0x7f7b7f3af370]
/usr/bin/redis-server 0.0.0.0:52075(lookupKey+0x41)[0x55f534090d91]
/usr/bin/redis-server 0.0.0.0:52075(lookupKeyReadWithFlags+0x32)[0x55f534091d32]
/usr/bin/redis-server 0.0.0.0:52075(lookupKeyReadOrReply+0x15)[0x55f534091dc5]
/usr/bin/redis-server 0.0.0.0:52075(getGenericCommand+0x21)[0x55f53409ccc1]
/usr/bin/redis-server 0.0.0.0:52075(call+0x71)[0x55f53407d5e1]
/usr/bin/redis-server 0.0.0.0:52075(processCommand+0x3a8)[0x55f534080978]
/usr/bin/redis-server 0.0.0.0:52075(processInputBuffer+0x12d)[0x55f53408de0d]
/usr/bin/redis-server 0.0.0.0:52075(aeProcessEvents+0x2af)[0x55f5340773cf]
/usr/bin/redis-server 0.0.0.0:52075(aeMain+0x2b)[0x55f5340776ab]
/usr/bin/redis-server 0.0.0.0:52075(main+0x355)[0x55f534074375]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f7b7f000b35]
/usr/bin/redis-server 0.0.0.0:52075(+0x20623)[0x55f534074623]
42:signal-handler (1618285704) --------
Comment From: seven-wjx
my situation is similar to this issue#5217#4851
Comment From: seven-wjx
the machine has more than 48 cores, it might indicate that it hasNUMA
cat /proc/sys/kernel/numa_balancing
1
perf redis dictfind function:
perf redis lookupkey. and addReplyBulkLen function:
Comment From: seven-wjx
yeah,diable numa_balancing fixed this problem
Comment From: saiisgod
Is it really the true reason? I am wondering that numa miss can get such a great latency increase. could you affirm this again. thks