I want to stress test a Redis cluster (3 masters, 3 slaves) with redis-benchmark. At the beginning of the test, I started one redis-benchmark and increased the number of parallel connections. I noticed that QPS hit a ceiling while the CPU usage was only about 25%. Then I increased the number of redis-benchmark instances and started them at the same time.
I use redis_exporter and display the metrics in Grafana.
With 3 redis-benchmark instances, the CPU usage was about 50%.
With 6 redis-benchmark instances, the CPU usage was about 70%.
But with 9 redis-benchmark instances, the CPU usage didn't increase much.
rate(redis_cpu_user_main_thread_seconds_total{instance=~"$instance"})
I'm sure that network I/O is not the bottleneck: the bandwidth of the instance is 8GBps, and the network traffic is far below that.
I want to see the CPU usage reach 100%. Should I continue to increase the number of redis-benchmark instances?
Is there a good way to stress test a Redis cluster with redis-benchmark?
Redis version - 7.0.7
Image - Canonical-Ubuntu-22.04-2022.11.06-0
Redis cluster*6 - 4 vCPU - 16GiB memory
Redis-benchmark*9 - 4 vCPU - 16GiB memory
Redis and redis-benchmark are not on the same instance
Comment From: nedataghizadeh79
It's possible that your Redis cluster has reached its maximum capacity and is unable to handle more load, which is why increasing the number of redis-benchmark instances doesn't result in a corresponding increase in CPU utilization.
Before increasing the number of redis-benchmark instances, you might consider other factors that could be limiting your cluster's performance. For example:
- Memory usage: Make sure that you have enough available memory to handle the increased load.
- Disk I/O: Ensure that your disk I/O isn't becoming a bottleneck.
- Network I/O: Monitor your network I/O to make sure that it's not limiting your performance.
- Configuration: Ensure that your Redis cluster is properly configured for your workload.
You might also want to consider using alternative stress testing tools, such as Apache JMeter or Gatling, which can help you generate more complex workloads and better simulate real-world scenarios.
In general, the best way to execute a stress test on a Redis cluster is to start with a realistic workload that represents your typical usage patterns and gradually increase the load until you start to see performance degradation. This will help you identify the limits of your cluster and fine-tune your configuration for maximum performance.
Comment From: zuiderkwast
redis-benchmark can sometimes be the bottleneck itself in benchmarking.
To make it more efficient, I recommend using the options
- `-P 10`: Pipeline 10 commands at a time. Without pipelining, redis-benchmark waits for a reply to each command before it sends the next command.
- `--threads 4`: Use all of the CPUs you have available.
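The two options above can be combined into one invocation. The following is a minimal sketch, not a tuned benchmark: the helper name `build_benchmark_cmd`, the address `10.0.0.1:6379`, and the client and request counts are placeholders chosen for illustration. It only builds and prints the command line, so it can be reviewed before being run on each benchmark machine.

```python
import shlex


def build_benchmark_cmd(host, port, pipeline=10, threads=4,
                        clients=50, requests=1_000_000):
    """Return the argv list for one redis-benchmark process.

    All default values are illustrative assumptions, not recommendations.
    """
    return [
        "redis-benchmark",
        "-h", host,                 # any node of the cluster
        "-p", str(port),
        "--cluster",                # enable cluster mode
        "-c", str(clients),         # parallel connections
        "-n", str(requests),        # total number of requests
        "-P", str(pipeline),        # commands sent per pipeline batch
        "--threads", str(threads),  # multi-threaded client
        "-t", "set,get",            # only run the SET and GET tests
        "-q",                       # quiet: print only the requests/sec summary
    ]


# Placeholder address; replace it with one of your cluster nodes.
cmd = build_benchmark_cmd("10.0.0.1", 6379)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would start the benchmark. Setting `threads` to the number of vCPUs on the benchmark machine (4 in this setup) matches the advice above.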
Comment From: niheartent
Thank you all. I found that the CPU rate shown in Prometheus is only the user-mode rate. In fact, the CPU running the Redis main thread has reached 100%.
On CPU3, user mode + system mode = 100%.
Is my conclusion correct?
Comment From: ranshid
@niheartent it does seem like your CPUs (0, 1, 3) are operating at 100% CPU, with about a 30%-70% ratio of system calls to engine processing. I can see that the server is at almost 300% CPU. Are you using io-threads?
Comment From: niheartent
@ranshid
Yes, I tried running three threads.
By the way, I modified the PromQL as follows:
rate(redis_cpu_user_main_thread_seconds_total{instance=~"$instance"})+rate(redis_cpu_sys_main_thread_seconds_total{instance=~"$instance"})
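The arithmetic behind this query can be sketched in a few lines. Both metrics are cumulative CPU-second counters, so the per-second increase of each one is the fraction of a core spent in that mode, and their sum is the total main-thread utilization. In PromQL, `rate()` is applied over a range window such as `[1m]`; the sketch below uses a 60-second window for the same reason. The sample values and function names are made up for illustration, and counter resets are ignored.

```python
def counter_rate(first, last, elapsed_seconds):
    """Per-second increase of a monotonically increasing counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (last - first) / elapsed_seconds


def main_thread_utilization(user_samples, sys_samples, elapsed_seconds):
    """Fraction of one core used by the main thread (user + system).

    Each *_samples argument is a (first, last) pair of cumulative CPU
    seconds read at the start and at the end of the window.
    """
    user = counter_rate(user_samples[0], user_samples[1], elapsed_seconds)
    system = counter_rate(sys_samples[0], sys_samples[1], elapsed_seconds)
    return user, system, user + system


# Hypothetical counter readings taken 60 seconds apart.
user, system, total = main_thread_utilization(
    user_samples=(100.0, 145.0),   # +45 CPU seconds in user mode
    sys_samples=(50.0, 65.0),      # +15 CPU seconds in system mode
    elapsed_seconds=60.0,
)
print(f"user={user:.0%} sys={system:.0%} total={total:.0%}")
```

With these made-up readings, the user-mode rate alone stays well below 100% even though the core is fully busy, which is the effect described above.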
Comment From: ranshid
OK @niheartent, thank you. I do not see any problem with the results then. If there is no other anomaly to explain, can we close this issue?
Comment From: niheartent
OK