Hi, I tested different versions with memtier_benchmark; in my tests, 7.x and 6.x still show about a 3%~4% performance degradation compared with 5.x.

My test is GET-only (keys are 80 bytes long, values are 10 bytes long) with pipeline=1 and io-threads 1. The metric is peak ops/sec for each version, measured with CPU at 99%+ and P99 latency below 2 ms.

Version            Round 1      Round 2      Round 3      Avg (ops/sec)
5.0.8              151670.79    152295.51    152884.41    152283.57
6.0.20 (1 thread)  146305.31    148291.65    148512.65    147703.20
7.0.14 (1 thread)  146662.74    146746.68    146894.20    146767.87
7.2.1 (1 thread)   144511.86    146727.36    146442.26    145893.83
7.2.2 (1 thread)   146329.83    147099.60    144859.61    146096.35

My test script is:

memtier_benchmark -s addr -p port --key-prefix="TEST_XXXXXXXXX_XXXXXX_XXXXX_XXXXXX_XXX_XXXXXX##XXX_XXXXXXXXXXXXXXXX_260_202310092050_" --key-maximum=150000 --key-pattern=S:S -P redis --ratio 0:1 --test-time=600 -d 10 -R -t 10 -c 20 --hide-histogram --out-file=./GET_pipe1_redisx-round1
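
For reference, here is a flag-by-flag annotation of that invocation (flag meanings as documented by memtier_benchmark; addr and port are placeholders, and --pipeline is left at its default of 1):

# memtier_benchmark flags used above:
#   -s addr -p port        target host and port (placeholders)
#   --key-prefix=...       fixed ~80-byte key name prefix
#   --key-maximum=150000   key ids range from 0 to 150000
#   --key-pattern=S:S      sequential key access for writes and reads
#   -P redis               use the Redis (RESP) protocol
#   --ratio 0:1            0 SETs to 1 GET, i.e. a pure GET workload
#   --test-time=600        run each round for 600 seconds
#   -d 10                  10-byte values
#   -R                     randomized payload data
#   -t 10 -c 20            10 threads with 20 connections each (200 clients total)
#   --hide-histogram       suppress the latency histogram on stdout
#   --out-file=...         write the per-round report to a file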

I don't know whether some newly introduced features cause the performance degradation. Or are there better performance metrics I should look at?

thx~^-^

Comment From: sundb

@scottlii How did you populate the test data?

Comment From: scottlii

@scottlii How did you populate the test data?

@sundb I populated the data with this script every time (not elegant, but I didn't know how to set an expire time in memtier_benchmark):

# Prepare 150,001 keys (~80-byte key names, 10-byte values) with a 30-day TTL
for ((i=0; i<150001; i++)); do
  redis-cli -h 10.195.93.0 -p 6000 SETEX TEST_XXXXXXXXX_XXXXXX_XXXXX_XXXXXX_XXX_XXXXXX##XXX_XXXXXXXXXXXXXXXX_260_202310092050_${i} 2592000 XXXXXXXXXX
done
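
As a side note, here is a sketch of a faster way to do the same preload (same keys, TTL, and value as above) by piping plain-text commands through redis-cli --pipe in mass-insertion mode; this is a suggestion, not the setup actually used in the tests:

# Sketch: generate all SETEX commands and feed them through a single
# redis-cli --pipe run instead of spawning one redis-cli process per key.
for ((i=0; i<150001; i++)); do
  echo "SETEX TEST_XXXXXXXXX_XXXXXX_XXXXX_XXXXXX_XXX_XXXXXX##XXX_XXXXXXXXXXXXXXXX_260_202310092050_${i} 2592000 XXXXXXXXXX"
done | redis-cli -h 10.195.93.0 -p 6000 --pipe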

Each test had a 100% hit rate. Here are the test results generated by memtier_benchmark.

See test_report.txt
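
As a sanity check on the 100% hit claim, keyspace_hits and keyspace_misses from INFO stats can be compared before and after a round (a suggested check, not part of the original report):

# keyspace_misses should remain 0 (or unchanged) across a GET-only round
redis-cli -h 10.195.93.0 -p 6000 INFO stats | grep -E 'keyspace_(hits|misses)'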

Comment From: sundb

@scottlii Thx, I'll test it on my local machine.

Comment From: judeng

@scottlii Could you provide your config files? Some features might reduce performance.

Comment From: scottlii

@scottlii Could you provide your config files? Some features might reduce performance.

@judeng @sundb Okay, here is the config file (see conf_example.txt). Note these key points (a minimal excerpt of the relevant directives follows the list):

  • The only configuration difference between 5.x and 6.x / 7.x is that 6.x and 7.x have io-threads 1.
  • All tests ran in cluster mode with only one shard (1 master + 1 replica), so I set cluster-enabled yes and assigned slots.
  • All tests set save "" and daemonize yes.
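
For clarity, here is a minimal excerpt covering only the directives called out above (everything else follows conf_example.txt; io-threads applies to the 6.x/7.x runs only):

# excerpt of the relevant redis.conf directives (see conf_example.txt for the full file)
cluster-enabled yes
save ""
daemonize yes
# 6.x / 7.x only:
io-threads 1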

Comment From: filipecosta90

@scottlii I believe that if we profile this, we will reach the same conclusions as the comment in https://github.com/redis/redis/issues/10981#issuecomment-1185134267

I was able to measure approximately 3.7% overhead, which was described/justified earlier in https://github.com/redis/redis/issues/10460 as coming from the following features:

Function                                          %CPU time   Note
updateClientMemUsage                              1.50%       after the improvement of #10401
ACLCheckAllUserCommandPerm                        1.20%       #9974
updateCommandLatencyHistogram (can be disabled)   0.80%       #9462

Note that updateCommandLatencyHistogram can be disabled; the ACL check and memory tracking cannot.
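
For completeness, the latency histogram path can be switched off via the latency-tracking directive introduced in Redis 7.0 (default yes); a minimal example, assuming a 7.x server:

# disable the per-command latency histograms (updateCommandLatencyHistogram path)
redis-cli -h 10.195.93.0 -p 6000 CONFIG SET latency-tracking no
# or persistently in redis.conf:
#   latency-tracking no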