With the introduction of some of the v7.0 features we've measured a drop of up to 7% in achievable ops/sec on a simple standalone deployment, using a GET benchmark with 1KiB values.
I've used fae5b1a19d0972c2f4274004f15be3d2f90c856c as the "unstable" reference.
This is more evident in the 10-15 pipeline results, as outlined in the following chart and table (given that pipelining reduces the syscall and RTT overhead, more of the cost shows up in command processing itself):
| pipeline on GET | 5.0.13 (ops/sec) | 6.2.6 (ops/sec) | unstable (ops/sec) | % change unstable vs 6.2 | % change unstable vs 5.0 |
|---|---|---|---|---|---|
| 1 | 147925 | 145662 | 141062 | ---% | ---% |
| 5 | 333895 | 329112 | 316108 | ---% | -5.3% |
| 10 | 445349 | 436320 | 414768 | -5% | -6.9% |
| 15 | 502064 | 491278 | 467111 | -5% | -7.0% |
| 20 | 505349 | 493715 | 487269 | -1.3% | -3.6% |
| 25 | 517902 | 506944 | 502212 | -0.9% | -3.0% |
| 30 | 534072 | 524354 | 510230 | -2.7% | -4.5% |
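For reference, the `% change` columns are just the relative difference of the unstable ops/sec against each baseline. The post-processing used to fill the table isn't shown in this issue; a minimal sketch of how one cell is derived (pipeline 10, unstable vs 6.2.6, taken from the rows above) would be:

```
# % change = (unstable - baseline) / baseline * 100
# example: pipeline 10, unstable (414768 ops/sec) vs 6.2.6 (436320 ops/sec)
awk 'BEGIN { printf "%.1f%%\n", (414768 - 436320) / 436320 * 100 }'
# prints: -4.9%
```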
We can attribute around 5-6% of the CPU cycles to the following functions and their inner code:
| Function | % CPU time | Note |
|---|---|---|
| updateCachedTime | 1.80% | #9194 |
| updateClientMemUsage | 1.50% | (after the improvement of https://github.com/redis/redis/pull/10401 ) |
| ACLCheckAllUserCommandPerm | 1.20% | #9974 |
| updateCommandLatencyHistogram | 0.80% | #9462 |
For each function I've added the last PR that touched the code / introduced the feature. I would suggest we analyze each of these features further to squeeze out as much performance as possible. IMHO the introduced features are valid requirements, so we need to try to reduce this overhead as much as we can.
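The profiling commands used to obtain the per-function breakdown above aren't included in this issue; a rough sketch of how such numbers can be collected on Linux (assuming `perf` is available and redis-server was built with symbols) would be:

```
# sample on-CPU stacks of the running redis-server while the benchmark runs
perf record -F 99 -g -p "$(pgrep -x redis-server)" -- sleep 60
# show self time per symbol; look for updateCachedTime, updateClientMemUsage,
# ACLCheckAllUserCommandPerm, updateCommandLatencyHistogram
perf report --no-children --sort symbol
```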
Comment From: oranagra
@filipecosta90 how did you conclude the regression in updateCachedTime is from #9194? AFAICT it didn't change anything in that respect.
Comment From: filipecosta90
> @filipecosta90 how did you conclude the regression in updateCachedTime is from #9194? AFAICT it didn't change anything in that respect.
You're absolutely right @oranagra. I just pointed to it via git blame. Let's see which PR actually introduced the regression.
Comment From: filipecosta90
Reminder that after https://github.com/redis/redis/pull/10502 we need to refresh this data for unstable. I will produce a new chart and follow up on this issue.
Comment From: filipecosta90
Updated numbers using the current unstable code from Wed Apr 20 ( 3cd8baf61610416aab45e0bcedcaab9beae80184 ). With the work of the last month (between March 20 and April 20) the regression was reduced: it's now ~5% at worst vs v5.0 and around 2-3% at worst vs v6.2. Furthermore, at high pipeline numbers v7.0 (unstable) outperforms v6.2 and matches v5.0 even with the newly added logic.
To reproduce:
run redis
taskset -c 0 `pwd`/src/redis-server --logfile redis-opt.log --save "" --daemonize yes
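(Not part of the original steps, but before populating it can help to confirm the server is up and actually pinned to core 0, e.g.:)

```
redis-cli ping                           # expect: PONG
taskset -cp "$(pgrep -x redis-server)"   # expect: current affinity list: 0
```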
run the following script for each of these redis versions
D=60
DATASIZE=1000
P=performance
rm results.csv
CORES="1,2"
# populate
taskset -c $CORES memtier_benchmark -d $DATASIZE --ratio 1:0 --key-pattern=P:P -t 2 --hide-histogram --key-maximum=1000000 --key-minimum 1
# benchmark
for pipeline in 1 5 10 15 20 25 30; do
taskset -c $CORES memtier_benchmark -d $DATASIZE --ratio 0:1 --test-time $D --pipeline $pipeline --key-pattern=P:P -t 2 -o $pipeline.txt --hide-histogram --key-maximum=1000000 --key-minimum 1
cat $pipeline.txt | grep Totals | awk -v r=$pipeline '{print r " , " $2}' >>results.csv
done
At the end of each run check the results.csv file.
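A side note on post-processing (the filenames below are illustrative, not from the original script): since results.csv is removed at the start of every run, each version's results need to be saved under their own name before re-running; the per-version CSVs can then be merged and compared, for example:

```
# each CSV holds "pipeline , ops_sec" lines produced by the loop above;
# assume runs were saved as results-5.0.13.csv, results-6.2.6.csv,
# results-unstable.csv
paste -d, results-5.0.13.csv results-6.2.6.csv results-unstable.csv |
  awk -F, '{ printf "pipeline %d: 5.0=%d 6.2=%d unstable=%d (%.1f%% vs 6.2)\n",
             $1, $2, $4, $6, ($6 - $4) / $4 * 100 }'
```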
Comment From: oranagra
I think we may have finished dealing with the regression, but there are still some ideas for improvement, two of which are mentioned in https://github.com/redis/redis/pull/10697#issuecomment-1137334208, which we should find time to evaluate.