Hello! I want to benchmark Redis, and I would like to use YCSB because it also lets me compare Redis with other databases. As far as I can see, you have already benchmarked a standalone Redis node with YCSB and made some improvements to the tool. However, when I compare results from the redis-benchmark tool with YCSB, I see huge discrepancies (even without pipelining in redis-benchmark). Specifically, with YCSB I get around 30,000 ops/sec for GET commands, while redis-benchmark reports 111,594 ops/sec. I ran redis-benchmark without pipelining because YCSB doesn't support pipelining and the comparison wouldn't be fair otherwise. Is YCSB not a good benchmarking tool for Redis?

Comment From: filipecosta90

Hi there @giannisgrigorakos, it is worth noting that if you're running workload A of YCSB, the commands being issued are not GET but HMSET and HGETALL (as a side note, we need to update it to use HSET, since HMSET is deprecated). If you're using a different workload, please disregard my answer and provide the steps you used for the YCSB benchmark. Here are the commandstats of the default run stage:

```
$ redis-cli info commandstats
# Commandstats
cmdstat_hmset:calls=506,usec=1665,usec_per_call=3.29
cmdstat_hgetall:calls=494,usec=2541,usec_per_call=5.14
cmdstat_config:calls=1,usec=56,usec_per_call=56.00
```
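To make the command mix explicit, the call counters above can be turned into percentages. This is only a minimal sketch: `command_mix` is a hypothetical helper name, and the `stats` variable simply holds the three lines shown above so the snippet runs without a server.

```shell
# commandstats lines copied from the output above
stats='cmdstat_hmset:calls=506,usec=1665,usec_per_call=3.29
cmdstat_hgetall:calls=494,usec=2541,usec_per_call=5.14
cmdstat_config:calls=1,usec=56,usec_per_call=56.00'

# Hypothetical helper: read commandstats lines on stdin and print each
# command's share of the total number of calls, sorted by command name.
command_mix() {
  awk -F'[:=,]' '
    /^cmdstat_/ {
      sub(/^cmdstat_/, "", $1)   # keep only the command name
      calls[$1] = $3             # third field is the value of calls=
      total += $3
    }
    END {
      for (c in calls) printf "%s %.1f%%\n", c, 100 * calls[c] / total
    }' | sort
}

printf '%s\n' "$stats" | command_mix
# config 0.1%
# hgetall 49.4%
# hmset 50.5%
```

Against a live server, the same function could be fed with `redis-cli info commandstats | command_mix`.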
Here is a portion of the MONITOR output from the run stage of the YCSB workload A benchmark:

    (...)
    1600433693.139107 [0 127.0.0.1:53758] "HMSET" "user7928534804371831711" "field1" "\"Bs>]'\"Xc7't/#v\"Sw7$p;\-/.p+,t-?: 8b%/n\"/20Pm'<h!/l6O;:9692:U?(0p$X-5X7%b;'2.Y9)Q95\"j+M/0O;5H}6W{:"
    1600433693.139269 [0 127.0.0.1:53758] "HMSET" "user8179944504032318020" "field2" "<,d'Dc\";f&Z{8];$-0%!<4N1'?h&Y92x#Ck $.-,v</5,l&9&!N'$V==B/3l3T)&!~&!t5N%%<<65d30z / 58=Pk'Ki,6<("
    1600433693.139413 [0 127.0.0.1:53758] "HMSET" "user4866862849619894024" "field8" "6K9$/>-_g&,t1Zc>Tc4I\x7f.5l ^u\"8p-P1 21@c?5.$F!4V#9Bo&>;42$?8(Kc/Z77:n#86,\w&#d=4f?P=/'0-Qy6V3-#0?)x'"
    1600433693.139546 [0 127.0.0.1:53758] "HMSET" "user773649639740817034" "field6" "3=41C}&>(4Qo1/9@g?He?)$82.!Qg292=Gc 1v6::C!(.d>C#8De1\"z\"G%'L?9L57I}'W\x7f.[o'A3, f'Ai?B7 :j.Q--Kc&<j6"
    1600433693.139653 [0 127.0.0.1:53758] "HGETALL" "user8280647719838722367"
    1600433693.139756 [0 127.0.0.1:53758] "HGETALL" "user6862728708791239180"
    1600433693.139853 [0 127.0.0.1:53758] "HGETALL" "user3227796087934707205"
    (...)

Let's check the memory usage of one of those keys:

    $ redis-cli memory usage user8652283639112666078
    (integer) 1649

And what's in it as well:

    $ redis-cli hgetall user8652283639112666078
    1) "field8"
    2) "4#\"=?p%G=9J+E=-7|?4%%2#,9*|262(?r&Zc#0,(H\x7f*?81'0!Hc2Hy.*,4_;$Cy7(*<Pm:8&=8~.+&<Jg-38!K1,\\{/Li:De2"
    3) "field2"
    4) "#W'0]q(4:(@s)8<91v%7r;<|=Z=9627*d7 n1Q/<H)7_g&#z/-$5_9&?*4G#2$<5[70)(=':*\"$ U{>9&,B3)Cw&N?'0<1Fm)\" 0"
    5) "field1"
    6) "8I)-Vi,)j.166 *!#|6=~&&&6\"&1+47S!)B?<G;0 :+?|7>j-[#%La/:l;_e;?.4:\"4N3%+x\"B1=V7-(j#1n.Ce&)d;=28G=#!."
    7) "field3"
    8) "6E+$(.5Tq#526p.];'&h/=;7,=Es\")t'Ag5Y55Va,F;8Jq2]+%T7:B=0':/-~6Ng2\"%D- 1:\\94!j+/h#P1,#\"%1r&8%Y#8"
    9) "field5"
    10) "8<.-%l+-:4_s%&v^{$@s3V7\"?r+0~?,:)<~&2=\".<8v7#>-]u?<42&l2Vw&R)5%|:Oq+V!?9n&X) %,t%.80;r4%4<\"f''|."
    11) "field6"
    12) "05j.$~9M)!Vo2J-4Pa4Gc3V!#(n\"L-=A3(5,?844&:(_3-6$7Hc?06%$5Kw(44%*$.V-,Y!;12 X7:*f+1v27.0((?<$ 2(9(&$"
    13) "field0"
    14) ")L;-Ws5Ay?Mk6D;1Wg7W\x7f.-62=l>Ey\"=&?Ui>K}7B5!C;,Wm' 63_y\" (>8 $"
    15) "field7"
    16) "4F?);0-'l+0p8Ue,0.3x#@7+=l&9n40b\")~0Bg\"&+#b#V)\"F- G;\"N7<^a>,45:z07F#52f51\"=-f0K? D}.%f&:48A; "
    17) "field9"
    18) "#4`'^u+/>8$,.(d7,f>.\"4(:4N1<8v?_%1#.4Ci.-4\"]o/?\"13<5\".<^7$9,)2j?I!=D'<#&5Ke79 -Y% -.,Vo#J{! f\"\",7Bq:"
    19) "field4"
    20) "$7x+)4/A',>(8+x.Vs7'l&0\"=L' Um'-n\"G;?Q?1 z0&!><\"\%142?<~3+(9Qc?Rq-t>B)9Z98Zs>[51@9#E+)Cw2R%$S3=. $"

## Differences between YCSB and redis-benchmark
Up until now we've seen that the two tools are not using the same data types, and consequently not the same commands, and that the data size is also very different (3 bytes by default on redis-benchmark versus ~1.7 KB for a sample YCSB key).
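One way to narrow that gap is to have redis-benchmark issue the same command against the same data, since it accepts an arbitrary command after its options. The following is a rough sketch only: it assumes the sample key inspected above still exists from the YCSB load stage, and the request count, the single connection, and the `build_cmd` helper name are illustrative assumptions rather than settings taken from this thread.

```shell
#!/bin/sh
# Sketch: benchmark HGETALL on a YCSB-loaded hash with redis-benchmark.
KEY="user8652283639112666078"   # sample key from the output above
REQUESTS=100000                 # assumed request count
CLIENTS=1                       # assumption: one connection, like a single client thread

# Hypothetical helper: build the command line once so it can be
# printed for review before it is actually run.
build_cmd() {
  echo "redis-benchmark -h 127.0.0.1 -p 6379 -q -n $REQUESTS -c $CLIENTS HGETALL $KEY"
}

if [ "${RUN:-0}" = "1" ]; then
  # Word splitting is intentional: none of the arguments contain spaces.
  $(build_cmd)
else
  build_cmd   # dry run (default): only print the command
fi
```

Running the script with `RUN=1` executes the benchmark; without it, the command is only printed. Even then the numbers are only indicative, because the two clients are implemented differently.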

## Is YCSB a valid benchmark?
Yes! Even though you should not compare results across tools, YCSB can be very helpful depending on the use case: it provides a common baseline across the multiple DBs it supports, along with more complex datasets. As a follow-up on my side, I can check whether we can improve the Redis driver implementation there (pipelining, different setups, etc.).
Bottom line: use the tool that best fits your needs and use case. They can all bring something to the table and should not be discarded; just stick to the same tool throughout the analysis (don't compare YCSB vs. redis-benchmark vs. memtier, etc.).

## Further follow up


Please reply if this answered your questions regarding the benchmarks. If not, please provide as many details as possible so that we can discuss this further :)

-------
# Appendix:

I followed the quick start/install steps as [specified here](https://github.com/brianfrankcooper/YCSB/tree/master/redis#quick-start).

## setup step

```
./bin/ycsb load redis -s -P workloads/workloada -p "redis.host=127.0.0.1" -p "redis.port=6379" > outputLoad.txt
```

## run workload A

```
./bin/ycsb run redis -s -P workloads/workloada -p "redis.host=127.0.0.1" -p "redis.port=6379" > outputRun.txt
```

Comment From: giannisgrigorakos

Thank you for your quick and thorough answer. I think the YCSB client for Redis needs some improvements (support for pipelining, for sure), because a workload of 50/50 update/read gets more throughput than a 100% read workload, but that is another story, I guess. That aside, I consider my question answered! Thank you for your time, and keep up the great work you are all doing!