I have been trying to implement something using Redis sets. One thing I have noticed while using these commands is that both SINTER and SUNION have O(N) complexity, yet SUNION degrades considerably even when the set cardinality is only around 500.
I ran this benchmark:
$ for i in $(seq 0 999); do redis-cli sadd s1000 forbar$i; done
...
$ for i in $(seq 0 9); do redis-cli sadd s10 foobar$i; done
...
$ redis-benchmark -n 10000 SUNION s1000 s10
This gives me a throughput of 950.48 requests per second,
whereas benchmarking SINTER, which has the nominally higher complexity of O(N*M), with redis-benchmark -n 10000 SINTER s1000 s10 gives me a throughput of 47169.81 requests per second.
Can someone explain whether this is expected and, if so, the reason behind it?
System details: redis_version:4.0.9, Mac Pro, 8 cores, 16 GB RAM
Comment From: itamarhaber
Hello @raunakb94
Comparing SUNION to SINTER isn't fair - in your original question I referred to SINTER/SMEMBERS as means for returning an entire set, which is what SUNION does when provided with a single set.
That said, I feel that SUNION could be optimized.
Ref: https://stackoverflow.com/questions/52238376/redis-sets-performance-issue
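For what it's worth, the reply sizes behind the two benchmarked commands can be sanity-checked without a server. The following is a minimal sketch, not a model of Redis internals: it rebuilds the member lists from the commands as posted (note that s1000 was filled with the prefix `forbar` and s10 with `foobar`) and counts how many elements a union and an intersection would return. The helper names `members`, `union_size` and `intersection_size` are made up for illustration.

```shell
#!/bin/sh
# Emit COUNT members named PREFIX0 .. PREFIX(COUNT-1), one per line.
# usage: members PREFIX COUNT
members() {
  seq 0 $(($2 - 1)) | sed "s/^/$1/"
}

# Number of distinct elements across both lists (what a union would return).
# usage: union_size PREFIX_A COUNT_A PREFIX_B COUNT_B
union_size() {
  { members "$1" "$2"; members "$3" "$4"; } | sort -u | wc -l | tr -d ' '
}

# Number of elements present in both lists (what an intersection would return).
# Each list holds unique names, so a duplicate after merging means "in both".
intersection_size() {
  { members "$1" "$2"; members "$3" "$4"; } | sort | uniq -d | wc -l | tr -d ' '
}

echo "union:        $(union_size forbar 1000 foobar 10)"
echo "intersection: $(intersection_size forbar 1000 foobar 10)"
```

With the prefixes from the question this prints a union of 1010 elements and an intersection of 0, so the two benchmarked commands were returning very differently sized replies.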
Comment From: itamarhaber
> That said, I feel that SUNION could be optimized.
Clarification: in the case of SUNION, when no dstkey is provided, the temporary results set is still allocated only to be turned into a reply and discarded later. This is wasteful and can be avoided with similar control structures as in SINTER.
Comment From: raunakb94
Hey @itamarhaber, yes, that may be true. I ran the same benchmark for SUNION on different Redis versions. On 3.2.12 it gives me ~2979.74 requests per second. Isn't that weird compared to redis_version:4.0.9?
Comment From: itamarhaber
So, IIUC, you're saying there's a performance regression between 3.2 and 4.0?
Comment From: raunakb94
@itamarhaber As far as I can tell, yes.
Comment From: itamarhaber
I tried reproducing the regression you've reported (attached below) but failed.
Averages (requests per second)
| Version | PING | SUNION s10 | SMEMBERS s10 | SUNION s1000 | SMEMBERS s1000 | SUNION s1000 s10 | SINTER s1000 s10 |
|---|---|---|---|---|---|---|---|
| 3.2 | 19,009.27 | 18,046.71 | 17,669.60 | 2,491.58 | 2,502.02 | 2,611.21 | 16,618.56 |
| 4.0 | 19,910.87 | 17,291.03 | 16,191.63 | 2,243.32 | 2,615.91 | 2,279.42 | 17,884.46 |
| 5.0-rc4 | 18,185.43 | 17,500.48 | 18,039.07 | 2,387.30 | 2,622.96 | 2,242.09 | 17,871.43 |
Raw data
"Run", "Version", "PING", "SUNION s10", "SMEMBERS s10", "SUNION s1000", "SMEMBERS s1000", "SUNION s1000 s10", "SINTER s1000 s10"
"1", "3.2", "18018.02", "17152.66", "17421.60", "2484.47", "2532.93", "2681.68", "15822.78"
"1", "4.0", "20040.08", "18115.94", "16260.16", "2173.91", "2659.57", "2314.81", "17605.63"
"1", "5.0-rc4", "17953.32", "18083.18", "17006.80", "2461.24", "2683.84", "2244.17", "18450.19"
"2", "3.2", "18348.62", "17793.60", "17793.60", "2443.79", "2404.42", "2547.77", "17421.60"
"2", "4.0", "19531.25", "18083.18", "17094.02", "2221.24", "2604.84", "2287.81", "18348.62"
"2", "5.0-rc4", "19120.46", "16750.42", "19157.09", "2331.55", "2628.81", "2222.72", "18214.94"
"3", "3.2", "20661.16", "19193.86", "17793.60", "2546.47", "2568.71", "2604.17", "16611.29"
"3", "4.0", "20161.29", "15673.98", "15220.70", "2334.81", "2583.31", "2235.64", "17699.12"
"3", "5.0-rc4", "17482.52", "17667.85", "17953.32", "2369.11", "2556.24", "2259.38", "16949.15"
Benchmark script
#!/bin/bash
runs=3
declare -a versions=("3.2" "4.0" "5.0-rc4")
declare -a ops=("PING"
                "SUNION s10"
                "SMEMBERS s10"
                "SUNION s1000"
                "SMEMBERS s1000"
                "SUNION s1000 s10"
                "SINTER s1000 s10")

# Print header row
echo -n "\"Run\", \"Version\""
for op in "${ops[@]}"; do
  echo -n ", \"$op\""
done
echo

for run in $(seq "$runs"); do
  for ver in "${versions[@]}"; do
    echo -n "\"$run\", \"$ver\""
    cid=$(docker run -d -p 6379:6379 "redis:$ver")
    # Wait until the server accepts connections before populating
    until redis-cli PING > /dev/null 2>&1; do sleep 0.1; done

    # Populate the database
    redis-cli SADD s10 $(seq -s " " -f "foo%G" 10) > /dev/null
    redis-cli SADD s1000 $(seq -s " " -f "foo%G" 1000) > /dev/null

    # Bench it ($op is left unquoted so it splits into command and arguments)
    for op in "${ops[@]}"; do
      rep=$(redis-benchmark -n 10000 --csv $op)
      IFS=',' read -ra res <<< "$rep"
      echo -n ", ${res[1]}"
    done

    # Tear down
    docker kill "$cid" > /dev/null
    docker rm "$cid" > /dev/null
    echo
  done
done
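The per-version averages can be derived from the script's CSV output with a small awk pass. This is a sketch under the assumption that the input has the same layout as the raw data in this comment (quoted fields separated by a comma and a space, header in the first row, version in the second column); `average_by_version` is a name made up for illustration.

```shell
#!/bin/sh
# Read benchmark CSV on stdin and print one line per version:
# the version followed by the mean of every numeric column.
average_by_version() {
  awk -F', *' '
    NR == 1 { next }                        # skip the header row
    {
      gsub(/"/, "")                         # drop the quotes around every field
      ver = $2
      if (!(ver in runs)) order[++n] = ver  # remember first-seen order
      runs[ver]++
      for (i = 3; i <= NF; i++) sum[ver, i] += $i
      cols = NF
    }
    END {
      for (v = 1; v <= n; v++) {
        ver = order[v]
        line = ver
        for (i = 3; i <= cols; i++)
          line = line sprintf(", %.2f", sum[ver, i] / runs[ver])
        print line
      }
    }'
}
```

Assuming the benchmark script is saved as `bench.sh` (a hypothetical file name), `./bench.sh | average_by_version` would print the averaged rows, without thousands separators.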
Note: Somehow PING is slow now... gremlins in my attic?
Comment From: filipecosta90
@itamarhaber @raunakb94 given the above results, should this issue be closed? Or should we take this chance to find opportunities for improvement here?