Describe the bug
We found a potential data race on config.previous_tick in redis-benchmark using a static analysis tool.
We are not 100% sure this is a real race, but after manually reviewing the report it looks potentially real so we decided to open this issue just in case.
To reproduce
The race occurs if two threads are able to call showThroughput in parallel.
https://github.com/redis/redis/blob/e01c92a5ef6127067922f74cd8dac2ba72a1ad52/src/redis-benchmark.c#L1648-L1650
config.previous_tick is read at line 1648 and written to at 1650. If this function can be executed in parallel these accesses could cause a data race.
Threads spawned by startBenchmarkThreads can call:
execBenchmarkThread -> aeMain -> aeProcessEvents -> processTimeEvents.
Inside of processTimeEvents the function pointer te->timeProc is called.
https://github.com/redis/redis/blob/e01c92a5ef6127067922f74cd8dac2ba72a1ad52/src/ae.c#L319
Our analysis shows that te->timeProc can resolve to showThroughput. If this is true multiple threads may execute showThroughput in parallel causing a data race on config.previous_tick.
It looks like when benchmark threads are created, a time event is added where timeProc is set to showThroughput, but I am not sure about any future time events.
https://github.com/redis/redis/blob/e01c92a5ef6127067922f74cd8dac2ba72a1ad52/src/redis-benchmark.c#L1018-L1024
Expected behavior
If the data race can occur, should the access to config.previous_tick be made atomic?
Additional information
For reference I've pasted the report from our tool below:
==== Found a race between:
line 1650, column 5 in redis-benchmark.c AND line 1648, column 64 in redis-benchmark.c
Static variable:
config at line 130 of redis-benchmark.c
130|} config;
shared field: ->previous_tick
Thread 1 (write):
1648| const float instantaneous_dt = (float)(current_tick-config.previous_tick)/1000.0;
1649| const float instantaneous_rps = (float)(requests_finished-previous_requests_finished)/instantaneous_dt;
>1650| config.previous_tick = current_tick;
1651| atomicSet(config.previous_requests_finished,requests_finished);
1652| int printed_bytes = printf("%s: rps=%.1f (overall: %.1f) avg_msec=%.3f (overall: %.3f)\r", config.title, instantaneous_rps, rps, hdr_mean(config.current_sec_latency_histogram)/1000.0f, hdr_mean(config.latency_histogram)/1000.0f);
>>>Stack Trace:
>>>main
>>> startBenchmarkThreads [redis-benchmark.c:1800]
>>> pthread_create [redis-benchmark.c:969]
>>> execBenchmarkThread [redis-benchmark.c:969]
>>> aeMain [redis-benchmark.c:1044]
>>> aeProcessEvents [ae.c:488]
>>> processTimeEvents [ae.c:458]
>>> showThroughput [ae.c:319]
Thread 2 (read):
1646| const float dt = (float)(current_tick-config.start)/1000.0;
1647| const float rps = (float)requests_finished/dt;
>1648| const float instantaneous_dt = (float)(current_tick-config.previous_tick)/1000.0;
1649| const float instantaneous_rps = (float)(requests_finished-previous_requests_finished)/instantaneous_dt;
1650| config.previous_tick = current_tick;
>>>Stack Trace:
>>>main
>>> startBenchmarkThreads [redis-benchmark.c:1800]
>>> pthread_create [redis-benchmark.c:969]
>>> execBenchmarkThread [redis-benchmark.c:969]
>>> aeMain [redis-benchmark.c:1044]
>>> aeProcessEvents [ae.c:488]
>>> processTimeEvents [ae.c:458]
>>> showThroughput [ae.c:319]
Comment From: madolson
It doesn't seem possible to me, I believe only one thread will ever be executing showThroughput() at a given time. Since this is the reporting thread, it doesn't need to be executed on multiple threads.
Comment From: BradSwain
We reviewed the report more closely and also found this to be a false positive. Thank you for taking the time to review this issue even though this turned out to be a false report. We appreciate your time and will be more careful about reporting issues in the future.