Hi, we are working on a race detection tool and believe we have found a potential data race in this project. Although the race does not seem to affect the everyday operation of Redis, we thought it best to report it just to be sure. This race was found on commit 8a81ed1 at the time of writing.

Description of the Bug: The variable slaveKeysWithExpire, declared at redis/src/expire.c:365 (dict *slaveKeysWithExpire = NULL;), can be read and written in parallel. It is written during the loadDataFromDisk function and can be read concurrently during _serverPanic.

Thread 1: redis/src/expire.c

 437|         NULL                        /* allow to expand */
 438|     };
>439|     slaveKeysWithExpire = dictCreate(&dt,NULL);
 440| }
 441| if (db->id > 63) return;

Stacktrace for thread 1:

main 
  loadDataFromDisk [server.c:5894] 
    loadAppendOnlyFile [server.c:5559] 
      rdbLoadRio [aof.c:770] 
        setExpire [rdb.c:2580] 
          rememberSlaveKeyWithExpire [db.c:1413]

Thread 2: redis/src/expire.c

 458|/* Return the number of keys we are tracking. */
 459|size_t getSlaveKeyWithExpireCount(void) {
>460|    if (slaveKeysWithExpire == NULL) return 0;
 461|    return dictSize(slaveKeysWithExpire);
 462|}

Stacktrace for thread 2:

pthread_create [bio.c:121] 
  bioProcessBackgroundJobs [bio.c:121] 
    _serverPanic [bio.c:227] 
      printCrashReport [debug.c:997] 
        logServerInfo [debug.c:1837] 
          genRedisInfoString [debug.c:1580] 
            getSlaveKeyWithExpireCount [server.c:4839] 

Additional Information: Based on the stack trace of thread 2, the race can occur only if the server crashes, so the likelihood of it occurring is low. Thread 1 is the main thread, and thread 2 is spawned by the pthread_create call at line 121 of bio.c.
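For illustration only, here is a minimal standalone sketch of the same access pattern: a pointer that one thread creates lazily while another thread may NULL-check and read it. It is not a proposed patch and does not use Redis code. The names table, tracked_keys, remember_key and tracked_key_count are hypothetical stand-ins, and the use of C11 atomics with release/acquire ordering is an assumption about one way such a pattern could be made race-free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for Redis's dict; only an entry counter is modeled. */
typedef struct table {
    atomic_size_t used;
} table;

/* Plays the role of slaveKeysWithExpire, but declared atomic (assumption). */
static _Atomic(table *) tracked_keys = NULL;

/* Writer side (single thread): create the table on first use, then count a key. */
void remember_key(void) {
    table *t = atomic_load_explicit(&tracked_keys, memory_order_relaxed);
    if (t == NULL) {
        t = malloc(sizeof(*t));
        assert(t != NULL);
        atomic_init(&t->used, 0);
        /* Release store: the initialized table is published with the pointer. */
        atomic_store_explicit(&tracked_keys, t, memory_order_release);
    }
    atomic_fetch_add_explicit(&t->used, 1, memory_order_relaxed);
}

/* Reader side (any thread): mirrors the NULL check in getSlaveKeyWithExpireCount. */
size_t tracked_key_count(void) {
    /* Acquire load pairs with the release store in remember_key. */
    table *t = atomic_load_explicit(&tracked_keys, memory_order_acquire);
    if (t == NULL) return 0;
    return atomic_load_explicit(&t->used, memory_order_relaxed);
}
```

In this sketch a background thread could call tracked_key_count at any time while the main thread calls remember_key; the reader sees either NULL or a fully initialized table. Whether this level of synchronization is worthwhile for a path that only runs during a crash is a separate judgment call.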

Here is the full race report generated by our tool: Redis Potential Data Races Detected by Static Code Scanner

Comment From: yossigo

@josefm9 thanks for reporting this. As you indicate yourself, this is not something that's ever expected to happen as the condition that leads to it is practically a very basic programming error.

I'm curious, what tool are you using? Is it publicly available?

Comment From: josefm9

Hi @yossigo, to detect this race I used Coderrect Scanner, a tool that I am helping to develop. It uses static analysis to find races within projects and can be found here: https://coderrect.com/download/. The tool is free and publicly available.

If you have any questions or suggestions while using the tool, feel free to contact me at josef.m@coderrect.com