Describe the bug

We are developing a static data race detection tool and detected a potential data race in Redis version 6.0.7. The race is on server.master_repl_offset between the main thread and an IO thread.

To reproduce

==== Found a race between: 
line 4988 in server.c AND line 164 in replication.c
Shared variable:
 at line 72 of server.c
 72|struct redisServer server; /* Server global state */
Thread 1:
 4986| {
 4987|     memcpy(server.replid,rsi.repl_id,sizeof(server.replid));
>4988|     server.master_repl_offset = rsi.repl_offset;
 4989|     /* If we are a slave, create a cached master from this
 4990|      * information, in order to allow partial resynchronizations*/
>>>Stacktrace:
>>>main
>>>  loadDataFromDisk [server.c:5272]
Thread 2:
 162|    unsigned char *p = ptr;
 163|
>164|    server.master_repl_offset += len;
 165|
 166|    /* This is a circular buffer, so write as much data we can at every*/
>>>Stacktrace:
>>>pthread_create [networking.c:3016]
>>>  IOThreadMain [networking.c:3016]
>>>    readQueryFromClient [networking.c:2979]
>>>      processInputBuffer [networking.c:1996]
>>>        processGopherRequest [networking.c:1886]
>>>          lookupKeyRead [gopher.c:53]
>>>            lookupKeyReadWithFlags [db.c:146]
>>>              expireIfNeeded [db.c:101]
>>>                propagateExpire [db.c:1311]
>>>                  replicationFeedSlaves [db.c:1233]
>>>                    feedReplicationBacklog [replication.c:270]

Inside main in server.c, the function InitServerLast creates the IO threads, which can eventually call feedReplicationBacklog and write to server.master_repl_offset through the call stack shown for thread 2: IOThreadMain -> readQueryFromClient -> ... -> replicationFeedSlaves -> feedReplicationBacklog

Meanwhile, immediately after the main thread finishes spawning the IO threads and returns from InitServerLast, it calls loadDataFromDisk, which also writes to server.master_repl_offset.

We did not see any synchronization preventing these two accesses from happening in parallel, but we are not entirely sure.

Could someone with more knowledge on this part of the code confirm? If there is some synchronization we missed, we would like to update our tool accordingly.

Comment From: oranagra

@BradSwain thanks for submitting this interesting and informative issue.

I suppose it's simply wrong for the IO threads to call lookupKeyRead (on behalf of processGopherRequest) directly from within the IO thread, rather than deferring to the main thread.

@redis/core-team how do you feel about trimming this useless feature from the code base?

Comment From: ShooterIT

I think it is truly a problem. Maybe we remove Gopher, or maybe we add more checks when enabling Gopher, like:

diff --git a/src/networking.c b/src/networking.c
index 71e30cfa1..914c4e339 100644
--- a/src/networking.c
+++ b/src/networking.c
@@ -1879,7 +1879,7 @@ void processInputBuffer(client *c) {
             if (processInlineBuffer(c) != C_OK) break;
             /* If the Gopher mode and we got zero or one argument, process
              * the request in Gopher mode. */
-            if (server.gopher_enabled &&
+            if (server.gopher_enabled && !server.io_threads_do_reads &&
                 ((c->argc == 1 && ((char*)(c->argv[0]->ptr))[0] == '/') ||
                   c->argc == 0))
             {
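For reference, the configuration combination that exposes the flagged path would look like this in redis.conf (directive names as in 6.0; a hypothetical example, with the patch above Gopher requests would simply not be served under this combination):

```
# Hypothetical combination that routes Gopher requests through IO threads:
gopher-enabled yes
io-threads 4
io-threads-do-reads yes
```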

Comment From: ShooterIT

@oranagra have you decided to remove Gopher, or to fix it in an easy way like the above?

Comment From: oranagra

I would remove it if it doesn't serve any real purpose; I don't think this joke justifies making Redis more complex. @redis/core-team please share your thoughts. @ShooterIT The fix you suggested is simple, so you can issue a PR to "fix" it while we're waiting for a decision (removing a feature, even a useless one, is not an easy decision). But then the next question would be whether we need to document that Gopher is disabled when IO threads are enabled.

Comment From: ShooterIT

@oranagra OK. My fix is not a real bug-fix; it just avoids the problem by limiting the feature's scope. I also don't want to make Redis more complex in processCommand just because of Gopher.

I think we need to document that, even though it is not used widely.

Comment From: oranagra

@ShooterIT I guess you can make a PR, and mention it in the Gopher section in redis.conf