Description
The context.Keys
is designed with safety to concurrent read and write, as context.mu
field said
// This mutex protects Keys map.
mu sync.RWMutex
But there is a magical reality: Keys
is public but mu
is private. It means anyone could concurrent access Keys
but without
the protect from mu
. It's a very common use case in APM that user wants to visit the map to log contexts.
Even that full keys visiting is an undefined behavior, it's also not protected in context
:
1. (*Context).Copy
cp.Keys = map[string]any{}
for k, v := range c.Keys {
cp.Keys[k] = v
}
It will cause concurrent iteration and write
(*Context).Set
c.Keys = nil
It will cause assignment to entry in nil map
How to reproduce
package main
import (
"fmt"
"github.com/gin-gonic/gin"
)
func copyContextMiddleware() gin.HandlerFunc {
return func(context *gin.Context) {
context.Copy().Next()
}
}
func visitKeysMiddleware() gin.HandlerFunc {
return func(context *gin.Context) {
for k, v := range context.Keys {
fmt.Println(k, v)
}
context.Next()
}
}
func setupRouter() *gin.Engine {
r := gin.New()
r.Use(copyContextMiddleware())
r.Use(visitKeysMiddleware())
r.GET("/ping", func(c *gin.Context) {
go func() {
for i := 0; i < 1000; i++ {
c.Set("foo", "bar")
}
}()
})
return r
}
func main() {
r := setupRouter()
r.Run(":8080")
}
Expectations
$ curl http://localhost:8080/ping
HTTP/1.1 200 OK
But actually, if we use jmeter or other stress testing tools, it will panic
Environment
- go version: 1.18+
- gin version (or commit ref): cover all versions
- operating system: Macbook M2 Pro (dev), Linux (prod)
Comment From: araujo88
I've thought of a GetKeys()
function which would provide a thread-safe access to c.Keys
:
func (c *Context) GetKeys() map[string]interface{} {
c.mu.RLock()
defer c.mu.RUnlock()
keysCopy := make(map[string]interface{}, len(c.Keys))
for k, v := range c.Keys {
keysCopy[k] = v
}
return keysCopy
}
However, the problem of Keys
being public still remains and making it private would incur in breaking changes.
Comment From: jizhuozhi
Hello @araujo88 , thanks for your reply. In my opinion, there is two simple ways to provide safety access of context.Keys
1. provide a public method Range(func(k string, v any) bool)
which is concurrent safety
2. make mu
public too, users have responsibility to make it concurrent safety
Both of them will not take breaking changes into gin
Comment From: araujo88
@jizhuozhi, I personally prefer the first option. A Range(func(k string, v any) bool)
would look something like:
func (c *Context) Range(f func(k string, v any) bool) {
c.mu.RLock()
defer c.mu.RUnlock()
for k, v := range c.Keys {
if !f(k, v) {
break
}
}
}
So what is left to the user is only the implementation a function that returns false to break out of the loop. This approach keeps the concurrency control logic encapsulated within the library, making it easier for users to write correct, concurrent code.
The second option, while it solves the immediate problem, places the responsibility for managing concurrency on the users of the library. Making the mutex public increases the complexity for them and introduces more room for error.
Comment From: jizhuozhi
But I also prefer the second option, because there are many users using context.Keys
without protect, they need a faster way to fix it
Comment From: araujo88
So provide both options is the way, I guess?
Comment From: jizhuozhi
(*Context).Copy
LGTM.
And I will protect Keys
in Copy
, because Copy
could be invoked in any place inside or outside of request lifetime. As for reset
, I think it as the beginning of request lifetime, it should never have data race, I will not protect it.
Comment From: araujo88
Copy()
does provide the functionality you need, however it is not optimal because it adds some overhead by copying unnecessary fields.
Comment From: jizhuozhi
Copy()
does provide the functionality you need, however it is not optimal because it adds some overhead by copying unnecessary fields.
Copy
is ideal, but actually, business codes depend on the context to share memory (in APM or other infra level middlewares). Infra developers could not predict when would user makes context escape, so for infra developers, they are only two choice: copy context at the beginning of request lifetime or make safety to be the responsibility of business developers. Because infra developers could not make sure their middlewares always at the beginning of request lifetime and no other middlewares before them always depend on context to share memory, they will choose the second.
Comment From: chenxiaolong01
@RedCrazyGhost I've noticed that the latest master branch has already fixed this issue. When will the release be available?
Comment From: appleboy
@chenxiaolong01 I will release the next version v1.10.0 asap.