Description
gin.Context
implements context.Context
. In my opinion, it should follows context.Context package's thread safety guaranties.
However gin doc states, that gin.Context
is valid only in a request scope. And one should call Copy
to avoid pitfalls.
Copy returns a copy of the current context that can be safely used outside the request's scope. This has to be used when the context has to be passed to a goroutine.
But consider following example.
gn.GET("/", func(ctx *gin.Context) {
req, err := http.NewRequestWithContext(ctx, http.MethodGet, remote.URL, nil)
if err != nil {
ctx.AbortWithStatus(500)
return
}
resp, err := http.DefaultClient.Do(req)
if err != nil {
ctx.AbortWithStatus(500)
return
}
discardResp(resp)
ctx.String(http.StatusOK, "ok")
})
This should work according to the documentation and general expectations. However, with ContextWithFallback
we would get following result on context cancellation:
WARNING: DATA RACE
Write at 0x00c00026c020 by goroutine 306:
github.com/gin-gonic/gin.(*Engine).ServeHTTP()
/Users/nick/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/gin.go:586 +0xe7
net/http.serverHandler.ServeHTTP()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/server.go:3142 +0x2a1
net/http.(*conn).serve()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/server.go:2044 +0x13c4
net/http.(*Server).Serve.gowrap3()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/server.go:3290 +0x4f
Previous read at 0x00c00026c020 by goroutine 234:
github.com/gin-gonic/gin.(*Context).hasRequestContext()
/Users/nick/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:1220 +0x1b3
github.com/gin-gonic/gin.(*Context).Value()
/Users/nick/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:1263 +0x24a
context.value()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/context/context.go:787 +0x34d
context.(*valueCtx).Value()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/context/context.go:756 +0x8d
net/http/httptrace.ContextClientTrace()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/httptrace/trace.go:25 +0x4a7
net/http.(*persistConn).readLoop()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/transport.go:2152 +0x423
net/http.(*Transport).dialConn.gowrap2()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/transport.go:1799 +0x33
Goroutine 306 (running) created at:
net/http.(*Server).Serve()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/server.go:3290 +0x8ec
net/http/httptest.(*Server).goServe.func1()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/httptest/server.go:310 +0xbb
Goroutine 234 (finished) created at:
net/http.(*Transport).dialConn()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/transport.go:1799 +0x2773
net/http.(*Transport).dialConnFor()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/transport.go:1485 +0x124
net/http.(*Transport).queueForDial.gowrap1()
/Users/nick/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.darwin-amd64/src/net/http/transport.go:1449 +0x44
I won't dive deeply into reasoning. Default http library leaks context to other gorouting outside of request scope. The main problem is that programmer shouldn't know implementation details of http lib! It should just work!
Consider other example in the code below. Most of the time we can't even call Copy
in some deep nested code.
So, now the only way of safe usage of gin.Context
is to call Copy
on every conversion to the context.Context
. But this makes the whole point of implementing context.Context
useless.
But even this is not always possible in middleware/generated code world. So we are abandoning usage of gin in out tech stack.
But I hope this will help to improve gin, as it is truly good http framework.
My proposals:
1. Either disable context pooling when ContextWithFallback is enabled.
2. Or stop implementing context.Context
and provide explicit GetContext/MakeContext
methods, that will return threadsafe context. As this is the breaking change, consider it for gin/v2
How to reproduce
go test -race -bench=.
package tests
import (
"context"
"errors"
"io"
"math/rand/v2"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/gin-gonic/gin"
)
func BenchmarkGinDataRace(b *testing.B) {
remote := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
time.Sleep(randDuration(100*time.Millisecond, 500*time.Millisecond))
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, "ok")
}))
defer remote.Close()
makeSomeInnerWork := func(ctx context.Context) error {
//can't copy here
req, err := http.NewRequestWithContext(ctx, http.MethodGet, remote.URL, nil)
if err != nil {
return err
}
resp, err := http.DefaultClient.Do(req)
if err != nil {
return err
}
discardResp(resp)
return nil
}
thisMaybeDeepnested := func(ctx context.Context) error {
var value struct{}
cxt := context.WithValue(ctx, "my", value)
return makeSomeInnerWork(cxt)
}
gin.SetMode(gin.ReleaseMode)
gn := gin.New()
gn.ContextWithFallback = true
gn.GET("/", func(ctx *gin.Context) {
err := thisMaybeDeepnested(ctx)
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
//ok
err = nil
}
if err != nil {
ctx.AbortWithStatus(500)
b.Error(err)
return
}
ctx.String(http.StatusOK, "ok")
})
gn.GET("/direct", func(ctx *gin.Context) {
req, err := http.NewRequestWithContext(ctx, http.MethodGet, remote.URL, nil)
if err != nil {
ctx.AbortWithStatus(500)
b.Error(err)
return
}
resp, err := http.DefaultClient.Do(req)
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
//ok
err = nil
}
if err != nil {
ctx.AbortWithStatus(500)
b.Error(err)
return
}
discardResp(resp)
ctx.String(http.StatusOK, "ok")
})
ginServer := httptest.NewServer(gn)
defer ginServer.Close()
testCall := func() {
ctx := context.Background()
// our client doesn't want to wait for long
ctx, cancel := context.WithTimeout(ctx, randDuration(0, 200*time.Millisecond))
defer cancel()
req, err := http.NewRequestWithContext(ctx, http.MethodGet, ginServer.URL, nil)
if err != nil {
b.Error(err)
return
}
resp, err := http.DefaultClient.Do(req)
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
return
}
if err != nil {
b.Error(err)
return
}
discardResp(resp)
if resp.StatusCode != http.StatusOK {
b.Fail()
}
}
b.RunParallel(func(pb *testing.PB) {
for pb.Next() {
testCall()
}
})
}
func randDuration(from, to time.Duration) time.Duration {
delta := to - from
return from + time.Duration(rand.Int64N(int64(delta)))
}
func discardResp(resp *http.Response) {
if resp == nil {
return
}
_, _ = io.Copy(io.Discard, resp.Body)
_ = resp.Body.Close()
}
Expectations
Just works
Actual result
DATA RACE as above
Environment
- go version: go1.22.7 darwin/amd64
- gin version: v1.10.0
Comment From: jerryyummy
sync.Pool is not a cache in the true sense. It might be more appropriate to call it a "recycle bin". The data put into it is logically considered to have been deleted, but physically, the data still exists. This data can survive for two rounds of garbage collection time. i tested the code and i guess this will fail in high concurrency situation?
it seems that one goroutine has finished and the other get the same context and want to set sth, but if the routine has finished it should has returned the context to the pool so that the next can get the context
Comment From: TelpeNight
So the only way to avoid this - not using Pool when ContextWithFallback is enabled. In any other situation Context may leak. Synchronization around Request won't work, as leaky goroutine will access Context of another Request, not original one. When ContextWithFallback is not set, problem still may occur with Value() method. But this is mutch more rare case. Most goroutines are interested in Done/Err
Comment From: TelpeNight
We've encountered crash in prod env because of this stuff(