Reproducer details and problem explanation
A reproducer is available at: https://github.com/jackcat13/missingTraceAndSpanInAbstractErrorWebExceptionHandler
It contains a Spring Boot application that can be started locally. Calling the endpoint /api/book raises an exception, which triggers the class GlobalErrorWebExceptionHandler, which extends AbstractErrorWebExceptionHandler.
When this happens, neither the spanId nor the traceId is printed by the error handler, whereas both are present in the controller itself.
The problem appeared with Spring Boot 3.0.3; it was not present in previous versions. The reproducer itself relies on version 3.1.0.
Note: Hooks.enableAutomaticContextPropagation(); is already called in the constructor of the SpringBootApplication class.
Thanks in advance for your time on this issue.
Comment From: GurkiratSingh37
In the handle method of GlobalErrorWebExceptionHandler, retrieve the span and add the spanId and traceId to the error response.
Span currentSpan = tracer.currentSpan();
if (currentSpan != null) {
    response.getHeaders().add("spanId", currentSpan.context().spanId());
    response.getHeaders().add("traceId", currentSpan.context().traceId());
}
Please give these suggestions a try and let me know if it helps resolve the issue.
Comment From: deepakraghav0
@GurkiratSingh37: No, this will not work, because the current span is not available in the error handler; it is only present up to the controller.
Comment From: wilkinsona
Thanks for the sample, @jackcat13. I've reproduced the behaviour that you have described but I'm curious about earlier versions.
The problem appeared with SpringBoot 3.0.3 version, it was not present in previous versions.
I can reproduce the problem with 3.0.3, but with 3.0.2 the sample does not compile because Hooks.enableAutomaticContextPropagation() does not exist. How did you configure things when it worked with Spring Boot 3.0.2 and earlier?
Comment From: jackcat13
@wilkinsona It's about some automated tests that stopped working on the mentioned version. I can try to set this up in a separate branch of the sample project to confirm or refute my observations.
Comment From: jackcat13
@wilkinsona Hi again. I am able to make the observation work at runtime with the following code in the controller (even in 3.1.0):
Mono.deferContextual(contextView -> {
ContextSnapshot.captureAll(contextView).setThreadLocals();
});
I still have the issue in my codebase but I'm not yet able to find out the root cause.
In any case, without capturing the ContextView from the Reactor context, I'm not able to make it work (same in version 3.0.2). I don't know whether automatic propagation should be able to handle this. What do you think?
Thanks a lot for your help anyways :)
Comment From: wilkinsona
@jackcat13 Interesting. Thanks. I've asked the observability team to take a look.
Comment From: bclozel
This looks fairly similar to https://github.com/spring-projects/spring-framework/issues/30013. The observability instrumentation in Spring Framework 6.0 is based on a WebFilter, which doesn't wrap the error handling phase. This explains why the MDC is not restored in those phases. Because this requires a major change in the instrumentation, we've applied this for Spring Framework 6.1.0.
I'm not sure how deferContextual works around this problem, nor where it is applied. In all cases, if the root problem is the absence of observation context information in WebFlux error handling, I think this issue can be closed in favor of the Framework one.
Comment From: jackcat13
Indeed, it looks very similar. I'll give it a try once the fix is available. Thanks a lot for all your feedback!
Comment From: chemicL
My 5c: @jackcat13 calling
ContextSnapshot.captureAll(contextView).setThreadLocals();
creates a leak of ThreadLocal values, which can be disastrous (especially in terms of security). Please don't use it this way: you're leaving an open ContextSnapshot.Scope that you never close. The solution @bclozel mentions from the Framework 6.1 milestone is what you need.
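To illustrate why an unclosed scope leaks, here is a minimal sketch that uses only the JDK. It is not the Micrometer API: ScopeLeakDemo, TRACE_ID and the nested Scope class are hypothetical stand-ins for a tracing ThreadLocal and for ContextSnapshot.Scope. A single-thread executor plays the role of a reused worker thread, so a value left behind by one task is visible to the next one.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ScopeLeakDemo {

    // Hypothetical stand-in for a tracing ThreadLocal (for example an MDC entry).
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    // Hypothetical stand-in for a scope: sets the value, restores the previous one on close().
    static final class Scope implements AutoCloseable {
        private final String previous;

        Scope(String value) {
            this.previous = TRACE_ID.get();
            TRACE_ID.set(value);
        }

        @Override
        public void close() {
            if (previous == null) {
                TRACE_ID.remove();
            } else {
                TRACE_ID.set(previous);
            }
        }
    }

    // Runs one task that opens a scope, then returns what the NEXT task on the same thread sees.
    static String valueSeenByNextTask(boolean closeScope) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                if (closeScope) {
                    // try-with-resources guarantees close() runs, even on exceptions.
                    try (Scope scope = new Scope("trace-123")) {
                        // work that needs the ThreadLocal value would go here
                    }
                } else {
                    new Scope("trace-123"); // opened and never closed: the value stays on the thread
                }
            };
            pool.submit(first).get();

            Callable<String> next = TRACE_ID::get;
            return pool.submit(next).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("scope left open: " + valueSeenByNextTask(false)); // prints "scope left open: trace-123"
        System.out.println("scope closed: " + valueSeenByNextTask(true));     // prints "scope closed: null"
    }
}
```

In the first case, an unrelated task running later on the same thread observes the previous request's trace identifier, which is the kind of cross-request leak described above. Closing the scope, ideally with try-with-resources, restores the thread to its prior state.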