I'm using WebClient in a setup with
- org.springframework:spring-webflux:6.1.4
- io.micrometer:micrometer-tracing-bridge-otel:1.2.3
- io.opentelemetry:opentelemetry-exporter-zipkin:1.31.0

and am missing status/error metadata in the spans that show up in Zipkin.

When an HTTP GET request fails, the outcome in Zipkin is an http get span with (among others) the tags:
- exception: WebClientRequestException
- outcome: UNKNOWN
- status: CLIENT_ERROR
- otel.library.name, otel.library.version, otel.scope.name, and otel.scope.version

What is missing is both:
- error (Convention for Zipkin to mark the span in red)
- otel.status_code (from io.opentelemetry.exporter.zipkin.OtelToZipkinSpanTransformer)

Working backwards, the OtelToZipkinSpanTransformer would set both of these, if there was a status set on the OpenTelemetry Span / SpanData, which was not the case in debugging.
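As a rough illustration of that behaviour, the sketch below models the status handling in plain Java. The class ZipkinStatusSketch, its StatusCode enum and the statusTags method are made up for this example and are not the exporter's actual code; only the tag names otel.status_code and error come from the report above. The value written to the error tag (the status description, or an empty string) is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, hypothetical model of the status handling described above;
// this is NOT the real OtelToZipkinSpanTransformer.
final class ZipkinStatusSketch {

    // Stand-in for the OpenTelemetry span status codes.
    enum StatusCode { UNSET, OK, ERROR }

    // Builds the status-related Zipkin tags for a span with the given status.
    static Map<String, String> statusTags(StatusCode code, String description) {
        Map<String, String> tags = new LinkedHashMap<>();
        if (code != StatusCode.UNSET) {
            tags.put("otel.status_code", code.name());
            if (code == StatusCode.ERROR) {
                // Assumption: the tag value is the status description, or "" if absent.
                tags.put("error", description == null ? "" : description);
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        // No status on the span: neither tag is emitted (the situation in this report).
        System.out.println(statusTags(StatusCode.UNSET, null));
        // Status ERROR: both tags are emitted.
        System.out.println(statusTags(StatusCode.ERROR, "Connection refused"));
    }
}
```

With an UNSET status the map stays empty, which matches the spans observed here: the exception tag is present, but neither error nor otel.status_code is.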

The span status would be set in io.micrometer.tracing.otel.bridge.OtelSpan::error, which is an implementation of io.micrometer.tracing.Span. Going further back, that method is called by io.micrometer.tracing.handler.TracingObservationHandler, either when the ERROR tag is set, or when onError is called.

That should be called by io.micrometer.observation.SimpleObservation#notifyOnError, which is an implementation detail of io.micrometer.observation.SimpleObservation#error.

But it turns out WebClient doesn't call that function! Instead, at https://github.com/spring-projects/spring-framework/blob/7f0ab22c4761c1ab5c57066adcd6178b1d203131/spring-webflux/src/main/java/org/springframework/web/reactive/function/client/DefaultWebClient.java#L474 it only calls org.springframework.web.reactive.function.client.ClientRequestObservationContext#setError, whereas SimpleObservation#error both sets the error in the context and dispatches the notification:

@Override
public Observation error(Throwable error) {
    this.context.setError(error);
    notifyOnError();
    return this;
}

To me this last part seems like the best place to fix this issue, i.e. to replace .doOnError(observationContext::setError) with something like .doOnError(e -> observation.error(e)).
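To make the difference concrete, here is a minimal sketch using a stripped-down model of the Observation API. All types in it (Context, RecordingHandler, Observation) are hypothetical stand-ins written for this example, not the real Micrometer or Spring classes; the handler merely records errors where a tracing handler would mark the span as failed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, stripped-down model of the Observation API;
// these are NOT the real Micrometer classes.
final class ObservationErrorSketch {

    static final class Context {
        private Throwable error;
        void setError(Throwable error) { this.error = error; }
        Throwable getError() { return error; }
    }

    // Stand-in for a tracing handler that would mark the span as failed in onError.
    static final class RecordingHandler {
        final List<Throwable> spanErrors = new ArrayList<>();
        void onError(Context context) { spanErrors.add(context.getError()); }
    }

    static final class Observation {
        final Context context = new Context();
        private final RecordingHandler handler;

        Observation(RecordingHandler handler) { this.handler = handler; }

        // Mirrors the SimpleObservation#error snippet: store the error, then notify.
        Observation error(Throwable error) {
            context.setError(error);
            handler.onError(context);
            return this;
        }
    }

    // Current behaviour: only the context is updated, so the handler never hears about it.
    static int spanErrorsAfterSetError() {
        RecordingHandler handler = new RecordingHandler();
        Observation observation = new Observation(handler);
        observation.context.setError(new IllegalStateException("connection refused"));
        return handler.spanErrors.size();
    }

    // Proposed behaviour: going through error(..) also notifies the handler.
    static int spanErrorsAfterObservationError() {
        RecordingHandler handler = new RecordingHandler();
        Observation observation = new Observation(handler);
        observation.error(new IllegalStateException("connection refused"));
        return handler.spanErrors.size();
    }

    public static void main(String[] args) {
        System.out.println("context.setError only: " + spanErrorsAfterSetError());         // 0
        System.out.println("observation.error:     " + spanErrorsAfterObservationError()); // 1
    }
}
```

In this model the error is stored on the context in both cases, but only the second path reaches the handler, which is the step that would set the span status.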

Comment From: Xiphoseer

Note: technically it's io.micrometer.tracing.handler.PropagatingSenderTracingObservationHandler not TracingObservationHandler, which is provided by Spring Boot in MicrometerTracingAutoConfiguration, but the mechanism stays the same.

Comment From: Xiphoseer

I prototyped a fix locally with the following bean, which works to make Zipkin recognize the span as failed, but outcome is still UNKNOWN, which is probably fine.

@Bean
@ConditionalOnMissingBean
@ConditionalOnClass(ClientRequestObservationContext.class)
@Order(SENDER_TRACING_OBSERVATION_HANDLER_ORDER)
public PropagatingSenderTracingObservationHandler<?> propagatingSenderTracingObservationHandler(Tracer tracer, Propagator propagator) {
    return new PropagatingSenderTracingObservationHandler<>(tracer, propagator) {
        @Override
        public void customizeSenderSpan(SenderContext context, Span span) {
            // Fix for: https://github.com/spring-projects/spring-framework/issues/32389
            if (context instanceof ClientRequestObservationContext) {
                Throwable error = context.getError();
                if (error != null) {
                    span.error(error);
                }
            }
            super.customizeSenderSpan(context, span);
        }
    };
}

Comment From: bclozel

Thanks @Xiphoseer for the report and analysis. This has been fixed in 6.1.x and backported to 6.0.x as well. It will be shipped with the next set of releases on March 14th.