Spring Current Observation.Context missing from WebClient request

Spring Boot: 3.1.5 Spring Framework: 6.0.13

In Spring 3 micrometer integration has been reworked to use Micrometer 1.10 Observation. It works in the default setup but in case there is a filter (ExchangeFilterFunction) that changes the underlining request, metric is reported with incorrect tags.

It happens because current implementation is initializing ClientRequestObservationContext using the original ClientRequest

https://github.com/spring-projects/spring-framework/blob/e12269ef1ac332e192a02ac58eab8e6b62f3b7f3/spring-webflux/src/main/java/org/springframework/web/reactive/function/client/DefaultWebClient.java#L455

and any changes to the request after this point are not visible to the observation context https://github.com/spring-projects/spring-framework/blob/e12269ef1ac332e192a02ac58eab8e6b62f3b7f3/spring-webflux/src/main/java/org/springframework/web/reactive/function/client/DefaultClientRequestObservationConvention.java#L92

Similar scenario was supported in the previous version by adding custom filters before MetricsWebClientFilterFunction that reported the same metric before.

Here is a test to reproduce the above issue. In this test original request uses POST http method but filter is changing is to GET. The metric is still reported with the original POST method.

iterable contents differ at index [3], expected: <tag(method=GET)> but was: <tag(method=POST)>
Expected :tag(method=GET)
Actual   :tag(method=POST)

@SpringBootTest(
        properties = {"management.metrics.tags.service-name=test"}
)
class WebClientObservationTest {
    @Autowired
    private WebClient.Builder webClientBuilder;

    @Autowired
    private MeterRegistry meterRegistry;

    @Test
    void httpClientRequestsMetrics() {
        var webClient = webClientBuilder
                .filter(new ClientRequestExchangeFilter())
                .build();

        var req = webClient.post()
                .uri("http://api.zippopotam.us/us/{zip}", 98121)
                .retrieve()
                .bodyToMono(String.class);

        StepVerifier.create(req)
                .assertNext(res -> {
                    assertNotNull(res);
                })
                .verifyComplete();

        Meter meter = meterRegistry.find("http.client.requests")
                .meter();
        assertNotNull(meter);

        List<Tag> expectedTags = Arrays.asList(
                Tag.of("client.name", "api.zippopotam.us"),
                Tag.of("error", "none"),
                Tag.of("exception", "none"),
                Tag.of("method", "GET"),
                Tag.of("outcome", "SUCCESS"),
                Tag.of("service-name", "test"),
                Tag.of("status", "200"),
                Tag.of("uri", "/us/{zip}")
        );
        assertIterableEquals(expectedTags, meter.getId().getTags());
    }

    static class ClientRequestExchangeFilter implements ExchangeFilterFunction {
        @Override
        public Mono<ClientResponse> filter(ClientRequest request, ExchangeFunction next) {
            ClientRequest newRequest = ClientRequest.from(request)
                    .method(HttpMethod.GET)
                    .build();
            return next.exchange(newRequest);
        }
    }

    @TestConfiguration
    public static class MockConfiguration {
        @Bean
        public MeterRegistry simpleMeterRegistry() {
            return new SimpleMeterRegistry();
        }
    }
}

Comment From: bclozel

Thanks for reaching out.

I realize that this provides less flexibility than the former MetricsWebClientFilterFunction, but this instrumentation is now built-in and this fixes numerous issues we had in the past.

Now we could consider moving the instrumentation after the ExchangeFilterFunction processing, but this would cause other problems:

the processing time of filter functions would not be measured - this time can be often not negligible for the application
filter functions can perform error handling operations and retries transparently for the user. This time should be measured for the metric
filter functions can do logging and send authentication requests to fetch tokens; the observation should be active at that point otherwise this won't be part of the global trace.
the instrumentation here is meant to be applied at the client level, not the underlying HTTP library. In this context I think that the request given to the client better expresses the intent (even if the request can be later mutated or if multiple requests can happen).

This instrumentation is similar to the current arrangement in RestTemplate since the processing of ClientHttpRequestInterceptor is included in the observation. We've already improved the situation there in #30247 by ensuring that the observation wraps all functions and still get the response result.

In short, I don't know how to make this work without breaking those assumptions or somehow going back to a more complex setup where the observation belongs to a separate filter function that you need to configure and order yourself relative to other filters.

Am I missing something?

Comment From: amrynsky

@bclozel I agree with the above points, but the limitation here is that observation context for the request is immutable and, as result, it's impossible to use changes made by filters in reported metrics.

I'm still trying to understand the new observation lifecycle but here are several scenarios from our system that are not working after migration: - use http request headers added by filters as tags in metrics - use key/values from the reactive context as tags in metrics - add attributes to the WebClient request using filters and then use as tags in metrics

From what I see https://github.com/spring-projects/spring-framework/blob/e12269ef1ac332e192a02ac58eab8e6b62f3b7f3/spring-webflux/src/main/java/org/springframework/web/reactive/function/client/DefaultClientRequestObservationConvention.java#L96 is called after filters (observation.stop()) but it relies on the request that was initialized before. Probably we can update request in the ObservationFilterFunction together with response in ClientRequestObservationConvention.

Comment From: bclozel

I'm still trying to understand the new observation lifecycle but here are several scenarios from our system that are not working after migration

Reading the use cases you've described, I think it would make more sense to get the current observation from your filter function and then add keyvalues directly to it. You should get it from the reactor Context or better, from a request attribute.

I see we're not exposing the current observation right now to the filter functions and I would consider this a bug.

Ideally, something like this:

class CustomFilterFunction implements ExchangeFilterFunction {

    @Override
    public Mono<ClientResponse> filter(ClientRequest request, ExchangeFunction next) {
        ClientRequestObservationContext.findCurrent(request)
                .ifPresent(context -> context.addLowCardinalityKeyValue(KeyValue.of("key", "value")));
        return return next.exchange(request);
    }
}

is called after filters (observation.stop()) but it relies on the request that was initialized before.

It's not entirely true. The lifecycle is a bit more complex. Micrometer uses io.micrometer.observation.ObservationHandler implementations to bridge from the Observation API to the metrics and traces. For metrics, the timer is started at the beginning and the keyvalues are collected on stop. But for tracing, the keyvalues can be collected early and we also add a tracing header to the original request. In short, even if we change the request mid-flight, it might not affect the tracing information. This is why we need to set the request in the observation context as early as possible - without that, we cannot add the tracing information to it.

Let me know what you think about this.

Comment From: amrynsky

Getting access to the current observation context from the filter is a good idea and definitely make sense, but adding metrics directly to the context will create unnecessary coupling. Currently context is just request/response holder and has no knowledge on the underlining metrics. In addition, in our case, we adding an additional tag to the built-in metric http.client.requests . To do it we just define custom ObservationConvention that extends DefaultClientRequestObservationConvention

       @Override
    public KeyValues getLowCardinalityKeyValues(ClientRequestObservationContext context) {
        return super.getLowCardinalityKeyValues(context)
                .and(
                        fromHeader(context.getRequest(), HTTP_HEADER_TENANT, "tenant")
                );
    }

    private static KeyValue fromHeader(ClientRequest request, String header, String key) {
        if (request == null) {
            return KeyValue.of(key, KeyValue.NONE_VALUE);
        }

        return Optional.ofNullable(request.headers().getFirst(header))
                .map(value -> KeyValue.of(key, value))
                .orElse(KeyValue.of(key, KeyValue.NONE_VALUE));
    }

Basically we just need to pass some addition context from the filter to the ObservationConvention. Something like this should work

static class ClientRequestExchangeFilter implements ExchangeFilterFunction {
    @Override
    public Mono<ClientResponse> filter(ClientRequest request, ExchangeFunction next) {
        return Mono.deferContextual(context -> {
            ClientRequestObservationContext.findCurrent(request)
                    .ifPresent(observation -> observation.attributes().put("tenant", context.get("tenant")));

            return next.exchange(request);
        });
    }
}

and then access attributes() downstream (similar to WebClient attributes).

Another option is to update the request in the ObservationFilterFunction https://github.com/spring-projects/spring-framework/blob/7b14606d92bfe36fea048137e794cfff18e9e42a/spring-webflux/src/main/java/org/springframework/web/reactive/function/client/DefaultWebClient.java#L746C43-L746C43

@Override
public Mono<ClientResponse> filter(ClientRequest request, ExchangeFunction next) {
    return next.exchange(request)
            .doOnNext(response -> {
                observationContext.setRequest(request);
                observationContext.setResponse(response);
            });
}

Comment From: bclozel

Getting access to the current observation context from the filter is a good idea and definitely make sense, but adding metrics directly to the context will create unnecessary coupling. Currently context is just request/response holder and has no knowledge on the underlining metrics. In addition, in our case, we adding an additional tag to the built-in metric http.client.requests.

ClientRequestObservationContext extends Observation.Context, so you are in fact coupled to low and high cardinality KeyValues. The Observation context is never coupled to the metrics themselves as those are extracted from the context by Observation handlers.

Basically we just need to pass some addition context from the filter to the ObservationConvention. Something like this should work

There is no attributes method on the observation itself, this is really the goal of the Observation.Context. If this is an acceptable solution to you, why not use the ClientRequest#attributes in the first place? The request is itself immutable, but the attributes are not.

Another option is to update the request in the ObservationFilterFunction

I think we should eliminate this solution completely. We've seen how this can lead to inconsistent observations, leading to metrics being very different from traces. Implementations can do that through the context if they chose so, but I don't think we should make this easier and lead developers to this situation.

At this stage, I'm considering putting the ClientRequestObservationContext as a request attribute and/or putting the current observation in the reactor context. Those are acceptable changes but I'm not sure they're actually required for your use case since a request attribute might be enough.

Comment From: amrynsky

There is no attributes method on the observation itself, this is really the goal of the Observation.Context.

I know that attributes is not supported on the observation context. It was just a proposal following suggestion to get current context from the filter

If this is an acceptable solution to you, why not use the ClientRequest#attributes in the first place? The request is itself immutable, but the attributes are not.

This is exactly the issue. ClientRequest represents a typed, immutable, client-side HTTP request and, as far as I see, we can't add any additional state to it without cloning the original request ClientRequest newRequest = ClientRequest.from(request) in filter. Just tried and underlining map is UnmodifiableMap

Suppressed: java.lang.UnsupportedOperationException
        at java.base/java.util.Collections$UnmodifiableMap.put(Collections.java:1505)
        at com.example.demo.WebClientObservationTest$ClientRequestExchangeFilter.filter(WebClientObservationTest.java:72)

Comment From: bclozel

Right, I forgot that the attributes map was also made immutable. We'll add the client Observation.Context as a request attribute then, this is the most sensible and straightforward solution.

Comment From: amrynsky

@bclozel thx for the update but still not sure how to pass additional state to ClientRequestObservationContext. do you plan to add custom observation KeyValues in a separate PR?

Comment From: bclozel

No this change should be enough - you can add key values like this: https://github.com/spring-projects/spring-framework/issues/31609#issuecomment-1814111143

Comment From: amrynsky

let me check what is the lifecycle of the lowCardinalityKeyValue. The original question was how to add additional tags to the built-in http.client.requests metric and not how to create a new one

Comment From: bclozel

I have described the lifecycle in https://github.com/spring-projects/spring-framework/issues/31609#issuecomment-1814111143

Adding low cardinality KeyValues to the observation will add tags to the metric.

Comment From: bclozel

Unfortunately we had to remove this feature in #32198 because it was linked with reported memory leaks.