Affects: 5.3.3
Library: spring-web
The ReadOnlyHttpHeaders
type wraps an existing (mutable) MultiValueMap
. That map can be updated independently without going through the ReadOnlyHttpHeaders
interface. This makes it inappropriate to cache the Accept
and Content-Type
headers.
For example: I am working on a Spring Cloud Gateway project. Because the multipart/form-data parser does not correctly handle quoted boundary values (a separate issue) I tried to write both a GlobalFilter
and a WebFilter
that would unwrap the boundary value before I attempt to use the built-in decoders for form data. This doesn't work, though, because the original quoted value is cached in a ReadOnlyHttpHeaders
instance, even though its backing MultiValueMap
was updated by my filter.
See an example snippet below:
@Override
public Mono<Void> filter(ServerWebExchange exchange, WebFilterChain chain) {
var headers = exchange.getRequest().getHeaders();
if(headers.getContentType().isCompatibleWith(MediaType.MULTIPART_FORM_DATA)) {
LOG.trace("Examining multipart/form-data request");
// Check the boundary value and rewrite it if necessary
return chain.filter(exchange.mutate().request(req -> req.headers(original -> {
var contentTypeValue = original.getFirst("Content-Type");
LOG.trace("Original Content-Type header:" + contentTypeValue);
var matcher = BOUNDARY.matcher(contentTypeValue);
var found = matcher.find();
if(!found) {
throw new IllegalStateException("A Content-Type header must specify a boundary for a multipart/form-data request (" + contentTypeValue + ")");
}
var boundary = matcher.group("boundary");
if(boundary.startsWith("\"") && boundary.endsWith("\"")) {
//original.setContentType(new MediaType("multipart/form-data; boundary=" + boundary.substring(1, boundary.length() - 1)));
original.set("Content-Type", "multipart/form-data; boundary=" + boundary.substring(1, boundary.length() - 1));
}
LOG.trace("Modified Content-Type header:" + original.getContentType());
})).build());
}
return chain.filter(exchange);
}
The headers passed to the builder method are mutable and are intended to allow for this use case -- great! But then when I get to the actual boundary detection code in the multipart codec shipped with Spring Cloud Gateway, it does the following:
@Nullable
private static byte[] boundary(HttpMessage message) {
MediaType contentType = message.getHeaders().getContentType();
if (contentType != null) {
String boundary = contentType.getParameter("boundary");
if (boundary != null) {
return boundary.getBytes(StandardCharsets.ISO_8859_1);
}
}
return null;
}
That HttpMessage#getHeaders()
returns a ReadOnlyHttpHeaders
instance, which someone else has already called getContentType()
on before my filter ran. So I'm stuck with the 'broken' value of the boundary and my code dies. I have no other way to rewrite that request short of running a whole separate gateway instance in front of my actual gateway to rewrite that header and forward it.
It's not obvious why those two headers require caching. There are no comments describing reasons why they are particularly expensive to compute, for example.
Comment From: bclozel
See #21783 for background.
Comment From: Thorn1089
@bclozel thanks, that's useful background. Might be good to put a reference in the code to that discussion. :)
So is there a way to have our cake and eat it too? Or should it be a documentation fix that one should never attempt to change the Content-Type/Accepts headers with a WebFilter
or GlobalFilter
? (Actually, it's not even that straightforward; if I changed the header but then just forwarded the request through Cloud Gateway I imagine my modified value would be propagated -- the issue is when I first try to inspect the multipart content before forwarding...) This took a considerable amount of time to track down since the stack trace does not point to where the header is inspected initially at all.
Comment From: poutsma
Besides the specific reasons mentioned in #21783, we favor immutable data structures in WebFlux in general, because mutable variants introduce a whole new range of potential concurrency issues.
Comment From: rstoyanchev
You're currently changing the exchange and trying to modify the (immutable) request. What you need is to change the exchange, and change the request:
ServerHttpRequest updatedRequest = request.mutate().headers(headers-> ...).build();
exchange.mutate().request(updatedRequest).build();
Comment From: Thorn1089
@rstoyanchev That is incorrect. I am using .request(Builder)
which, as the Javadoc describes, is identical to what you suggest. Please reopen this issue.
Per the Javadoc:
Configure a consumer to modify the current request using a builder. Effectively this: exchange.mutate().request(builder-> builder.method(HttpMethod.PUT)); // vs... ServerHttpRequest request = exchange.getRequest().mutate() .method(HttpMethod.PUT) .build(); exchange.mutate().request(request);
Comment From: rstoyanchev
True, .request(Builder)
does work just as well. So it is possible to mutate request values and passing the modified exchange/request via chain#filter
means downstream handling should sees the mutated values.
But then when I get to the actual boundary detection code in the multipart codec shipped with Spring Cloud Gateway, it does the following
So is Spring Cloud Gateway using the original exchange somehow instead of the mutated one?
Comment From: bclozel
I think I'm getting this now.
So this is not about changing the behavior of ReadOnlyHttpHeaders
or misusing the mutation API for the exchange, but rather a bug with request mutation.
The following test case fails with the second assertion.
MockServerHttpRequest request = MockServerHttpRequest.post("https://example.com")
.contentType(MediaType.APPLICATION_JSON)
.header("some", "original")
.build();
MockServerWebExchange exchange = MockServerWebExchange.from(request);
ServerWebExchange mutated = exchange.mutate()
.request(req -> req.headers(headers -> headers.setContentType(MediaType.APPLICATION_CBOR)).build())
.request(req -> req.headers(headers -> headers.set("some", "updated")).build())
.build();
assertThat(mutated.getRequest().getHeaders().getFirst("some")).isEqualTo("updated"); // OK
assertThat(mutated.getRequest().getHeaders().getContentType()).isEqualTo(MediaType.APPLICATION_CBOR); // FAILS
Here's what's happening:
- the original exchange is created with the request; the
AbstractServerHttpRequest
turns the headers intoReadOnlyHttpHeaders
right away - during the
DefaultServerWebExchange
instantiation, we're initializing as well the form data in the request which calls thegetContentType
method on the read-only headers and caches the value - later, when building the
MutatedServerHttpRequest
we're getting headers from the original request withoriginalRequest.getHeaders()
(this returns the read-only version). HttpHeaders.readOnlyHttpHeaders(headers)
returns the input parameter if it's already a readonly type.
This means that we're copying the right header values in the mutated request but we're keeping the original read-only value.
Changing HttpHeaders
to always return return new ReadOnlyHttpHeaders(headers.headers);
fixes this particular problem, but I'm wondering if we're not just fixing the symptom rather than the bigger problem.
@rstoyanchev do you think that we've missed something with our mutability story here in the implementation hierarchy?
Comment From: Thorn1089
I think the problem is the "ownership" of the original HttpHeaders
that the read-only variant wraps. If those are effectively discarded after creating the readonly version there is probably no issue. But in this case, the mutable headers are still used, and affected by the changes in the filter chain. And for every other header besides Content-Type
and Accept
this works fine.
The problem as I see it is that ReadOnlyHttpHeaders
is doing two things -- preventing mutation (by overriding setter methods and throwing UnsupportedOperationException
) and caching expensive values (for content-type negotiation purposes it seems). The latter is not at all what I'd expect this class to do and I agree that creating a "fresh" ReadOnlyHttpHeaders
every time is just hiding the problem.
As an analogy, this would be like if Collections#unmodifiableCollection()
from the JDK not only overrode the mutation methods, but also cached the results of the underlying collection's iterator. If the mutable collection was changed, re-iterating the readonly collection should still show those changes and not cache the prior state.
Comment From: bclozel
@TomRK1089 I don't think we can frame this like any collections wrapping problem.
Here, ReadOnlyHttpHeaders
is preventing developers from modifying the request headers because the request is considered as immutable during certain phases of the exchange lifecycle (for example, when the request is mapped to a handler). If we didn't then mapping or content negotiation could be invalidated because the request has changed in the meantime - a handler could then get a request that it doesn't know how to process...
So ReadOnlyHttpHeaders
can indeed cache values for performance reasons, because we only allow mutation of the request during certain phases. I think that once you're given a read-only view of the request headers, you should be confident that it's immutable.
I've found and fixed the actual issue: instead of creating a new request using the writable view of the HTTP headers we were using during request mutation, we passed the original (so read-only) headers. Changing that solves the problem and it looks like it's aligning better with the intended design.
This also fixes a similar problem on the "Accept" headers which are also cached by the read-only version.