Since: Spring 5.2 and the deprecation of the APPLICATION_JSON_UTF8 MediaType.
Although the content type 'application/json' can only be encoded in UTF-8 (which is why the explicit encoding was deprecated), MockHttpServletResponse#getCharacterEncoding() returns 'ISO-8859-1'.
This breaks any test code that inspects or uses getCharacterEncoding() directly. More unexpectedly, it also breaks MockHttpServletResponse#getContentAsString(), because the response is decoded with ISO-8859-1 instead of UTF-8. Explicitly calling MockHttpServletResponse#getContentAsString(Charset fallbackCharset) works correctly.
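To illustrate the decoding mismatch described above, here is a minimal sketch that uses only the JDK and no Spring types. The class name CharsetMismatchDemo and the sample JSON are made up for illustration; the sketch only shows what happens when UTF-8 bytes are read back as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Spring.
final class CharsetMismatchDemo {

    // Decode raw response bytes with the given charset.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // A JSON body containing one non-ASCII character (U+00EB), written as UTF-8.
        byte[] body = "{\"name\":\"Zo\u00eb\"}".getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset that was used for writing round-trips cleanly.
        System.out.println(decode(body, StandardCharsets.UTF_8));

        // U+00EB is encoded as the two bytes C3 AB; ISO-8859-1 reads them as
        // two separate characters (U+00C3 and U+00AB), which garbles the text.
        System.out.println(decode(body, StandardCharsets.ISO_8859_1));
    }
}
```

ASCII-only bodies decode identically under both charsets, which is presumably why the problem only shows up once a response contains non-ASCII characters.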
The Javadoc for getContentAsString() is (to me) somewhat ambiguous:
Get the content of the response body as a String, using the charset specified for the response by the application, either through HttpServletResponse methods or through a charset parameter on the Content-Type.
"using the charset specified for the response by the application" could imply UTF-8 for 'application/json', because that must always be the value for JSON. If using this interpretation, then the method breaks its Javadoc contract.
Previously, this was fixed for ContentResultMatchers#json(String) but not for MockHttpServletResponse#getContentAsString() (https://github.com/spring-projects/spring-framework/issues/23622).
Another report was closed by the submitter without any resolution except using an explicit charset (https://github.com/spring-projects/spring-framework/issues/23851).
If it is not possible to fix this for JSON specifically, I would appreciate if somehow a default value could be set (instead of always defaulting to ISO-8859-1).
Comment From: sbrannen
I would appreciate if somehow a default value could be set (instead of always defaulting to ISO-8859-1).
Are you aware of the org.springframework.mock.web.MockHttpServletResponse.setCharacterEncoding(String) method?
Comment From: jhyot
@sbrannen Yes I saw the method, but I didn't think of a way to do it by default. I guess I could write a custom org.springframework.test.web.servlet.ResultHandler and set it there. Thanks.
The rest of the ticket is still valid with the somewhat unexpected behaviour.
Comment From: sbrannen
@sbrannen Yes I saw the method, but I didn't think of a way to do it by default. I guess I could write a custom org.springframework.test.web.servlet.ResultHandler and set it there. Thanks.
You could write a custom ResultHandler to set the character encoding for the response, but you would probably be better off setting it as early as possible -- for example, via a custom Filter registered via .addFilter() in the MockMvc builder.
Note, however, that setting the character encoding via setCharacterEncoding() or setContentType() results in ;charset=... being appended to the Content-Type header, and that may be an unacceptable side effect for your use case.
In light of that, I am going to repurpose this issue to introduce support for changing the default character encoding used in MockHttpServletResponse.
The rest of the ticket is still valid with the somewhat unexpected behaviour.
Although I can see how you might have expected getContentAsString() to take into account the Content-Type header, I don't think we should alter the behavior to provide special treatment for application/json to infer UTF-8 as the character encoding since getContentAsString(Charset) was introduced to support use cases where the character encoding used to write the response body is not reflected in the value returned from getCharacterEncoding().
@rstoyanchev, do you have any further input here?
Comment From: sbrannen
You could write a custom ResultHandler to set the character encoding for the response, but you would probably be better off setting it as early as possible -- for example, via a custom Filter registered via .addFilter() in the MockMvc builder.
Note, however, that setting the character encoding via setCharacterEncoding() or setContentType() results in ;charset=... being appended to the Content-Type header, and that may be an unacceptable side effect for your use case.
In light of that, I am going to repurpose this issue to introduce support for changing the default character encoding used in MockHttpServletResponse.
This has been addressed via a new setDefaultCharacterEncoding(String characterEncoding) method in e4b9b1fadb35c4e439acc8931910aaba6e1342e3.
@jhyot, feel free to try this out in an upcoming 5.3.10 snapshot.
Comment From: jhyot
Thank you for adding this.
Comment From: sbrannen
See #27230 for support for this new feature in MockMvc.
Comment From: zhemaituk
@sbrannen, sorry for excavating an old issue. I can log a new one if it has a chance to live. I just encountered the same problem and was surprised to see that it occurs in tests only (as browsers and the rest of the Spring Framework default to UTF-8 for application/json as per RFC 4627).
Is there a chance the test framework can follow RFC 4627 out of the box? If it did, test behavior would be more representative of real-world behavior.
The workaround is not complex, but it felt like a violation of the principle of least surprise, and I have seen too many imprecise workarounds:
- some people change controllers/filters and add ;charset=utf-8 back, even though Spring stopped doing that due to RFC 4627
- some people call setDefaultCharacterEncoding("UTF-8"); or getContentAsString(StandardCharsets.UTF_8); unconditionally (potentially hiding encoding issues with other content types).
I ended up adding the following code, but it felt quite wordy:
if (MediaType.APPLICATION_JSON_VALUE.equals(response.getContentType())) {
    // Workaround of https://github.com/spring-projects/spring-framework/issues/27214
    // The default encoding for application/json is UTF-8 per RFC 4627, and handled as UTF-8 by web browsers and spring framework (except MockMvc).
    response.setDefaultCharacterEncoding("UTF-8");
}
return response.getContentAsString();
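For anyone who wants to keep this decision in one place, the rule behind the workaround above can be sketched as a small helper, separate from the Spring types. This is only a sketch under assumptions: the class ResponseCharsets and the method resolve are hypothetical names, not Spring API, and the Content-Type parsing is deliberately simplistic. An explicit charset parameter wins, bare application/json falls back to UTF-8, and every other content type uses the caller's fallback.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of Spring.
final class ResponseCharsets {

    /**
     * Pick the charset for decoding a response body from its Content-Type value.
     * Charset.forName throws an unchecked exception for unknown or malformed names.
     */
    static Charset resolve(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        // Limit -1 keeps empty segments, so parts always has at least one element.
        String[] parts = contentType.split(";", -1);

        // An explicit charset parameter always wins.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim().replace("\"", "");
                return Charset.forName(name);
            }
        }

        // No charset parameter: treat JSON as UTF-8, as described in the comment above.
        String mimeType = parts[0].trim().toLowerCase(Locale.ROOT);
        if (mimeType.equals("application/json")) {
            return StandardCharsets.UTF_8;
        }
        return fallback;
    }
}
```

In a test, the result could be passed to getContentAsString(Charset), for example response.getContentAsString(ResponseCharsets.resolve(response.getContentType(), StandardCharsets.ISO_8859_1)). That avoids forcing UTF-8 on other content types, which is the concern raised about the unconditional workarounds.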