Affects: org.springframework:spring-web:5.1.15.RELEASE
XML web services seem to be limited to Unicode encodings. When jackson-dataformat-xml 2.9.8 is used for a web service that produces "text/xml;charset=ISO-8859-1", the response body is still written as UTF-8.
The problem might be located in AbstractJackson2HttpMessageConverter:
...
protected void writeInternal(Object object, @Nullable Type type, HttpOutputMessage outputMessage) throws IOException, HttpMessageNotWritableException {
MediaType contentType = outputMessage.getHeaders().getContentType();
JsonEncoding encoding = getJsonEncoding(contentType);
JsonGenerator generator = this.objectMapper.getFactory().createGenerator(outputMessage.getBody(), encoding);
...
JsonEncoding only provides Unicode encodings. A workaround would be to override the whole method, but it is quite lengthy.
I made a very small reproducer: https://github.com/saimonsez/spring-jackson-xml-encoding-problem
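To illustrate why the mismatch matters, here is a minimal JDK-only sketch of what a client sees when the body is written as UTF-8 but the Content-Type declares ISO-8859-1. The class and method names are hypothetical and no Spring or Jackson code is involved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchSketch {

    // Encodes the text with one charset and decodes the bytes with another,
    // which is what a client does when it trusts a wrong charset parameter.
    static String roundTrip(String text, Charset written, Charset declared) {
        byte[] wire = text.getBytes(written);
        return new String(wire, declared);
    }

    public static void main(String[] args) {
        String body = "<name>Z\u00FCrich</name>"; // contains U+00FC (u with diaeresis)
        String seen = roundTrip(body, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(seen.equals(body));             // false
        System.out.println(seen.length() - body.length()); // 1: the two UTF-8 bytes became two characters
    }
}
```

Any character outside ASCII is affected in the same way; pure ASCII content would hide the problem.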
Comment From: poutsma
The underlying issue seems to be that AbstractJackson2HttpMessageConverter and its subclasses do not check the media type's charset when asked whether they can read or write a given media type. As a result, the converter reports that it can write (for instance) "application/json;charset=ISO-8859-1", but in practice writes the default charset (UTF-8).
As you alluded to, Jackson does not support non-Unicode encodings and we cannot change that, but we can fix the issue above. That way, you can use a different XML converter (such as the Jaxb2RootElementHttpMessageConverter, which does support non-Unicode charsets).
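The kind of check described above could look roughly like the following JDK-only sketch. It is not Spring's actual code: the class UnicodeOnlyCharsetCheck, its methods, and the naive parameter parsing are assumptions made for illustration. The set of names mirrors the Unicode encodings that JsonEncoding covers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

final class UnicodeOnlyCharsetCheck {

    // Java charset names for the encodings a Unicode-only writer can produce.
    private static final Set<String> SUPPORTED = new HashSet<>(
            Arrays.asList("UTF-8", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE"));

    // Extracts the charset parameter from a media type string, if present.
    // Naive parsing: assumes no ';' inside quoted parameter values.
    static Optional<String> charsetOf(String mediaType) {
        String[] parts = mediaType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = param.substring(8).replace("\"", "").trim();
                return Optional.of(value.toUpperCase(Locale.ROOT));
            }
        }
        return Optional.empty();
    }

    // Accepts media types without a charset, rejects unsupported charsets.
    static boolean canWrite(String mediaType) {
        return charsetOf(mediaType).map(SUPPORTED::contains).orElse(true);
    }

    public static void main(String[] args) {
        System.out.println(canWrite("application/json"));                    // true
        System.out.println(canWrite("text/xml; charset=utf-8"));             // true
        System.out.println(canWrite("application/json;charset=ISO-8859-1")); // false
    }
}
```

With a check like this in place, a request for ISO-8859-1 falls through to whichever other converter does support that charset.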
Comment From: saimonsez
Regarding Jackson's support for non-Unicode encodings, please also see https://github.com/FasterXML/jackson-dataformat-xml/issues/315.
Comment From: poutsma
Thank you, I will keep an eye on that.
For now, my recommendation would be to use a different XML message converter for non-Unicode charsets, such as the aforementioned Jaxb2RootElementHttpMessageConverter, which does support other encodings. To do so, you would have to override configureMessageConverters in your WebMvcConfigurationSupport configuration class, to make sure that the JAXB message converter comes before the Jackson XML converter.
When the fix for this has been released, it will no longer be necessary to override configureMessageConverters, because the Jackson converters will no longer claim that they support non-Unicode encodings. You would still have to override extendMessageConverters in your configuration class to add the JAXB converter, though, because it is not added by default.
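Why the ordering matters before the fix but not after it can be shown with a simplified model of converter selection, in which the first converter that claims it can write wins. This is a JDK-only sketch, not Spring's selection code: the Converter class, the select method, and the three sample converters are hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

final class ConverterOrderSketch {

    // Hypothetical stand-in for a message converter: a name plus a charset check.
    static final class Converter {
        final String name;
        final Predicate<String> canWrite;

        Converter(String name, Predicate<String> canWrite) {
            this.name = name;
            this.canWrite = canWrite;
        }
    }

    // Modelled on the description above: JAXB handles any charset, Jackson
    // claimed every charset before the fix and only Unicode ones after it.
    static final Converter JAXB = new Converter("jaxb", cs -> true);
    static final Converter JACKSON_BEFORE_FIX = new Converter("jackson", cs -> true);
    static final Converter JACKSON_AFTER_FIX = new Converter("jackson",
            cs -> cs.toUpperCase(Locale.ROOT).startsWith("UTF-"));

    // Returns the name of the first converter that accepts the charset.
    static String select(List<Converter> converters, String charset) {
        for (Converter converter : converters) {
            if (converter.canWrite.test(charset)) {
                return converter.name;
            }
        }
        return "none";
    }

    public static void main(String[] args) {
        // Before the fix, Jackson wins unless JAXB is registered first.
        System.out.println(select(Arrays.asList(JACKSON_BEFORE_FIX, JAXB), "ISO-8859-1")); // jackson
        System.out.println(select(Arrays.asList(JAXB, JACKSON_BEFORE_FIX), "ISO-8859-1")); // jaxb
        // After the fix, Jackson declines, so appending JAXB is enough.
        System.out.println(select(Arrays.asList(JACKSON_AFTER_FIX, JAXB), "ISO-8859-1"));  // jaxb
    }
}
```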
Comment From: poutsma
This issue also occurs in the Jackson codecs (i.e. Jackson2CodecSupport and subclasses), so I will fix it there as well.
Comment From: poutsma
Fixed. Once Jackson's ToXmlGenerator is no longer hardcoded to use Unicode (https://github.com/FasterXML/jackson-dataformat-xml/issues/315), we can consider using an OutputStreamWriter instead of the current OutputStream/JsonEncoding combination when invoking the Jackson mapper.
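The difference an OutputStreamWriter makes can be sketched with the JDK alone: the writer, not the serializer, decides how characters become bytes, so any charset the JVM supports is available. The class WriterEncodingSketch and its write method are hypothetical, and no Jackson calls are shown.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class WriterEncodingSketch {

    // Writes the text through an OutputStreamWriter bound to the given charset
    // and returns the bytes that would end up in the response body.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String body = "<name>Z\u00FCrich</name>"; // 19 characters, one of them non-ASCII
        System.out.println(write(body, StandardCharsets.ISO_8859_1).length); // 19
        System.out.println(write(body, StandardCharsets.UTF_8).length);      // 20
    }
}
```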
Comment From: nikomiranda
Is there a way to go back to the behavior before 5.2.7? I have a client that sends ;charset=ISO-8859-1, but I want to read and write in UTF-8, i.e. somehow force the charset to be ignored.
Comment From: poutsma
The only way I can think of is to write an MVC interceptor or HTTP servlet filter that changes the request character set for the URL and HTTP method the client uses.
Comment From: nikomiranda
I am not sure how to modify a header of the HttpServletRequest from an MVC interceptor or a Filter without wrapping the request. Is that possible? The private method readJavaType reads the charset from the Content-Type header of the HTTP request.
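HttpServletRequest exposes no setter for headers, so a wrapper (for instance a subclass of HttpServletRequestWrapper created in a Filter) is the usual route. The part such a wrapper needs for this case is a rewritten Content-Type value, which can be sketched with the JDK alone. The class ContentTypeCharsetOverride and its withCharset method are hypothetical, and the servlet wiring is left out.

```java
final class ContentTypeCharsetOverride {

    // Returns the content type with any existing charset parameter replaced.
    // Naive parsing: assumes no ';' inside quoted parameter values.
    static String withCharset(String contentType, String charset) {
        String[] parts = contentType.split(";");
        StringBuilder result = new StringBuilder(parts[0].trim());
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            // Drop empty segments and the charset sent by the client.
            if (param.isEmpty() || param.regionMatches(true, 0, "charset=", 0, 8)) {
                continue;
            }
            result.append(';').append(param);
        }
        return result.append(";charset=").append(charset).toString();
    }

    public static void main(String[] args) {
        System.out.println(withCharset("text/xml;charset=ISO-8859-1", "UTF-8")); // text/xml;charset=UTF-8
        System.out.println(withCharset("application/json", "UTF-8"));            // application/json;charset=UTF-8
    }
}
```

A wrapper would return this value wherever the Content-Type is exposed, so that code reading the header sees UTF-8 instead of the charset the client sent. Note that this only relabels the body: if the client really sends ISO-8859-1 bytes, non-ASCII characters will be decoded incorrectly.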