Spring Boot 2.7.3

1) Spring MVC + Mustache = UTF-8 not working
2) Add CharacterEncodingFilter = UTF-8 works
3) Add Spring Security + Mustache config = displaying UTF-8 works, submitting it does not
4) Change CharacterEncodingFilter to OrderedCharacterEncodingFilter = UTF-8 works

Repository demonstrating the issue + commit history.

Old similar problem - https://github.com/spring-projects/spring-boot/issues/3912

Comment From: wilkinsona

Thanks for the sample. Unfortunately, I've been unable to reproduce the problem using the first commit in the repository. If I disable the auto-configuration of the OrderedCharacterEncodingFilter by setting server.servlet.encoding.enabled=false, UTF-8 submitted by the form is corrupted while UTF-8 served directly from the mustache template remains correct.

I am on macOS which uses UTF-8 by default, but configuring the JVM to use US-ASCII (verified using Charset.defaultCharset()) makes no difference to the behaviour. What do I need to do so that the provided sample will reproduce the problem?
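For anyone who wants to repeat that check, here is a minimal sketch that prints the charset the JVM actually picked up. The class name DefaultCharsetCheck is only illustrative and is not part of the sample repository.

```java
import java.nio.charset.Charset;

final class DefaultCharsetCheck {

    // Reports the charset the JVM uses by default, next to the raw system property.
    static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Started with -Dfile.encoding=US-ASCII, this should report US-ASCII on the JDKs typically used with Spring Boot 2.7. On JDK 18 and later, UTF-8 is the default and other values of file.encoding may be handled differently, so treat the output there with some caution.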

Comment From: spring-projects-issues

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

Comment From: wilkinsona

@lksh5737 As I described above, that doesn't match the behavior that I have observed. If you or @Dmitrii-Iakovenko can provide a sample that shows the problem, we can take another look.

Comment From: wilkinsona

@lksh5737 Thanks, but that does not reproduce the problem for me either. It works fine for me on macOS, even when I force the JVM to use US-ASCII as its default encoding. As I asked above, what do I need to do so that the provided sample will reproduce the problem?

Comment From: wilkinsona

That doesn't seem to make a difference:

[Screenshot: Korean text rendered correctly on Windows 10]

Comment From: wilkinsona

Perhaps you can debug your application with Spring Boot 2.6 where it works and Spring Boot 2.7 where it does not to find the difference? Unfortunately, we're unlikely to be able to help any further without knowing how to reproduce the problem.

Comment From: wilkinsona

It's hard to offer specific guidance as we don't know where the problem is. I would step through the code using your IDE's debugger to try to identify the point at which the UTF-8 characters are corrupted. The rendering of the MustacheView that's done in renderMergedTemplateModel(Map<String, Object>, HttpServletRequest, HttpServletResponse) may be a good place to start.

Comment From: spring-projects-issues

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

Comment From: spring-projects-issues

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open the issue.

Comment From: swhyeon98

https://github.com/swhyeon98/mustache_with_springboot_test.git

Can you find any errors in the above sample?

There are two solutions for this issue.

You can add the following properties to the application.properties file:

server.servlet.encoding.charset=UTF-8
server.servlet.encoding.force=true

Alternatively, you could downgrade your Spring Boot version to 2.6.

Both methods have solved the issue.

The following are screenshots of the test runs in different environments:

[Screenshots: Untitled, error1, error2, sol1, sol2]

Comment From: wilkinsona

@swhyeon98 Unfortunately, I cannot reproduce the problem with your sample either. I've tried on macOS using its default encoding (UTF-8) and also configured to use windows-1252 (-Dfile.encoding=cp1252). In both cases the UTF-8 text is returned:

$ curl -i localhost:8080
HTTP/1.1 200 
Content-Type: text/html;charset=UTF-8
Content-Language: en-GB
Content-Length: 239
Date: Tue, 18 Jul 2023 08:55:30 GMT

<!DOCTYPE HTML>
<html>
<head>
    <title>스프링 부트 웹서비스</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
<h1>스프링 부트로 시작하는 웹 서비스</h1>
</body>
</html>

Comment From: swhyeon98

@wilkinsona Thank you for your answer. May I ask a few more questions?

  1. Did you use an IDE in your testing?
  2. Could it be a problem with the IDE?
  3. If you used an IDE, which IDE did you use? (I used IntelliJ.)
  4. If you used an IDE, could a plug-in have affected the results?

Comment From: wilkinsona

I ran it in Eclipse and on the command line. On the command line, I built the jar and then ran it using java -jar. The results were the same in both cases, both with and without file.encoding=cp1252. I've just tried it in IntelliJ, again with and without file.encoding=cp1252, and the problem did not occur there either.

Comment From: swhyeon98

@wilkinsona Thank you very much.

I will try to find what differs between our environments (Java version, etc.) and what causes the problem.

If I find the cause, can I comment again then?

Comment From: wilkinsona

> If I find the cause, can I comment again then?

Yes, of course. Please do. We'd really like to understand what's going on as I'm sure it'll help others.

Comment From: swhyeon98

I could not find any difference between our setups, but I think I found the cause of the problem.

When I checked the Network tab in the browser's developer tools, I saw that the Content-Type header shows ISO-8859-1. [Screenshot: Content-Type]

So I checked the ViewResolver configuration on the Mustache side. [Screenshot: mustache_config]

if (mustache.getServlet().getContentType() != null) {
    resolver.setContentType(mustache.getServlet().getContentType().toString());
}

The problem seems to occur because the above code only passes text/html, without a charset, to the ViewResolver. [Screenshot: mustache_getContentType]

You can check this part in my previous sample.

I still haven't found what differs between our setups, or why the problem goes away when I downgrade Spring Boot.

How do you see the output when you check developer mode - network?

Comment From: wilkinsona

> How do you see the output when you check developer mode - network?

It's the same as it was with curl: text/html;charset=UTF-8

Looking at the headers in your request that's receiving an ISO-8859-1 response led me to try sending a request with an Accept-Language header. By default in Chrome, the header's value is en-GB,en-US;q=0.9,en;q=0.8 for me and curl doesn't send one at all by default. If I send the same header as your browser, I can reproduce the problem:

$ curl -i localhost:8080 -H "Accept-Language: ko-KR,ko;q=0.9"
HTTP/1.1 200 
Content-Type: text/html;charset=ISO-8859-1
Content-Language: ko-KR
Content-Length: 193
Date: Thu, 31 Aug 2023 10:54:47 GMT

<!DOCTYPE HTML>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>??? ?? ????</title>
</head>
<body>
<h1>??? ??? ???? ? ???</h1>
</body>
</html>

I'm not yet sure why it has the effect that it does, but it gives me something to investigate.
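The question marks and the smaller Content-Length are at least consistent with the body having been encoded as ISO-8859-1, which has no Hangul. The two responses differ by 46 bytes (239 vs 193), which is what you would expect if each of the 23 Hangul syllables in the page shrank from 3 UTF-8 bytes to a single '?'. A minimal sketch using only the JDK reproduces the same substitution. The class and method names are illustrative, and this is not the code path Tomcat uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class UnmappableCharsDemo {

    // Encode with the given charset, then decode the bytes with that same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String title = "스프링 부트 웹서비스"; // <title> text from the sample page

        // UTF-8 can represent Hangul, so the text survives unchanged.
        System.out.println(roundTrip(title, StandardCharsets.UTF_8));

        // ISO-8859-1 cannot; String.getBytes substitutes '?' for each unmappable char.
        System.out.println(roundTrip(title, StandardCharsets.ISO_8859_1));

        // Each Hangul syllable takes 3 bytes in UTF-8 but becomes a single '?' byte here.
        System.out.println(title.getBytes(StandardCharsets.UTF_8).length + " bytes vs "
                + title.getBytes(StandardCharsets.ISO_8859_1).length + " bytes");
    }
}
```

The second line printed is ??? ?? ????, matching the title in the response above.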

Comment From: wilkinsona

The problem is due to Tomcat's mapping of locales to charsets.

Without an Accept-Language header, the default locale is used. On my machine this is en_GB which is mapped to UTF-8. With a header of Accept-Language: ko-KR,ko;q=0.9, the locale is ko_KR. Tomcat has no charset mapping for this locale, so the default character encoding is used, and the Servlet spec states that this is ISO-8859-1.
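To illustrate the first half of that, here is a rough JDK-only sketch of how an Accept-Language value turns into a locale, with a fallback to the default locale when the header is missing. It stands in for what the container does when it resolves the request locale. The class and method names are hypothetical, and Tomcat's own parsing is more involved.

```java
import java.util.List;
import java.util.Locale;

final class PreferredLocaleSketch {

    // Returns the highest-weighted locale from an Accept-Language value,
    // or the JVM default locale when the header is absent or empty.
    static Locale preferredLocale(String acceptLanguage) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return Locale.getDefault();
        }
        // parse() returns the ranges sorted by descending weight (q value).
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
        return Locale.forLanguageTag(ranges.get(0).getRange());
    }

    public static void main(String[] args) {
        // Header sent by the browser in the failing case.
        System.out.println(preferredLocale("ko-KR,ko;q=0.9"));
        // No header, as with a plain curl request: falls back to the default locale.
        System.out.println(preferredLocale(null));
    }
}
```

With the header from the failing request the sketch yields ko_KR. Without one it yields whatever the machine's default locale is, which is why the result can differ between machines.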

Tomcat provides a flag that can be used to disable its enforcement of the default character encoding but it doesn't really help. Not with curl at least. With the org.apache.catalina.connector.Response.ENFORCE_ENCODING_IN_GET_WRITER system property set to false, the Content-Type changes to omit the charset, but the UTF-8 characters are still corrupted:

$ curl -i localhost:8080 -H "Accept-Language: ko-KR,ko;q=0.9"
HTTP/1.1 200 
Content-Type: text/html
Content-Language: ko-KR
Content-Length: 193
Date: Thu, 31 Aug 2023 11:24:59 GMT

<!DOCTYPE HTML>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>??? ?? ????</title>
</head>
<body>
<h1>??? ??? ???? ? ???</h1>
</body>
</html>

Instead, the problem can be addressed by mapping the ko locale to UTF-8:

server.servlet.encoding.mapping.ko=UTF-8

This property is equivalent to the <locale-encoding-mapping-list> entry in web.xml.
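The effect of such a mapping can be sketched as a lookup that tries the full locale, then the language alone, and otherwise falls back to the spec default. This is a simplified model under that assumption, not Tomcat's actual implementation, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class LocaleCharsetSketch {

    // Default response encoding required by the Servlet spec when nothing else applies.
    static final String SERVLET_DEFAULT = "ISO-8859-1";

    // Looks up a charset for the locale: full locale first, then language only.
    static String charsetFor(Locale locale, Map<String, String> mappings) {
        String charset = mappings.get(locale.toString());      // e.g. "ko_KR"
        if (charset == null) {
            charset = mappings.get(locale.getLanguage());       // e.g. "ko"
        }
        return charset != null ? charset : SERVLET_DEFAULT;
    }

    public static void main(String[] args) {
        Locale korean = Locale.forLanguageTag("ko-KR");
        Map<String, String> mappings = new HashMap<>();

        // No mapping for ko or ko_KR: the spec default is used.
        System.out.println(charsetFor(korean, mappings));

        // Equivalent in spirit to server.servlet.encoding.mapping.ko=UTF-8.
        mappings.put("ko", "UTF-8");
        System.out.println(charsetFor(korean, mappings));
    }
}
```

Mapping the bare language ko rather than ko_KR means every Korean locale picks up UTF-8, whatever region the browser sends.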

Without this custom mapping, the same problem occurs with Jetty as it defaults to ISO-8859-1 as required by the servlet spec. Interestingly, it does not occur with Undertow which uses UTF-8 regardless.

While server.servlet.encoding.mapping is listed in the configuration properties appendix, it isn't mentioned anywhere in the rest of the reference documentation as far as I can tell. I think we should use this issue to correct that.