JEP 400 (UTF-8 by Default) is included in Java 18 and later. It changes the default value of file.encoding to UTF-8 and introduces a few additional properties for identifying the system's native encoding. However, the charset used for console output is handled differently. As the JEP states:
Those APIs include the ones listed above, but not System.out and System.err, whose charset will be as specified by Console.charset().
However, Spring Boot doesn't respect Console.charset(). It always uses org.springframework.boot.logging.LoggingSystemProperties.getDefaultCharset() to set the default charset for both console and file logging. Depending on the implementation, that is either hard-coded UTF-8 or Charset.defaultCharset().
Below is a simple Spring Boot application that reproduces the issue (the system locale needs to be something other than English):
public static void main(String[] args) {
    SpringApplication.run(Application.class, args);
    System.out.println("你好");
    log.info("你好");
}
Run it with Java 17:
你好
2024-11-12T10:16:10.992+08:00 INFO 19176 --- [test] [ main] test.Application : 你好
Run it with Java 21 (my system's native encoding is GBK):
你好
2024-11-12T10:16:28.478+08:00 INFO 524 --- [test] [ main] test.Application : 浣犲ソ
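The garbled log line looks like UTF-8 bytes being rendered by a GBK console. The following is a minimal sketch of that effect outside Spring Boot; the class and method names (MojibakeDemo, misdecode) are made up for illustration, and it assumes the JDK ships the GBK charset, as standard distributions do:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encode the text as UTF-8, then decode those bytes with another charset,
    // mimicking a console that expects a different encoding than the writer used.
    static String misdecode(String text, Charset consoleCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, consoleCharset);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // Should show the same garbled characters as the Java 21 log line above.
        System.out.println(misdecode("你好", gbk));
    }
}
```

System.out is not affected in the reproduction because the JDK encodes it with the console charset, while the logger encodes its output with the charset chosen by Spring Boot.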
The workaround is to add a property like the one below, but I think a better default value should be provided.
logging.charset.console=${stdout.encoding}
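Before relying on the workaround, it may help to check what the JVM actually reports for the relevant properties. This is only a diagnostic sketch: the class name EncodingReport is hypothetical, and some of these properties may be null on JDK versions that do not define them.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingReport {

    // Collect the encoding-related system properties plus the JVM default charset.
    static Map<String, String> collect() {
        Map<String, String> values = new LinkedHashMap<>();
        String[] keys = {"file.encoding", "native.encoding", "stdout.encoding", "stderr.encoding"};
        for (String key : keys) {
            // The value is null when the running JVM does not define the property.
            values.put(key, System.getProperty(key));
        }
        values.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        return values;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

If stdout.encoding is reported as null, the ${stdout.encoding} placeholder has nothing to resolve to, so the workaround presumably only applies on JDKs that set that property.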
Comment From: mhalbritter
The charset has been set to Charset.defaultCharset() in https://github.com/spring-projects/spring-boot/issues/27230.
Having read the JEP, I think we should try to use System.console().charset() for the console loggers.
One thing to note is that System.console() can return null if the process doesn't have a console (or if it's running in the IntelliJ IDEA console?!).
Flagging this for team attention to see what the rest of the team thinks.
Comment From: philwebb
#23827 added the properties to set the charset. It looks like Logback uses the default charset if one is not specified. Log4j2 appears to default to UTF-8 if a charset is not specified. I guess both expect the user to actually configure things if the defaults don't work.
I think I'd be in favor of using System.console().charset() if we can and Charset.defaultCharset() if the Console is null. That would mostly align with Logback defaults. The JDK also appears to use Charset.defaultCharset() as a fallback (at least in JDK 17).
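The fallback described above could look roughly like the sketch below. It is not Spring Boot's actual implementation: the class and method names (ConsoleCharsetSketch, resolveConsoleCharset) are hypothetical, and it assumes Java 17 or later, where Console.charset() is available.

```java
import java.io.Console;
import java.nio.charset.Charset;

public class ConsoleCharsetSketch {

    // Prefer the console's charset; fall back to the JVM default when the
    // process has no console attached (System.console() returns null).
    static Charset resolveConsoleCharset() {
        Console console = System.console();
        return (console != null) ? console.charset() : Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println("Console charset: " + resolveConsoleCharset());
    }
}
```

When output is redirected or the application is launched from an IDE, the null branch is likely to be taken, so the result would then match today's Charset.defaultCharset() behavior.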
I think this could be a breaking change, and since there's a workaround I'm not sure we should do it until Boot 3.5.
Comment From: mhalbritter
Superseded by https://github.com/spring-projects/spring-boot/pull/44353.