Affects: Spring 6.0.3
Emojis are converted to Unicode escape sequences
When returning objects from a @RestController, all other UTF-8 characters in the object's fields are written as-is, but emojis are escaped: "👾" becomes "\uD83D\uDC7E".
Minimal Reproducible Example
- Create Spring Boot project:
curl -G https://start.spring.io/starter.zip -d dependencies=web -d javaVersion=11 -d type=maven-project -o demo.zip
- Create .\src\main\java\com\example\demo\rest\MainRest.java:
package com.example.demo.rest;

import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.bind.annotation.GetMapping;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.HashMap;
import java.util.Map;

@RestController
public class MainRest {

    @GetMapping("/greeting0")
    Map<String,String> greet() {
        var map = new HashMap<String,String>();
        map.put("content","Οὐχὶ ταὐτὰ παρίσταταί 0 👾");
        return map;
    }

    @GetMapping("/greeting1")
    String greet(ObjectMapper objectMapper) throws Exception {
        var map = new HashMap<String,String>();
        map.put("content","Οὐχὶ ταὐτὰ παρίσταταί 1 👾");
        return objectMapper.writeValueAsString(map);
    }
}
- Run application and call these endpoints:
curl "http://localhost:8080/greeting0"
curl "http://localhost:8080/greeting1"
Expected output:
{"content":"Οὐχὶ ταὐτὰ παρίσταταί 0 👾"}
{"content":"Οὐχὶ ταὐτὰ παρίσταταί 1 👾"}
Actual output:
{"content":"Οὐχὶ ταὐτὰ παρίσταταί 0 \uD83D\uDC7E"}
{"content":"Οὐχὶ ταὐτὰ παρίσταταί 1 👾"}
This is Spring Boot, but the same behaviour can be reproduced in plain Spring MVC with this configuration:
<mvc:annotation-driven>
    <mvc:message-converters>
        <bean id="jacksonHttpMessageConverter"
              class="org.springframework.http.converter.json.MappingJackson2HttpMessageConverter">
        </bean>
    </mvc:message-converters>
</mvc:annotation-driven>
and with produces="text/plain;charset=UTF-8" in @GetMapping("/greeting1").
Jackson-databind works fine not only in the example above (when called directly), but also in a standalone application with no other dependencies. So I assume the problem is in Spring MVC, not in Jackson or Boot.
Comment From: simonbasle
You're encountering this issue: https://github.com/FasterXML/jackson-core/issues/223.
The AbstractJackson2HttpMessageConverter, used when the controller method returns a Map, doesn't merely call writeValueAsString. It uses a Jackson JsonGenerator created from an OutputStream (the one for the body of the response), and Jackson encodes the emoji as a "surrogate pair" with the \u escape (a strict interpretation of the spec).
On the other hand, internally, ObjectMapper#writeValueAsString uses a JsonGenerator created from a Writer.
As far as I understand, Java deals with the Unicode there and treats the emoji as a valid Unicode character.
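To make the "surrogate pair" concrete, here is a minimal stdlib-only sketch (no Jackson or Spring involved) that inspects the emoji from the report. The class and method names are made up for illustration. It shows that the single code point U+1F47E occupies two UTF-16 code units in a Java String, which are exactly the two values seen in the escaped output, while its UTF-8 encoding is four bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class SurrogatePairDemo {

    // Renders every UTF-16 code unit of the input as a backslash-u escape with four hex digits.
    static String toUtf16Escapes(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("\\u%04X", (int) c));
        }
        return sb.toString();
    }

    // Renders the UTF-8 encoding of the input as space-separated hex bytes.
    static String toUtf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Built from the code point to avoid depending on the source file encoding.
        String emoji = new String(Character.toChars(0x1F47E));

        System.out.println(emoji.length());                          // number of UTF-16 code units
        System.out.println(emoji.codePointCount(0, emoji.length())); // number of code points
        System.out.println(toUtf16Escapes(emoji));                   // the escaped surrogate pair
        System.out.println(toUtf8Hex(emoji));                        // the raw UTF-8 bytes
    }
}
```

The escaped form printed here matches the "Actual output" of /greeting0, and the four UTF-8 bytes are what a client receives from /greeting1.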
Note that curl seems to have difficulties rendering the escaped surrogate pair, but other tools might properly deal with it. For instance, httpie shows the emoji in both cases.
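The reason a JSON-aware client shows the emoji in both cases is that the escaped pair and the raw character decode to the same string. The following is a rough sketch of that decoding step, not how curl, httpie, or Jackson implement it. The class and method names are hypothetical, and it deliberately handles only backslash-u escapes with four valid hex digits, ignoring other JSON escapes such as an escaped backslash.

```java
// Hypothetical helper class, for illustration only.
public class JsonUnicodeUnescape {

    // Decodes backslash-u escapes of four hex digits into UTF-16 code units.
    // Simplification: assumes well-formed input and leaves all other escapes untouched.
    static String unescapeUnicode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            if (s.startsWith("\\u", i) && i + 6 <= s.length()) {
                int codeUnit = Integer.parseInt(s.substring(i + 2, i + 6), 16);
                out.append((char) codeUnit);
                i += 6;
            } else {
                out.append(s.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The escaped form as it appears in the /greeting0 response body.
        String escaped = "\\uD83D\\uDC7E";
        // The raw emoji, built from its code point U+1F47E.
        String raw = new String(Character.toChars(0x1F47E));

        System.out.println(unescapeUnicode(escaped).equals(raw)); // prints "true"
    }
}
```

A client that parses the response as JSON therefore ends up with the same value from both endpoints; only tools that print the body verbatim show the difference.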
If you know that the JSON you produce contains emojis and that your clients may include ones that don't handle the escaped surrogate pair well, then I'm afraid the only solution is to use the ObjectMapper yourself and call writeValueAsString or writeValueAsBytes. And/or upvote the Jackson issue.