After updating to the most recent version, the similarity search throws an exception because Document is no longer compatible with the data returned by the search hits.
- Document (org.springframework.ai.document.Document) is deserialized directly from the ES response using standard Jackson map-to-object deserialization:
doSimilaritySearch(SearchRequest searchRequest)
...
SearchResponse<Document> res = this.elasticsearchClient.search(
        sr -> sr.index(this.options.getIndexName())
            .knn(knn -> knn.queryVector(EmbeddingUtils.toList(vectors))
                .similarity(finalThreshold)
                .k((long) searchRequest.getTopK())
                .field("embedding")
                .numCandidates((long) (1.5 * searchRequest.getTopK()))
                .filter(fl -> fl.queryString(
                    qs -> qs.query(getElasticsearchQueryString(searchRequest.getFilterExpression()))))),
        Document.class); // <---- hits are deserialized directly into Document
return res.hits().hits().stream().map(this::toDocument).collect(Collectors.toList());
- The previous Document class was compatible with ES responses
"_source":{
"embedding":[…],
"content": "...",
"media":[],
"metadata":{ .... }
}
After #1794, the Document class no longer matches this layout: content has been renamed to text, and media is no longer an array.
- Therefore I get the error
com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot deserialize value of type `org.springframework.ai.model.Media` from Array value (token `JsonToken.START_ARRAY`)
at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 1, column: 8428] (through reference chain: org.springframework.ai.document.Document["media"])
I'm not able to use ElasticSearch queries through the Vector Store.
Comment From: jcgouveia
@ThomasVitale, your change in Document broke the Elasticsearch doSimilaritySearch method, because the Document class was used directly to deserialize responses and is no longer compatible with them (see above). I think a compatible ESDocument should be used for deserialization and then mapped to the new Document in the toDocument method. Why hasn't anyone addressed this issue? Isn't it causing problems for other users of Elasticsearch queries?
Comment From: ThomasVitale
@jcgouveia thanks for reporting this issue. The change of mine you referenced (#1794) introduces a new score field in the Document class, which maintains backward compatibility for the Elasticsearch vector store integration.
The Jackson exception you got seems to be due to a different change in the Document class: the type of the media field was changed from a Collection<Media> to a Media (in #1883). During deserialisation, Jackson fails to turn a media JSON array into something that is not an array/collection.
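To make the intermediate-type idea raised earlier in the thread concrete, here is a minimal sketch in plain Java. It is not the Spring AI implementation: Media, Document, and EsDocument below are hypothetical stand-ins defined only for this example, and the rule of keeping the first media entry (or null when the array is empty) is an assumption about how an array could be reduced to a single value.

```java
import java.util.List;
import java.util.Map;

public class EsDocumentMappingSketch {

    // Hypothetical stand-in for org.springframework.ai.model.Media.
    record Media(String mimeType, String data) {}

    // Hypothetical stand-in for the newer Document shape:
    // "text" instead of "content", a single Media, and a score.
    record Document(String id, String text, Media media,
                    Map<String, Object> metadata, Double score) {}

    // Hypothetical intermediate type mirroring the stored _source layout:
    // a "content" string and a "media" array.
    record EsDocument(String id, String content, List<Media> media,
                      Map<String, Object> metadata) {}

    // Maps the stored layout onto the newer shape.
    static Document toDocument(EsDocument source, Double score) {
        List<Media> media = source.media();
        // Assumption: keep the first entry, or null for a missing/empty array.
        Media first = (media == null || media.isEmpty()) ? null : media.get(0);
        Map<String, Object> metadata =
                source.metadata() == null ? Map.of() : source.metadata();
        return new Document(source.id(), source.content(), first, metadata, score);
    }

    public static void main(String[] args) {
        // A hit shaped like the _source shown in the report: empty media array.
        EsDocument hit = new EsDocument("doc-1", "some stored text",
                List.of(), Map.of("source", "example"));
        Document doc = toDocument(hit, 0.87);
        System.out.println(doc.text() + " / media=" + doc.media()
                + " / score=" + doc.score());
    }
}
```

The point of the design is that the search call would deserialize into the type that matches what is actually stored in the index, so later changes to the public Document class would only require touching the mapping method.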