Bug description
When adding a document to ChromaVectorStore, a NullPointerException is thrown even when the add succeeds. The document ends up in Chroma, but the application is left to handle the NPE.
When a document is successfully added, the body of the response is simply null (I've confirmed this by using curl to POST documents to the upsert endpoint). The code at https://github.com/habuma/spring-ai/blob/main/vector-stores/spring-ai-chroma/src/main/java/org/springframework/ai/chroma/ChromaApi.java#L309-L315 tries to bind that body to a Boolean and so returns null to the caller, which then tries to use it as a Boolean value, and you get a NullPointerException. More completely, here are the first few lines of the stack trace I got:
java.lang.NullPointerException: Cannot invoke "java.lang.Boolean.booleanValue()" because "success" is null
    at org.springframework.ai.vectorsore.ChromaVectorStore.add(ChromaVectorStore.java:106) ~[spring-ai-chroma-store-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at com.example.aichat.UploadController.upload(UploadController.java:49) ~[classes/:na]
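The mechanism behind that trace is auto-unboxing of a null Boolean. Here is a minimal sketch of it in isolation. The class and its upsert() and add() methods are hypothetical stand-ins for illustration, not the actual Spring AI source:

```java
// Hypothetical stand-in for the failing call path; names are illustrative only.
class UnboxingNpeDemo {

    // Mimics an API method whose response body of `null` was bound to a Boolean.
    static Boolean upsert() {
        return null;
    }

    // Mimics a caller that treats the result as a primitive boolean.
    static void add() {
        Boolean success = upsert();
        // Auto-unboxing calls success.booleanValue(), which throws on null.
        if (!success) {
            throw new RuntimeException("Upsert failed");
        }
    }

    public static void main(String[] args) {
        try {
            add();
        } catch (NullPointerException e) {
            System.out.println("Caught: " + e.getClass().getName());
        }
    }
}
```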
I also suspect that it will throw a different exception (possibly ClassCastException) if the addition fails, although I've not recreated that case. If the document add fails, you get a JSON error structure back...something like: {"error":"IndexError('list index out of range')"}. I suspect that will fail because it won't be able to parse that JSON into a Boolean.
Environment
Spring AI version 0.8.0-SNAPSHOT
Steps to reproduce
Simply call the add() method on ChromaVectorStore.
Expected behavior
If the document add is successful, the method should not throw any exception. If it is unsuccessful, it should throw a RuntimeException, as described in the code.
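One way to get that behavior is a null-tolerant check, sketched below. It assumes a null result means success, as observed with the null response body, and that only an explicit false signals failure. The class and method names are hypothetical, and this is not necessarily how Spring AI resolves it:

```java
// Hypothetical helper; treating null as success is an assumption based on this report.
class UpsertResultCheck {

    // Throws only when the server explicitly reported failure.
    static void requireSuccess(Boolean success) {
        // Boolean.FALSE.equals(...) is null-safe, so no unboxing of null occurs.
        if (Boolean.FALSE.equals(success)) {
            throw new RuntimeException("Upsert failed");
        }
    }

    public static void main(String[] args) {
        requireSuccess(null);         // accepted: null body on success
        requireSuccess(Boolean.TRUE); // accepted: explicit true
        System.out.println("null and true were both accepted");
    }
}
```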
Comment From: tzolov
@habuma, by specification the upsert endpoint is expected to return a boolean: https://docs.trychroma.com/js_reference/Collection#returns-8
Furthermore the existing IT is successfully verifying this behaviour: https://github.com/spring-projects/spring-ai/blob/2f891243501d77bdd620d27419c2f0e07dfa0db5/vector-stores/spring-ai-chroma/src/test/java/org/springframework/ai/chroma/ChromaApiIT.java#L88-L90
So I'm puzzled by the statement "(I've confirmed this by using curl to POST documents to the upsert endpoint)".
Could it be that we are running different Chroma versions (the ITs are based on chroma:0.4.15)?
Comment From: habuma
I'm running 0.4.22. (Started by Spring Boot's Docker Compose support via a docker-compose.yaml).
I just now used curl to POST a few broken JSON files to the upsert endpoint and got responses like this one:
{"error":"IndexError('list index out of range')"}
(In this case, I had two IDs, but only 1 document and 1 set of embeddings in the body.)
If there's a JSON error (e.g., a missing comma) I get a response like this:
{"detail":[{"type":"json_invalid","loc":["body",21],"msg":"JSON decode error","input":{},"ctx":{"error":"Expecting ',' delimiter"}}]}
But, if the body that I'm posting is correct and if the document is added successfully, I get this:
null
Per the Swagger documentation at http://localhost:8000/docs, the response for a success should be an any. The response for a validation error should be a JSON response similar to what I show above for a JSON error. (The Swagger doc doesn't mention the other JSON response, but I can assure you that I get it.)
Moreover, what even led me to this was much higher level. I have the following code that exhibits what I'm seeing:
@Bean
ApplicationRunner go(VectorStore vectorStore) {
    return args -> {
        try {
            vectorStore.add(List.of(new Document("99999", "Rich Purnell is a steely eyed missile man.", Map.of("test", "doc"))));
        } catch (Exception e) {
            System.err.println("Error adding documents: " + e);
            // ignore errors because of problem with Chroma vector store
        }
    };
}
When run, I see this in the console:
Error adding documents: java.lang.NullPointerException: Cannot invoke "java.lang.Boolean.booleanValue()" because "success" is null
Comment From: habuma
Followup: I changed my docker-compose.yaml to force 0.4.15 and I do get true when a document is successfully upserted. So, something changed between 0.4.15 and 0.4.22.
The comments on https://github.com/chroma-core/chroma/issues/1466 mention this behavior, saying it was introduced in 0.4.16 and suggesting that it is intentional, not a bug.
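A client that has to work across both versions would need to accept either true or null as success and treat anything else, such as the JSON error bodies, as failure. The sketch below classifies the raw response text under that assumption. The class, enum, and method names are hypothetical, and it uses plain string comparison rather than a JSON parser to stay self-contained:

```java
// Hypothetical classifier for the raw upsert response text; not part of Spring AI or Chroma.
class UpsertBodyClassifier {

    enum Outcome { SUCCESS, FAILURE }

    static Outcome classify(String body) {
        // Assumption: a missing body is treated the same as a literal null.
        String text = (body == null) ? "null" : body.trim();
        // "true" is what 0.4.15 returned; "null" is what 0.4.22 returned on success.
        if (text.equals("true") || text.equals("null")) {
            return Outcome.SUCCESS;
        }
        // Anything else (e.g. {"error":...} or {"detail":[...]}) counts as a failure.
        return Outcome.FAILURE;
    }

    public static void main(String[] args) {
        System.out.println(classify("null"));
        System.out.println(classify("{\"error\":\"IndexError('list index out of range')\"}"));
    }
}
```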
Comment From: tzolov
@habuma thanks for clarifying this. Maybe they made some breaking changes in the API. Will investigate further with 0.4.22.
Comment From: tzolov
@habuma I can confirm that Chroma has changed their API, so that the upsert doesn't return a boolean anymore.
Their JS client docs still claim that it does, but the Python counterpart suggests None as the return value.
I also noticed another change in getEmbedding. The where filter doesn't support the complex operators (as it does with the query) but only simple matches.
A fix for the above issues will follow shortly.