This design decision was taken early on from how LangChain (Python) was designed. In practice, it doesn't seem useful. It acts as a sort of cache when you write the same document to the vector store multiple times, which doesn't seem realistic. If you have a use case for keeping the embedding in the Document, please comment.
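To make the cache-like behaviour concrete, here is a minimal sketch in plain Java. `CachedDoc`, `CountingEmbedder`, and `CachingStore` are hypothetical names made up for illustration, not Spring AI types, and the embedder only counts calls and returns a dummy vector.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical document type that carries its own embedding (not the Spring AI Document).
class CachedDoc {
    final String text;
    float[] embedding; // null until a store computes it

    CachedDoc(String text) {
        this.text = text;
    }
}

// Hypothetical embedder that counts how often it is invoked.
class CountingEmbedder {
    int calls = 0;

    float[] embed(String text) {
        calls++;
        return new float[] { text.length() }; // dummy vector standing in for a model call
    }
}

// Hypothetical store that reuses an embedding already present on the document.
class CachingStore {
    private final CountingEmbedder embedder;
    final List<CachedDoc> rows = new ArrayList<>();

    CachingStore(CountingEmbedder embedder) {
        this.embedder = embedder;
    }

    void add(CachedDoc doc) {
        if (doc.embedding == null) {
            doc.embedding = embedder.embed(doc.text); // computed only on the first write
        }
        rows.add(doc);
    }
}
```

In this sketch, adding the same `CachedDoc` instance twice invokes the embedder only once. That saved call is the only benefit the field provides, and it appears only when the same document object is written repeatedly.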
Comment From: ThomasVitale
The SimpleVectorStore relies on the presence of the "embedding" field in Document.
Comment From: iAMSagar44
The AzureVectorStore also relies on the presence of the "embedding" field in Document. The doAdd method has the following line of code in the 1.0.0-M4 version of Spring AI:
searchDocument.put(EMBEDDING_FIELD_NAME, document.getEmbedding());
Comment From: ilayaperumalg
> The SimpleVectorStore relies on the presence of the "embedding" field in Document.
This PR https://github.com/spring-projects/spring-ai/pull/1822 addresses this.
Comment From: markpollack
We need to review all the vector store impls and make appropriate modifications. I noticed the Neo4jVectorStore also uses the 'getEmbedding' method as a sort of simple data structure to pass things along while adding. It can be refactored.
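One way such a refactoring could look, sketched in plain Java: the store computes the embeddings inside `add` and keeps them in a local list aligned with the documents, so nothing is written back to the document. `PlainDoc`, `Embedder`, and `SketchVectorStore` are hypothetical names for illustration, not the Spring AI or Neo4j classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical immutable document without an embedding field.
final class PlainDoc {
    final String id;
    final String text;

    PlainDoc(String id, String text) {
        this.id = id;
        this.text = text;
    }
}

// Hypothetical embedding function; a real one would call an embedding model.
interface Embedder {
    float[] embed(String text);
}

// Hypothetical store that owns the embeddings instead of reading them from the document.
class SketchVectorStore {
    private final Embedder embedder;
    private final Map<String, float[]> vectorsById = new LinkedHashMap<>();

    SketchVectorStore(Embedder embedder) {
        this.embedder = embedder;
    }

    void add(List<PlainDoc> docs) {
        // The embeddings live in a local list, index-aligned with docs.
        List<float[]> embeddings = new ArrayList<>();
        for (PlainDoc doc : docs) {
            embeddings.add(embedder.embed(doc.text));
        }
        // Persist each vector under its document id; the document itself is untouched.
        for (int i = 0; i < docs.size(); i++) {
            vectorsById.put(docs.get(i).id, embeddings.get(i));
        }
    }

    float[] vectorFor(String id) {
        return vectorsById.get(id);
    }

    int size() {
        return vectorsById.size();
    }
}
```

The local list plays the role the document field played as a pass-along structure, but its lifetime ends with the `add` call.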
Comment From: ThomasVitale
If possible, it would be great to get https://github.com/spring-projects/spring-ai/pull/1794 merged before we start removing the "embedding" field, mostly to avoid a lot of merge conflicts, since they touch the same code area 😅
Comment From: ilayaperumalg
> We need to review all the vector store impls and make appropriate modifications. I noticed the Neo4jVectorStore also uses the 'getEmbedding' method as a sort of simple data structure to pass things along while adding. It can be refactored.
Created https://github.com/spring-projects/spring-ai/issues/1826 to track this.