This PR has two objectives:
- Extract the embedding logic duplicated across the VectorStore implementations into the DocumentTransformer interface. Isolating the step that attaches embedding data before Documents are inserted into the VectorStore improves maintainability and testability.
- Improve batch-processing performance by executing blocking operations asynchronously. Sequential, synchronous embedding request tasks are offloaded to a separate Scheduler using Reactor, which significantly improves throughput.
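To illustrate the first objective, here is a minimal, self-contained sketch of the transformer pattern described above. The `Document` record, `DocumentTransformer` interface, and `embeddingTransformer` factory are simplified stand-ins for illustration only, not the actual Spring AI types:

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class TransformerSketch {

    // Simplified stand-in for Spring AI's Document type.
    record Document(String text, float[] embedding) {}

    // Simplified stand-in for the DocumentTransformer contract:
    // a function from a list of Documents to a list of Documents.
    interface DocumentTransformer extends Function<List<Document>, List<Document>> {}

    // A transformer that attaches embeddings before documents reach the VectorStore,
    // so no VectorStore implementation needs to duplicate this logic.
    static DocumentTransformer embeddingTransformer(Function<String, float[]> embeddingModel) {
        return docs -> docs.stream()
                .map(d -> new Document(d.text(), embeddingModel.apply(d.text())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Toy "embedding model": a one-dimensional vector holding the text length.
        Function<String, float[]> model = text -> new float[] { text.length() };

        List<Document> out = embeddingTransformer(model)
                .apply(List.of(new Document("hello", null), new Document("spring ai", null)));

        System.out.println(out.get(0).embedding()[0] + " " + out.get(1).embedding()[0]);
        // → 5.0 9.0
    }
}
```

Because the transformer is just a function over a list of Documents, it can be unit-tested in isolation and reused by every VectorStore implementation.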
https://github.com/spring-projects/spring-ai/blob/10e1e13fa204b2f634ee874fcee2360f94f18185/vector-stores/spring-ai-weaviate-store/src/main/java/org/springframework/ai/vectorstore/WeaviateVectorStore.java#L327
In the linked code, the map operation runs synchronously: each task starts only after the previous one has completed.
https://github.com/spring-projects/spring-ai/blob/10e1e13fa204b2f634ee874fcee2360f94f18185/vector-stores/spring-ai-weaviate-store/src/main/java/org/springframework/ai/vectorstore/WeaviateVectorStore.java#L363-L368
https://github.com/spring-projects/spring-ai/blob/10e1e13fa204b2f634ee874fcee2360f94f18185/spring-ai-core/src/main/java/org/springframework/ai/embedding/EmbeddingModel.java#L55-L62
The call method synchronously requests an EmbeddingResponse, so the sequential execution of these blocking methods creates a significant bottleneck.
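The cost of this pattern can be reproduced with a small stdlib-only simulation (the `embed` method is a hypothetical stand-in for a blocking embedding request, not the actual Spring AI API): N sequential calls of roughly t ms each take roughly N × t ms in total.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SequentialBottleneck {

    // Stand-in for a blocking EmbeddingModel call: sleeps to simulate network latency.
    static float[] embed(String document) {
        try {
            Thread.sleep(50); // pretend each embedding request takes ~50 ms
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new float[] { document.length() };
    }

    public static void main(String[] args) {
        List<String> documents = IntStream.range(0, 20)
                .mapToObj(i -> "doc-" + i)
                .collect(Collectors.toList());

        long start = System.nanoTime();
        // Sequential map: each blocking call waits for the previous one to finish.
        List<float[]> embeddings = documents.stream()
                .map(SequentialBottleneck::embed)
                .collect(Collectors.toList());
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // 20 documents x ~50 ms each => roughly one second in total.
        System.out.println("embeddings=" + embeddings.size() + " elapsedMs=" + elapsedMs);
    }
}
```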
For comparison, when embedding and inserting the same 100 Document objects into a vector database, the original code took 106 seconds.
https://github.com/spring-projects/spring-ai/blob/eb58cf4d0a4e0f1f1cd51a10dfa595315513f4fe/spring-ai-core/src/main/java/org/springframework/ai/transformer/DocumentEmbeddingTransformer.java#L49-L59
To reduce this bottleneck, the code internally uses Reactor to execute these blocking methods asynchronously, minimizing the need for invasive code changes.
After moving the tasks onto a separate asynchronous Scheduler, the execution time dropped from 106 seconds to 8.6 seconds, a 92% reduction in processing time.
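The effect of overlapping the blocking calls can be sketched with the stdlib alone. Here a bounded `ExecutorService` plays a role analogous in spirit to Reactor's `Schedulers.boundedElastic()` (the `embed` method is again a hypothetical stand-in for a blocking embedding request):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelEmbedding {

    // Stand-in for a blocking embedding request.
    static float[] embed(String document) {
        try {
            Thread.sleep(50); // simulated network latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new float[] { document.length() };
    }

    public static void main(String[] args) {
        List<String> documents = IntStream.range(0, 20)
                .mapToObj(i -> "doc-" + i)
                .collect(Collectors.toList());

        // Bounded worker pool: blocking calls now overlap instead of queueing
        // behind one another, much like subscribing each call on a Reactor Scheduler.
        ExecutorService pool = Executors.newFixedThreadPool(10);

        long start = System.nanoTime();
        List<CompletableFuture<float[]>> futures = documents.stream()
                .map(doc -> CompletableFuture.supplyAsync(() -> embed(doc), pool))
                .collect(Collectors.toList());
        List<float[]> embeddings = futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        pool.shutdown();

        // 20 blocking calls across 10 threads finish in roughly 2 x 50 ms,
        // instead of the ~1000 ms the sequential version needs.
        System.out.println("embeddings=" + embeddings.size() + " elapsedMs=" + elapsedMs);
    }
}
```

The speedup comes from overlapping I/O wait time, which is why the gain in the PR is largest when the embedding provider's per-request latency dominates.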
This PR aimed to optimize performance with minimal changes to the existing code. However, in the long term, I think that expressing the ETL pipeline as a stream rather than batch processing through a List would be more appropriate.
I have created a related issue (#1219) and would appreciate any insights or thoughts you might have.
It would be great if you could take a look at it when you have time.
Thanks 🧑🏼‍💻
Comment From: markpollack
review in light of https://github.com/spring-projects/spring-ai/commit/087de16cfc4f6e2d646ebaafeadf45140ee75752