This PR has two objectives:
- Extract the embedding logic duplicated across the VectorStore implementations into the DocumentTransformer interface. Isolating the step that attaches embedding data before Documents are inserted into the VectorStore improves maintainability and testability.
- Improve batch-processing performance by executing blocking operations asynchronously. Sequential, synchronous embedding request tasks are offloaded to a separate Scheduler using Reactor, which significantly improves throughput.
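To illustrate the first objective, here is a minimal, self-contained sketch of the transformer pattern described above. The `Document` record, `DocumentTransformer` interface, and `embeddingTransformer` factory are simplified stand-ins for illustration only, not the actual Spring AI types:

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class TransformerSketch {

    // Simplified stand-in for Spring AI's Document type.
    record Document(String text, float[] embedding) {}

    // Simplified stand-in for the DocumentTransformer contract:
    // a function from a list of Documents to a list of Documents.
    interface DocumentTransformer extends Function<List<Document>, List<Document>> {}

    // A transformer that attaches embeddings before documents reach the VectorStore,
    // so no VectorStore implementation needs to duplicate this logic.
    static DocumentTransformer embeddingTransformer(Function<String, float[]> embeddingModel) {
        return docs -> docs.stream()
                .map(d -> new Document(d.text(), embeddingModel.apply(d.text())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Toy "embedding model": a one-dimensional vector holding the text length.
        Function<String, float[]> model = text -> new float[] { text.length() };

        List<Document> out = embeddingTransformer(model)
                .apply(List.of(new Document("hello", null), new Document("spring ai", null)));

        System.out.println(out.get(0).embedding()[0] + " " + out.get(1).embedding()[0]);
        // → 5.0 9.0
    }
}
```

Because the transformer is just a function over a list of Documents, it can be unit-tested in isolation and reused by every VectorStore implementation.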
https://github.com/spring-projects/spring-ai/blob/10e1e13fa204b2f634ee874fcee2360f94f18185/vector-stores/spring-ai-weaviate-store/src/main/java/org/springframework/ai/vectorstore/WeaviateVectorStore.java#L327
In the linked code, the map operation runs synchronously: each task starts only after the previous one has completed.
https://github.com/spring-projects/spring-ai/blob/10e1e13fa204b2f634ee874fcee2360f94f18185/vector-stores/spring-ai-weaviate-store/src/main/java/org/springframework/ai/vectorstore/WeaviateVectorStore.java#L363-L368
https://github.com/spring-projects/spring-ai/blob/10e1e13fa204b2f634ee874fcee2360f94f18185/spring-ai-core/src/main/java/org/springframework/ai/embedding/EmbeddingModel.java#L55-L62
The call method synchronously requests an EmbeddingResponse, so the sequential execution of these blocking methods creates a significant bottleneck.
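The cost of this pattern can be reproduced with a small stdlib-only simulation (the `embed` method is a hypothetical stand-in for a blocking embedding request, not the actual Spring AI API): N sequential calls of roughly t ms each take roughly N × t ms in total.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SequentialBottleneck {

    // Stand-in for a blocking EmbeddingModel call: sleeps to simulate network latency.
    static float[] embed(String document) {
        try {
            Thread.sleep(50); // pretend each embedding request takes ~50 ms
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new float[] { document.length() };
    }

    public static void main(String[] args) {
        List<String> documents = IntStream.range(0, 20)
                .mapToObj(i -> "doc-" + i)
                .collect(Collectors.toList());

        long start = System.nanoTime();
        // Sequential map: each blocking call waits for the previous one to finish.
        List<float[]> embeddings = documents.stream()
                .map(SequentialBottleneck::embed)
                .collect(Collectors.toList());
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // 20 documents x ~50 ms each => roughly one second in total.
        System.out.println("embeddings=" + embeddings.size() + " elapsedMs=" + elapsedMs);
    }
}
```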
For comparison, when embedding and inserting the same 100 Document objects into a vector database, the original code took 106 seconds.
https://github.com/spring-projects/spring-ai/blob/eb58cf4d0a4e0f1f1cd51a10dfa595315513f4fe/spring-ai-core/src/main/java/org/springframework/ai/transformer/DocumentEmbeddingTransformer.java#L49-L59
To reduce this bottleneck, the code internally uses Reactor to execute these blocking methods asynchronously, minimizing the need for invasive code changes.
After moving the tasks onto a separate asynchronous Scheduler, the execution time dropped from 106 seconds to 8.6 seconds, a 92% reduction in processing time.
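The effect of overlapping the blocking calls can be sketched with the stdlib alone. Here a bounded `ExecutorService` plays a role analogous in spirit to Reactor's `Schedulers.boundedElastic()` (the `embed` method is again a hypothetical stand-in for a blocking embedding request):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelEmbedding {

    // Stand-in for a blocking embedding request.
    static float[] embed(String document) {
        try {
            Thread.sleep(50); // simulated network latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new float[] { document.length() };
    }

    public static void main(String[] args) {
        List<String> documents = IntStream.range(0, 20)
                .mapToObj(i -> "doc-" + i)
                .collect(Collectors.toList());

        // Bounded worker pool: blocking calls now overlap instead of queueing
        // behind one another, much like subscribing each call on a Reactor Scheduler.
        ExecutorService pool = Executors.newFixedThreadPool(10);

        long start = System.nanoTime();
        List<CompletableFuture<float[]>> futures = documents.stream()
                .map(doc -> CompletableFuture.supplyAsync(() -> embed(doc), pool))
                .collect(Collectors.toList());
        List<float[]> embeddings = futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        pool.shutdown();

        // 20 blocking calls across 10 threads finish in roughly 2 x 50 ms,
        // instead of the ~1000 ms the sequential version needs.
        System.out.println("embeddings=" + embeddings.size() + " elapsedMs=" + elapsedMs);
    }
}
```

The speedup comes from overlapping I/O wait time, which is why the gain in the PR is largest when the embedding provider's per-request latency dominates.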
This PR aimed to optimize performance with minimal changes to the existing code. However, in the long term, I think that expressing the ETL pipeline as a stream rather than batch processing through a List would be more appropriate.
I have created a related issue (#1219) and would appreciate any insights or thoughts you might have.
It would be great if you could take a look at it when you have time.
Thanks 🧑🏼‍💻
Comment From: markpollack
review in light of https://github.com/spring-projects/spring-ai/commit/087de16cfc4f6e2d646ebaafeadf45140ee75752