The new DocumentRetriever should be the interface for all implementations that can retrieve documents, including vector stores. So I think it's reasonable for VectorStore to extend from DocumentRetriever.
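To make the proposal concrete, here is a minimal sketch of the hierarchy. All types below are simplified, hypothetical stand-ins written for illustration (they are not the real Spring AI interfaces), and substring matching stands in for embedding similarity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; not the real Spring AI types.
final class RetrieverHierarchySketch {

    record Document(String content) {}

    /** Anything that can return documents for a query. */
    interface DocumentRetriever {
        List<Document> retrieve(String query);
    }

    /** A vector store is one kind of retriever that can also ingest documents. */
    interface VectorStore extends DocumentRetriever {
        void add(List<Document> documents);

        List<Document> similaritySearch(String query);

        /** Retrieval defaults to similarity search, so every store is a retriever. */
        @Override
        default List<Document> retrieve(String query) {
            return similaritySearch(query);
        }
    }

    /** Toy store: case-insensitive substring matching replaces real embeddings. */
    static final class InMemoryVectorStore implements VectorStore {
        private final List<Document> documents = new ArrayList<>();

        @Override
        public void add(List<Document> docs) {
            documents.addAll(docs);
        }

        @Override
        public List<Document> similaritySearch(String query) {
            String needle = query.toLowerCase();
            return documents.stream()
                    .filter(d -> d.content().toLowerCase().contains(needle))
                    .toList();
        }
    }
}
```

With this shape, code that only needs retrieval can depend on DocumentRetriever and accept a vector store, a web search client, or a hybrid implementation interchangeably.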
Comment From: ThomasVitale
I like the suggestion, thanks for this! It would make it possible to design more advanced RAG workflows with retrieval from hybrid search or web search. I wonder if it would also be beneficial to rethink the DocumentRetriever interface a bit so that it accepts a query as a SearchRequest instead of just a plain String. Thoughts?
Comment From: markpollack
Yes, accepting a SearchRequest would make it possible to choose a strategy for how documents are retrieved, instead of passing in just a plain string and hardcoding the search algorithm to be similarity search. We will address a rewrite/review of the VectorStore interface for M4. See https://github.com/spring-projects/spring-ai/pull/1227/ for related work (I felt that adding single-method shortcuts was not a good idea in that PR, but otherwise it adds additional common built-in search options).
Once that is in the core vector store interface, a ReRanking advisor can take advantage of it.
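As a rough illustration of why a request object helps, the sketch below passes the query together with retrieval parameters, so the retriever no longer hardcodes them. The SearchRequest record here is a hypothetical, reduced stand-in with assumed fields (query, topK, similarityThreshold), and the word-overlap score is only a placeholder for a real similarity measure.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified stand-ins; not the real Spring AI types.
final class SearchRequestSketch {

    record Document(String content) {}

    /** The query plus the knobs that control how retrieval is performed. */
    record SearchRequest(String query, int topK, double similarityThreshold) {}

    interface DocumentRetriever {
        List<Document> retrieve(SearchRequest request);
    }

    static final class KeywordRetriever implements DocumentRetriever {
        private final List<Document> corpus;

        KeywordRetriever(List<Document> corpus) {
            this.corpus = List.copyOf(corpus);
        }

        @Override
        public List<Document> retrieve(SearchRequest request) {
            return corpus.stream()
                    // Drop documents scoring below the requested threshold.
                    .filter(d -> score(request.query(), d) >= request.similarityThreshold())
                    // Best matches first.
                    .sorted(Comparator.comparingDouble(
                            (Document d) -> score(request.query(), d)).reversed())
                    .limit(request.topK())
                    .toList();
        }

        /** Toy score: fraction of query words that occur in the document. */
        static double score(String query, Document doc) {
            String[] words = query.toLowerCase().split("\\s+");
            String text = doc.content().toLowerCase();
            long matched = Arrays.stream(words).filter(text::contains).count();
            return (double) matched / words.length;
        }
    }
}
```

Callers can then tune topK or the threshold per call, and a later re-ranking step could build on the same request object without changing the retriever's signature.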
Comment From: ThomasVitale
It might even help to decouple the DocumentRetriever from the VectorStore object by introducing a VectorStoreDocumentRetriever (which internally would delegate to a VectorStore). That would be similar to the Indexes defined in LlamaIndex and would allow having Retriever objects for specific contexts rather than having to pass metadata filters on every call. I'm working on some experiments to research possible designs that would help build advanced RAG workflows. I'll share them soon.
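A minimal sketch of that delegation idea follows. Everything in it is a hypothetical stand-in for illustration: the VectorStore interface is reduced to one method, and a plain Predicate replaces whatever filter expression type a real store would use. The point is only that the filter is bound once, when the retriever is built.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical, simplified stand-ins; not the real Spring AI types.
final class VectorStoreRetrieverSketch {

    record Document(String content, Map<String, String> metadata) {}

    interface DocumentRetriever {
        List<Document> retrieve(String query);
    }

    /** Reduced store API: callers must supply a metadata filter on every call. */
    interface VectorStore {
        List<Document> similaritySearch(String query, Predicate<Document> filter);
    }

    /** Retriever for one specific context: the filter is fixed at construction time. */
    static final class VectorStoreDocumentRetriever implements DocumentRetriever {
        private final VectorStore vectorStore;
        private final Predicate<Document> filter;

        VectorStoreDocumentRetriever(VectorStore vectorStore, Predicate<Document> filter) {
            this.vectorStore = vectorStore;
            this.filter = filter;
        }

        @Override
        public List<Document> retrieve(String query) {
            // Delegate to the store, always applying the pre-bound filter.
            return vectorStore.similaritySearch(query, filter);
        }
    }
}
```

A RAG workflow can then hold several such retrievers over the same store (for example one per tenant or document type) and call retrieve(query) without repeating the filter each time.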
Comment From: ThomasVitale
We are introducing a new modular RAG architecture in Spring AI. As part of that work, the DocumentRetriever API has been revamped and it's now the main entry point for retrieving similar documents in a RAG workflow. A VectorStoreDocumentRetriever implementation has been introduced in https://github.com/spring-projects/spring-ai/pull/1604, decoupling the retrieval step in RAG from the specific storage type.
Comment From: markpollack
This has been done in commit https://github.com/spring-projects/spring-ai/commit/5d8c032bb706a04876687cb801f98b3f84417e50