This issue will track the first part of the adoption of the Modular RAG architecture in Spring AI.

Context

The work can be split into two categories.

  1. The design and implementation of the building blocks for RAG modules, which will live in the org.springframework.ai.rag package. Each component is well encapsulated and can be used on its own to compose any kind of RAG flow. For example, you can use these components to build your own RAG flows with Spring State Machine, Spring Cloud Function, or Spring Cloud Data Flow.
  2. A RetrievalAugmentationAdvisor, implemented on top of the Advisor API, that provides some out-of-the-box RAG flows using the building blocks from the first category.

For background information, please refer to the following sources:

Design

Modules

In this first part, the focus will be on establishing the following modules and submodules:

  • Pre-Retrieval
      • Query Transformation https://github.com/spring-projects/spring-ai/pull/1703
      • Query Expansion https://github.com/spring-projects/spring-ai/pull/1703
  • Retrieval
      • Document Search https://github.com/spring-projects/spring-ai/pull/1604
      • Document Join https://github.com/spring-projects/spring-ai/pull/1767
  • Post-Retrieval (only interfaces)
      • Document Ranking https://github.com/spring-projects/spring-ai/pull/1767 (a)
      • Document Selection https://github.com/spring-projects/spring-ai/pull/1767 (a)
      • Document Compression https://github.com/spring-projects/spring-ai/pull/1767 (a)
  • Augmentation
      • Content https://github.com/spring-projects/spring-ai/pull/1644
  • Orchestration (experimental)
      • Query Routing https://github.com/spring-projects/spring-ai/pull/1767
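
To make the staging of these modules concrete, here is a minimal, self-contained sketch of how Pre-Retrieval, Retrieval, and Augmentation steps could be chained. It does not use the Spring AI API: the class ModularRagSketch, its method names, the in-memory corpus, and the keyword-based search are all hypothetical stand-ins chosen for illustration, and the actual types in org.springframework.ai.rag may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical sketch only: none of these names come from Spring AI.
final class ModularRagSketch {

    // Pre-Retrieval / Query Transformation: normalize the user query.
    static String transform(String query) {
        return query.trim().toLowerCase(Locale.ROOT);
    }

    // Pre-Retrieval / Query Expansion: derive several query variants.
    // The " overview" suffix is an arbitrary example of an expansion.
    static List<String> expand(String query) {
        return List.of(query, query + " overview");
    }

    // Retrieval / Document Search: naive substring match over an in-memory
    // corpus, standing in for a real vector store lookup.
    static List<String> search(List<String> corpus, String query) {
        String[] terms = query.split("\\s+");
        return corpus.stream()
                .filter(doc -> {
                    String lower = doc.toLowerCase(Locale.ROOT);
                    return Arrays.stream(terms)
                            .anyMatch(t -> !t.isEmpty() && lower.contains(t));
                })
                .collect(Collectors.toList());
    }

    // Retrieval / Document Join: merge result sets, dropping duplicates
    // while keeping first-seen order.
    static List<String> join(List<List<String>> resultSets) {
        return resultSets.stream()
                .flatMap(List::stream)
                .distinct()
                .collect(Collectors.toList());
    }

    // Augmentation: combine the retrieved documents with the user query.
    static String augment(String query, List<String> documents) {
        return "Context:\n" + String.join("\n", documents)
                + "\n\nQuestion: " + query;
    }

    // Wires the stages together in order.
    static String run(List<String> corpus, String userQuery) {
        List<List<String>> results = new ArrayList<>();
        for (String variant : expand(transform(userQuery))) {
            results.add(search(corpus, variant));
        }
        return augment(userQuery, join(results));
    }

    public static void main(String[] args) {
        List<String> corpus = List.of(
                "Spring AI supports advisors.",
                "Bananas are yellow.",
                "An overview of RAG.");
        System.out.println(run(corpus, "Spring AI"));
    }
}
```

Because each stage is a plain function with its own input and output, a Post-Retrieval step (ranking, selection, or compression) could be slotted in between join and augment without touching the other stages, which is the kind of composability the module split is aiming for.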

This issue will be updated as we progress with the design and implementation work.

Advisor

In this first part, the focus will be on:

  • Establishing a RetrievalAugmentationAdvisor.
  • Adopting the available sub-modules to support Naive and Advanced RAG flows.
  • Experimental branching and conditional components.


Comment From: LuizyHub

Hello @ThomasVitale

Thank you for opening this issue and providing detailed context about the Modular RAG architecture.

While exploring the current documentation, I noticed that QuestionAnswerAdvisor is still described as the advisor for RAG flows (e.g., in the Spring AI documentation). However, this issue outlines the creation of a new RetrievalAugmentationAdvisor and its integration with modular components.

Could you elaborate on the reasoning behind moving away from QuestionAnswerAdvisor? Is it primarily about achieving better modularity, or are there limitations in the existing implementation that the new approach aims to address?

Additionally, if I missed something in the issue description regarding the specific differences or advantages of the new advisor, I'd appreciate any clarification.

Thank you for your work on this!

Comment From: ThomasVitale

@LuizyHub thanks for the interest! The Modular RAG work is still in progress. Once it's ready, we will provide detailed documentation and guidance on how to use these new components, including how they differ from the QuestionAnswerAdvisor and the reasoning behind this new solution. An initial part of the work has been included in the M4 release and is described here: https://spring.io/blog/2024/11/20/spring-ai-1-0-0-m4-released#advanced-and-modular-rag Please note that this is still in preview/experimental.

I have some examples here: https://github.com/ThomasVitale/llm-apps-java-spring-ai?tab=readme-ov-file#-retrieval-augmented-generation-rag. If you give them a try, we would be happy to receive any feedback to improve the solution before reaching a final version for the GA release. Thank you!

Comment From: ThomasVitale

The scope of this part 1 has been included in the M4 release. The next steps will be covered in https://github.com/spring-projects/spring-ai/issues/1811, so I'm closing this issue.

Comment From: kevintsai1202

Can the documentation be updated to cover the Modular RAG changes? I couldn't find these changes or any example code in the official documentation.