Hi,
Bug description I am building an app using the RAG process. I have loaded some private documents and can converse with the AI model (I am using OpenAI) about the data in them, and the responses are satisfactory. The right data is retrieved from the vector database based on my query prompts (user message). However, when I ask a question that is completely outside the context of the documents loaded in the vector database, I still receive a response, in spite of no documents being retrieved from the vector database.
The AI model sometimes answers from its own knowledge and not from the context.
For example, when I ask a question like 'How many planets are there in our Solar System', I still get an answer from the AI model even though 0 documents (from the vector database) are passed to it in the system prompt. Any reason why this is happening? Is there a way to prevent the AI model from answering questions outside the documents' context?
Environment I am using Spring AI version 0.8.1, Java 21, and a PGvector vector store.
Steps to reproduce Load some PDFs to the vector database and follow the RAG technique. Have a clear system prompt. This is my system prompt.
You are a helpful assistant, conversing with a user about the subjects contained in a set of documents.
Use the information from the DOCUMENTS section to provide accurate answers. If unsure or if the answer
isn't found in the DOCUMENTS section, simply state that you don't know the answer.
And do not answer to any question that is not related to the context provided in the DOCUMENTS section.
DOCUMENTS:
{documents}
This is my code for the Vector similarity search.
private Message generateSystemMessage(String message) {
    LOGGER.info("Retrieving documents");
    // Fetch at most 2 documents that meet the 0.75 similarity threshold for the user message.
    List<Document> similarDocuments = vectorStore.similaritySearch(SearchRequest.query(message)
            .withTopK(2).withSimilarityThreshold(0.75));
    LOGGER.info("Found {} similar documents", similarDocuments.size());
    SystemPromptTemplate systemPromptTemplate = new SystemPromptTemplate(this.systemPromptResource);
    // No match: fill the DOCUMENTS section with a placeholder instead of document text.
    if (similarDocuments.isEmpty()) {
        return systemPromptTemplate.createMessage(Map.of("documents", "No information found"));
    }
    // Otherwise join the retrieved document contents into the DOCUMENTS section.
    String documentContent = similarDocuments.stream().map(Document::getContent).collect(Collectors.joining("\n"));
    return systemPromptTemplate.createMessage(Map.of("documents", documentContent));
}
Expected behavior I expect the AI model not to answer any question that is outside the context of the loaded documents. Instead, when I ask a question that has no relation to the loaded documents or to the context provided in the system prompt, I still get an answer. Any general-knowledge question gets answered.
These are the logs
From the logs, it is clear the similarity search from the vector database resulted in 0 documents.
2024-03-13T21:26:17.172+11:00 INFO 25298 --- [io-8080-exec-10] c.example.rag.qa.StreamingChatService : Retrieving documents
2024-03-13T21:26:17.714+11:00 INFO 25298 --- [io-8080-exec-10] c.example.rag.qa.StreamingChatService : Found 0 similar documents
2024-03-13T21:26:17.716+11:00 INFO 25298 --- [io-8080-exec-10] c.example.rag.qa.StreamingChatService : The system prompt is -- You are a helpful assistant, conversing with a user about the subjects contained in a set of documents.
Use the information from the DOCUMENTS section to provide accurate answers. If unsure or if the answer
isn't found in the DOCUMENTS section, simply state that you don't know the answer.
And do not answer to any question that is not related to the context provided in the DOCUMENTS section.
DOCUMENTS:
No information found
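Since the logs show the retrieval step returning 0 documents, one alternative to relying on prompt wording alone would be to skip the model call in that case and return a fixed reply. Below is a minimal sketch in plain Java, under assumptions: `RetrievalGuard`, `answer`, and `NO_ANSWER` are hypothetical names, and the `Function<String, String>` parameter stands in for whatever code actually sends the prompt to the chat model. None of this is Spring AI API.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of Spring AI: skips the model call when retrieval is empty.
final class RetrievalGuard {

    // Assumed fixed reply for out-of-context questions.
    static final String NO_ANSWER = "I don't know the answer.";

    // retrievedDocs: contents returned by the similarity search (may be empty).
    // modelCall: stand-in for the real chat-model invocation; receives the joined context.
    static String answer(List<String> retrievedDocs, Function<String, String> modelCall) {
        if (retrievedDocs == null || retrievedDocs.isEmpty()) {
            // Nothing relevant was retrieved, so the model is never asked.
            return NO_ANSWER;
        }
        return modelCall.apply(String.join("\n", retrievedDocs));
    }
}
```

With this shape, a question such as the planets example would never reach the model when the search returns nothing, so the reply does not depend on how strictly the model follows the system prompt.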
Comment From: iAMSagar44
Created a separate system prompt template for the flow when the search results from the vector store are empty (see an example below), and it worked as expected.
You are a helpful assistant. You do not have answers to the question being asked. Respond back with the following {message}
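The two-template approach described in this comment could look roughly like the sketch below. It is plain Java so that it runs standalone: `SystemPromptSelector` and `build` are hypothetical names, plain string substitution stands in for `SystemPromptTemplate`, and `RAG_TEMPLATE` is an abbreviated version of the system prompt from the issue.

```java
import java.util.List;

// Hypothetical helper illustrating the two-template approach; plain string
// substitution stands in for Spring AI's SystemPromptTemplate.
final class SystemPromptSelector {

    // Abbreviated version of the original RAG system prompt.
    static final String RAG_TEMPLATE =
            "Use the information from the DOCUMENTS section to provide accurate answers.\n"
            + "DOCUMENTS:\n{documents}";

    // Template used only when the similarity search returns nothing.
    static final String EMPTY_RESULT_TEMPLATE =
            "You are a helpful assistant. You do not have answers to the question being asked. "
            + "Respond back with the following {message}";

    // Picks the template based on whether any documents were retrieved.
    static String build(List<String> documents, String fallbackMessage) {
        if (documents.isEmpty()) {
            return EMPTY_RESULT_TEMPLATE.replace("{message}", fallbackMessage);
        }
        return RAG_TEMPLATE.replace("{documents}", String.join("\n", documents));
    }
}
```

In the original method, the equivalent change would be to build the message for the empty case from a second template resource rather than passing "No information found" into the `documents` placeholder.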