Bug description

When I create a RAG application with SimilaritySearch, the search returns similar documents when using Azure OpenAI, but always returns zero documents with Ollama. The issue occurs specifically when I use the withSimilarityThreshold parameter with Ollama.

Environment

Latest version of the spring-ai BOM

Ollama with Llama 3.2

Azure OpenAI

PGVector

Steps to reproduce

When I use withSimilarityThreshold with Ollama, similarDocuments always contains 0 documents (the same code works with Azure OpenAI):

    var similarity = 0.8;
    var topk = 20;
    var searchRequest = SearchRequest.query(question.question())
            .withTopK(topk)
            .withSimilarityThreshold(similarity);
    List<Document> similarDocuments = vectorStore.similaritySearch(searchRequest);

Expected behavior

The search should return similar results between Azure OpenAI and Ollama. However, Ollama consistently returns zero similar documents.

Observed Behavior

Azure OpenAI: Returns a list of documents that meet the similarity threshold.

Ollama: Returns zero documents, regardless of the similarity threshold set.

Additional Information

Configuration Consistency: The vectorStore configuration is identical for both platforms.

Error Logs: No explicit error messages are thrown; however, the empty response from Ollama does not align with the expected output.

    [spring-ai-bootstraping] [nio-8080-exec-1] c.c.s.s.controller.AskRagController : Found 0 similar documents

Comment From: asaikali

What embedding model are you using with ollama?

Comment From: cjullien

I tested with the following models

spring.ai.ollama.embedding.options.model=nomic-embed-text

spring.ai.ollama.embedding.options.model=mxbai-embed-large

spring.ai.ollama.embedding.options.model=chroma/all-minilm-l6-v2-f32

spring.ai.ollama.embedding.options.model=hellord/e5-mistral-7b-instruct:Q4_0

Comment From: 77fill

Look here:

File: spring-ai/vector-stores/spring-ai-chroma-store/src/main/java/org/springframework/ai/vectorstore/ChromaVectorStore.java

Method: doSimilaritySearch

Version: 1.0.0-M4

I don't understand the condition (1 - distance) >= request.getSimilarityThreshold()

Isn't the similarity threshold between 0 and 1? What about (1-distance)? Isn't it normally negative? Is that perhaps the reason why the document list becomes empty?

Comment From: markpollack

Cosine similarity measures the cosine of the angle between two vectors, indicating their directional alignment. This value ranges from -1 (exactly opposite directions) to 1 (exactly the same direction), with 0 signifying orthogonality.

Conversely, cosine distance quantifies the dissimilarity between vectors and is defined as 1 minus the cosine similarity. Therefore, cosine distance ranges from 0 (identical vectors) to 2 (diametrically opposed vectors).

pgvector returns the cosine distance, not the cosine similarity. To retrieve the cosine similarity from the result, you can subtract the returned cosine distance from 1.
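To make the conversion concrete, here is a minimal, standalone Java sketch of the relationship described above. It is not the Spring AI or pgvector implementation; the class and method names (CosineThresholdSketch, filterByThreshold) and the sample vectors are hypothetical and exist only for illustration. It computes the cosine distance for a few candidates and applies the same kind of check quoted earlier in the thread, (1 - distance) >= threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from Spring AI or pgvector.
final class CosineThresholdSketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|), a value in [-1, 1].
    // Returns NaN if either vector has zero length.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Cosine distance: 1 - cosine similarity, a value in [0, 2].
    static double cosineDistance(double[] a, double[] b) {
        return 1.0 - cosineSimilarity(a, b);
    }

    // Returns the indices of candidates whose similarity, recovered as
    // (1 - distance), is at least the threshold.
    static List<Integer> filterByThreshold(double[] query, double[][] candidates, double threshold) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < candidates.length; i++) {
            double distance = cosineDistance(query, candidates[i]);
            if (1.0 - distance >= threshold) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] query = {1.0, 0.0};
        double[][] candidates = {
            {2.0, 0.0},   // same direction: similarity 1, distance 0
            {1.0, 1.0},   // 45 degrees apart: similarity about 0.707
            {0.0, 3.0},   // orthogonal: similarity 0, distance 1
            {-1.0, 0.0}   // opposite direction: similarity -1, distance 2
        };
        System.out.println(filterByThreshold(query, candidates, 0.8)); // prints [0]
        System.out.println(filterByThreshold(query, candidates, 0.5)); // prints [0, 1]
    }
}
```

With a cosine distance, (1 - distance) is negative only for vectors that point more than 90 degrees apart, so the check itself is consistent with a threshold between 0 and 1. The sketch shows only how the threshold interacts with the conversion; it does not establish why the Ollama embeddings produce an empty result.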

Note that I think we have a mistake in the current implementation when other metric types are used, e.g. Euclidean distance.

https://omiid.me/notebook/32/pgvector-similarity-search-distance-functions
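The concern about other metric types can be illustrated with a small, standalone Java sketch. It assumes, purely for illustration, that the same (1 - distance) conversion is applied to a Euclidean distance; whether and where the current implementation does this is exactly what still needs to be investigated. The names EuclideanThresholdSketch and passesThreshold are hypothetical.

```java
// Hypothetical illustration only; not taken from Spring AI or pgvector.
final class EuclideanThresholdSketch {

    // Euclidean (L2) distance: unbounded above, 0 for identical vectors.
    static double euclideanDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Applies the cosine-style conversion (1 - distance) to an arbitrary
    // distance value and compares the result with the threshold.
    static boolean passesThreshold(double distance, double threshold) {
        return 1.0 - distance >= threshold;
    }

    public static void main(String[] args) {
        double[] a = {0.0, 0.0};
        double[] b = {3.0, 4.0};
        double distance = euclideanDistance(a, b);   // 3-4-5 triangle
        System.out.println(distance);                       // prints 5.0
        System.out.println(1.0 - distance);                 // prints -4.0
        System.out.println(passesThreshold(distance, 0.0)); // prints false
    }
}
```

Because a Euclidean distance has no upper bound of 2, (1 - distance) can fall far below -1, so under this assumption a document could be rejected even with a threshold of 0.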

Nevertheless, we should investigate to clear this up, as I think our integration tests are not covering it.