Hi, I am trying to use the Ollama EmbeddingModel with PgVectorStore, and it's failing with:
```
Caused by: org.springframework.jdbc.UncategorizedSQLException: StatementCallback; uncategorized SQLException for SQL [CREATE INDEX IF NOT EXISTS spring_ai_vector_index ON vector_store USING HNSW (embedding vector_cosine_ops)]; SQL state [XX000]; error code [0]; ERROR: column cannot have more than 2000 dimensions for hnsw index
	at org.springframework.jdbc.core.JdbcTemplate.translateException(JdbcTemplate.java:1549) ~[spring-jdbc-6.1.8.jar:6
```
The same application works with OpenAI.
Here's a sample: https://github.com/joshlong/bootiful-spring-boot-2024/tree/main/service
And there's a PgVector Docker Compose file here: https://github.com/joshlong/bootiful-spring-boot-2024/blob/main/compose.yaml
Help, please!
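For anyone reproducing this locally without the linked file, a minimal PGvector Compose service looks roughly like the sketch below. The image tag and credentials here are illustrative assumptions, not taken from the linked `compose.yaml`:

```yaml
services:
  postgres:
    # pgvector publishes Postgres images with the extension preinstalled
    image: 'pgvector/pgvector:pg16'
    environment:
      - 'POSTGRES_DB=mydatabase'
      - 'POSTGRES_USER=myuser'
      - 'POSTGRES_PASSWORD=secret'   # placeholder credential
    ports:
      - '5432:5432'
```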
Comment From: ThomasVitale
@joshlong Unfortunately, Ollama has a fixed embedding size of 4096, and it's not currently possible to customise the value. There is a feature request for that: https://github.com/ollama/ollama/issues/651. I hope Ollama addresses this issue ASAP!
I have opened a PR to improve the Spring AI PGvector documentation and mention the limitation of the HNSW indexing strategy, which can't support dimensionality above 2000: https://github.com/spring-projects/spring-ai/pull/825/files
There is also an ongoing discussion in the PGvector project to raise the limit to at least 4096: https://github.com/pgvector/pgvector/issues/461.
My current workaround to get applications running fully locally is to use Ollama for the chat model and one of the ONNX Transformers for the embedding model: https://docs.spring.io/spring-ai/reference/api/embeddings/onnx.html
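For reference, wiring up the ONNX Transformers embedding model is mostly a starter dependency plus, optionally, a couple of properties. The snippet below is a sketch; the property names and the default model are my reading of the docs and should be double-checked against the reference page linked above:

```yaml
spring:
  ai:
    embedding:
      transformer:
        onnx:
          # Optional: if omitted, the starter defaults to the
          # all-MiniLM-L6-v2 model (384 dimensions, well under the
          # HNSW limit of 2000 mentioned above).
          model-uri: file:/path/to/model.onnx
        tokenizer:
          uri: file:/path/to/tokenizer.json
```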
Comment From: markpollack
Thanks @ThomasVitale. Closing the issue.
Comment From: ThomasVitale
In case it's useful to people landing on this issue and having the same problem, I'll share another tip to use Ollama for embeddings paired with PGVector.
The multi-purpose models in the Ollama library (like `mistral` and `llama3.1`) seem to have been configured with 4096 dimensionality, which cannot be changed as of now and doesn't work with PGvector.
However, dedicated embedding models like `nomic-embed-text` (full list here: https://ollama.com/search?q=&c=embedding) are configured with different dimensionality. In many cases it's lower than 2000, so those models can be used with PGvector (and Spring AI lets you configure the dimensions via `spring.ai.vectorstore.pgvector.dimensions`).
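Concretely, pairing a dedicated embedding model with an explicit dimension count might look like the following sketch. The value 768 is the dimensionality commonly listed for `nomic-embed-text`; verify it against the model card before hard-coding it:

```yaml
spring:
  ai:
    ollama:
      embedding:
        model: nomic-embed-text
    vectorstore:
      pgvector:
        # Must match the embedding model's output dimensionality,
        # and stay at or below 2000 for the HNSW index.
        dimensions: 768
```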
Comment From: smitchell
@joshlong - I did `ollama pull all-minilm`, then used this in my properties file:
```yaml
ai:
  ollama:
    base-url: http://${OLLAMA_HOST}:11434
    chat:
      model: llama3.1:8b
    embedding:
      enabled: true
      model: all-minilm
```
You can find the code here: https://github.com/ByteworksHomeLab/spring-ai-lab
Comment From: joshlong
Thanks all
Comment From: Hyun-June-Choi
@smitchell Like below?
```yaml
ai:
  ollama:
    base-url: http://${OLLAMA_HOST}:11434
    chat:
      model: llama3.1:8b
    embedding:
      enabled: true
      model: nomic-embed-text
```