Bug description
was aborted: ERROR: expected 1536 dimensions, not 4096 Call getNextException to see other errors in the batch.
at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:165)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2413)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:579)
at org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:912)
at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:936)
at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1733)
at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:127)
at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java)
at org.springframework.jdbc.core.JdbcTemplate.lambda$getPreparedStatementCallback$6(JdbcTemplate.java:1609)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:658)
... 62 common frames omitted
Caused by: org.postgresql.util.PSQLException: ERROR: expected 1536 dimensions, not 4096
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2725)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2412)
... 70 common frames omitted
I notice that the Neo4j vector store lets you set the embedding dimensions, but pgvector doesn't have that application properties/yml config yet.
Environment: Spring AI version 0.8.1-SNAPSHOT, Java 21, PgVectorStore, OllamaEmbeddingClient, OllamaChatClient
Steps to reproduce: add a list of Spring AI Documents, then call vectorStore.add(springAiDocuments);
Comment From: ricken07
Hi @kevintanhongann, by default the dimension value is set to 1536. You can provide a value according to your needs; you must provide it in the constructor. Here is an example: `new PgVectorStore(jdbcTemplate, embeddingClient, 4096);`
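For context, the error in the report is pgvector's server-side validation: the embedding column was declared as vector(1536), so any inserted vector of a different length is rejected. A plain-Java approximation of that check (illustrative only; the class and method names here are hypothetical, not Spring AI or pgvector API):

```java
import java.util.Collections;
import java.util.List;

public class DimensionCheck {
    // Mirrors pgvector's server-side validation: the embedding column is
    // declared as vector(N), so every inserted vector must have exactly N values.
    static void validate(List<Double> embedding, int expectedDims) {
        if (embedding.size() != expectedDims) {
            throw new IllegalArgumentException(
                "expected " + expectedDims + " dimensions, not " + embedding.size());
        }
    }

    public static void main(String[] args) {
        // An Ollama model producing 4096-dimensional embeddings...
        List<Double> ollamaEmbedding = Collections.nCopies(4096, 0.0);
        try {
            // ...inserted into a column created as vector(1536).
            validate(ollamaEmbedding, 1536);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This is why passing the dimensions to the store only helps if the table is (re-)created with the matching column type.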
Comment From: kevintanhongann
@ricken07 thanks for the info.
Comment From: kevintanhongann
@ricken07 it doesn't seem like the constructor registers the dimensions. It still outputs the same error; I tried both 1536 and 4096.
Comment From: ricken07
@kevintanhongann you need to rebuild your database for the new dimensions value to take effect, or manually modify the embedding column's vector dimensions in the database.
Note: the dimensions take effect the first time the vector_store table is created (by default).
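One way to do the manual modification, assuming the default vector_store table name (a sketch; any existing index on the column may need to be rebuilt, and existing rows must already be compatible or the table truncated first):

```sql
-- Change the embedding column to the new dimension count.
ALTER TABLE vector_store ALTER COLUMN embedding TYPE vector(4096);
```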
Comment From: tzolov
@kevintanhongann
If you are using the PGVectorStore Boot Starter, then you can set the spring.ai.vectorstore.pgvector.dimensions=4096 property.
Also, the PgVectorStore implementation will retrieve the dimensions from the EmbeddingClient if they are not explicitly provided.
But as mentioned above, once you have created your vector_store table, the embedding column's dimension size is fixed, and you will likely have to re-create the table to change it.
It is advised to do this manually, but you can also try the spring.ai.vectorstore.pgvector.remove-existing-vector-store-table=true property. Just don't forget to remove the property afterwards!
I see that the PGVectorStore documentation is incomplete and I'm working to update it very soon. Let me know if this helps.
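Putting the two properties mentioned above together, the starter configuration might look like this (a sketch; the second property drops the existing table, so remove it after one run):

```properties
# application.properties
# Match the embedding model's output size (e.g. 4096 for some Ollama models)
spring.ai.vectorstore.pgvector.dimensions=4096
# Drop and re-create the vector_store table on startup -- destructive, remove after use!
spring.ai.vectorstore.pgvector.remove-existing-vector-store-table=true
```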
Comment From: tzolov
@kevintanhongann, fyi I've updated the pgvector store docs: https://docs.spring.io/spring-ai/reference/0.8-SNAPSHOT/api/vectordbs/pgvector.html
Comment From: kevintanhongann
@tzolov apparently HNSW indexes cannot have more than 2000 dimensions, it seems. I tried adjusting the dimensions to 4096 and this is what I got:
CREATE INDEX ON vector_store USING HNSW (embedding vector_cosine_ops);
Caused by: liquibase.exception.DatabaseException: ERROR: column cannot have more than 2000 dimensions for hnsw index [Failed SQL: (0) CREATE INDEX ON vector_store USING HNSW (embedding vector_cosine_ops)]
at liquibase.executor.jvm.JdbcExecutor$ExecuteStatementCallback.doInStatement(JdbcExecutor.java:468)
at liquibase.executor.jvm.JdbcExecutor.execute(JdbcExecutor.java:77)
at liquibase.executor.jvm.JdbcExecutor.execute(JdbcExecutor.java:177)
at liquibase.database.AbstractJdbcDatabase.execute(AbstractJdbcDatabase.java:1291)
at liquibase.database.AbstractJdbcDatabase.executeStatements(AbstractJdbcDatabase.java:1273)
at liquibase.changelog.ChangeSet.execute(ChangeSet.java:744)
... 47 common frames omitted
Caused by: org.postgresql.util.PSQLException: ERROR: column cannot have more than 2000 dimensions for hnsw index
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2725)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2412)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:371)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:502)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:419)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:341)
at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:326)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:302)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:297)
at com.zaxxer.hikari.pool.ProxyStatement.execute(ProxyStatement.java:94)
at com.zaxxer.hikari.pool.HikariProxyStatement.execute(HikariProxyStatement.java)
at liquibase.executor.jvm.JdbcExecutor$ExecuteStatementCallback.doInStatement(JdbcExecutor.java:462)
... 52 common frames omitted
Can I use another kind of vector index?
Comment From: kevintanhongann
@tzolov
Also, somebody made a mistake in the PGVectorStore class with uuid_generate_v4(): someone must have added a space or tab by accident.
Comment From: kevintanhongann
Created a feature request https://github.com/spring-projects/spring-ai/issues/385
Comment From: tzolov
@kevintanhongann as you can see, this is a PGVector issue, not a Spring AI one.
There is an open PGVector issue on the subject: Increase max vectors dimension limit for index #461
So the feature request should go there, I guess.
Regarding
Also, somebody made a mistake in PGVectorStore class for uuid_generate_v4(). Someone must have spaced or tabbed by accident.
I see that `uuid_generate_v4 ()` has an extra space, but this doesn't seem to matter, as our integration tests work as expected:
postgres=# \dt
List of relations
Schema | Name | Type | Owner
--------+--------------+-------+----------
public | vector_store | table | postgres
(1 row)
postgres=# \d vector_store
Table "public.vector_store"
Column | Type | Collation | Nullable | Default
-----------+--------------+-----------+----------+--------------------
id | uuid | | not null | uuid_generate_v4()
content | text | | |
metadata | json | | |
embedding | vector(1536) | | |
Indexes:
"vector_store_pkey" PRIMARY KEY, btree (id)
"vector_store_embedding_idx" hnsw (embedding vector_cosine_ops)
I will remove the extra space, though this apparently is not an issue.
Comment From: tzolov
@kevintanhongann, if you don't want to change the vector store, one possible alternative is to disable the OllamaEmbeddingClient (while still using OllamaChatClient) and opt for one of the other EmbeddingClients whose dimensions are compatible with PGVector's limitations.
For example you can:
- disable the Ollama embedding client with spring.ai.ollama.embedding.enabled=false
- add the local Transformers Embedding boot starter:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-transformers-spring-boot-starter</artifactId>
</dependency>
Mind that if you add one of the boot starters that configure both a chat client and an embedding client (like OpenAI, Azure OpenAI, ...), you have to disable the chat client counterpart using the corresponding properties.
For example, for OpenAI you have to set spring.ai.openai.chat.enabled=false, which will leave only the OpenAIEmbeddingClient enabled.
Check the reference docs for the enable/disable properties of the other chat clients.
Hope this is a viable alternative?
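For the Ollama case described in this thread, the property setup might look like this (a sketch, assuming the Transformers starter is on the classpath alongside Ollama):

```properties
# Keep using OllamaChatClient for chat, but stop Ollama from providing embeddings,
# so the Transformers embedding client (with PGVector-compatible dimensions) is used instead.
spring.ai.ollama.embedding.enabled=false
```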
Comment From: kevintanhongann
@tzolov I will try that out ASAP. Thanks for the pointers.
Comment From: ThomasVitale
The documentation for Spring AI PGVector has been updated to include the limitation of the HNSW indexes: https://docs.spring.io/spring-ai/reference/api/vectordbs/pgvector.html#_prerequisites
@kevintanhongann Can this issue be closed?
Comment From: kevintanhongann
Yeah sure.