Bug description

  was aborted: ERROR: expected 1536 dimensions, not 4096  Call getNextException to see other errors in the batch.
        at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:165)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2413)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:579)
        at org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:912)
        at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:936)
        at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1733)
        at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:127)
        at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java)
        at org.springframework.jdbc.core.JdbcTemplate.lambda$getPreparedStatementCallback$6(JdbcTemplate.java:1609)
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:658)
        ... 62 common frames omitted
Caused by: org.postgresql.util.PSQLException: ERROR: expected 1536 dimensions, not 4096
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2725)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2412)
        ... 70 common frames omitted

I noticed that the Neo4j vector store lets you set the embedding dimensions, but pgvector doesn't have an equivalent application properties/yml setting yet.

Environment

Spring AI version 0.8.1-SNAPSHOT, Java version 21, PgVectorStore, OllamaEmbeddingClient, OllamaChatClient

Steps to reproduce

Add a list of Spring AI Documents, then call vectorStore.add(springAiDocuments);

Comment From: ricken07

Hi @kevintanhongann, by default the dimension value is set to 1536. You can provide a value that fits your needs, but you must pass it in the constructor. Here is an example: new PgVectorStore(jdbcTemplate, embeddingClient, 4096);

Comment From: kevintanhongann

@ricken07 thanks for the info.

Comment From: kevintanhongann

@ricken07 it doesn't seem like the constructor registers the dimensions. It still outputs the same error. I tried both 1536 and 4096.

Comment From: ricken07

@kevintanhongann you need to rebuild your database for the new dimensions value to take effect, or manually change the vector dimensions of the embedding column in the database.

Note: By default, the dimensions are applied only when the vector_store table is first created.
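Because the column size is fixed at table creation, a mismatch only surfaces when the batch insert reaches PostgreSQL. A minimal sketch of a pre-insert check that fails earlier with the same message, assuming the embeddings are available as float arrays; `DimensionGuard` and `requireDimensions` are hypothetical names and not part of Spring AI:

```java
import java.util.List;

/** Hypothetical helper for illustration only; not part of Spring AI. */
final class DimensionGuard {

    private DimensionGuard() {
    }

    /**
     * Throws IllegalArgumentException if any embedding's length differs from
     * the dimension count the embedding column was created with.
     */
    static void requireDimensions(List<float[]> embeddings, int expected) {
        for (int i = 0; i < embeddings.size(); i++) {
            int actual = embeddings.get(i).length;
            if (actual != expected) {
                throw new IllegalArgumentException(
                        "embedding " + i + ": expected " + expected
                                + " dimensions, not " + actual);
            }
        }
    }
}
```

Calling `DimensionGuard.requireDimensions(embeddings, 1536)` before `vectorStore.add(...)` would reject a 4096-dimension embedding without touching the database.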

Comment From: tzolov

@kevintanhongann If you are using the PGVectorStore Boot Starter, then you can set the spring.ai.vectorstore.pgvector.dimensions=4096 property.

Also, the PgVectorStore implementation will retrieve the dimensions from the EmbeddingClient if they are not explicitly provided.

But as mentioned above, once you have created your vector_store table, the embedding column has a fixed dimension size and you will likely have to re-create the table to change it. It is advised to do this manually, but you can also try the spring.ai.vectorstore.pgvector.remove-existing-vector-store-table=true property. Just don't forget to remove the property afterwards!
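For the manual route, the statements involved can be sketched as plain strings built from the desired dimension count. The column layout below follows the `\d vector_store` output shown later in this thread; `VectorStoreDdl` and its methods are hypothetical names, and the sketch assumes the `vector` and `uuid-ossp` extensions are already installed:

```java
/** Hypothetical helper for illustration only; not part of Spring AI. */
final class VectorStoreDdl {

    private VectorStoreDdl() {
    }

    /** Drops the existing table. All stored documents and embeddings are lost. */
    static String dropTable() {
        return "DROP TABLE IF EXISTS vector_store";
    }

    /** Builds a CREATE TABLE statement whose embedding column has the given size. */
    static String createTable(int dimensions) {
        if (dimensions <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        return "CREATE TABLE IF NOT EXISTS vector_store ("
                + "id uuid DEFAULT uuid_generate_v4() PRIMARY KEY, "
                + "content text, "
                + "metadata json, "
                + "embedding vector(" + dimensions + "))";
    }
}
```

The two strings could be run in order through `jdbcTemplate.execute(...)` or pasted into psql; either way the table contents have to be re-embedded afterwards.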

I see that the PgVectorStore documentation is incomplete, and I'm working on updating it very soon. Let me know if this helps.

Comment From: tzolov

@kevintanhongann, fyi I've updated the pgvector store docs: https://docs.spring.io/spring-ai/reference/0.8-SNAPSHOT/api/vectordbs/pgvector.html

Comment From: kevintanhongann

@tzolov it seems HNSW doesn't allow a number higher than 1536. I tried adjusting it to 4096 and this is what I got:

CREATE INDEX ON vector_store USING HNSW (embedding vector_cosine_ops);

Caused by: liquibase.exception.DatabaseException: ERROR: column cannot have more than 2000 dimensions for hnsw index [Failed SQL: (0) CREATE INDEX ON vector_store USING HNSW (embedding vector_cosine_ops)]
    at liquibase.executor.jvm.JdbcExecutor$ExecuteStatementCallback.doInStatement(JdbcExecutor.java:468)
    at liquibase.executor.jvm.JdbcExecutor.execute(JdbcExecutor.java:77)
    at liquibase.executor.jvm.JdbcExecutor.execute(JdbcExecutor.java:177)
    at liquibase.database.AbstractJdbcDatabase.execute(AbstractJdbcDatabase.java:1291)
    at liquibase.database.AbstractJdbcDatabase.executeStatements(AbstractJdbcDatabase.java:1273)
    at liquibase.changelog.ChangeSet.execute(ChangeSet.java:744)
    ... 47 common frames omitted
Caused by: org.postgresql.util.PSQLException: ERROR: column cannot have more than 2000 dimensions for hnsw index
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2725)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2412)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:371)
    at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:502)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:419)
    at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:341)
    at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:326)
    at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:302)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:297)
    at com.zaxxer.hikari.pool.ProxyStatement.execute(ProxyStatement.java:94)
    at com.zaxxer.hikari.pool.HikariProxyStatement.execute(HikariProxyStatement.java)
    at liquibase.executor.jvm.JdbcExecutor$ExecuteStatementCallback.doInStatement(JdbcExecutor.java:462)
    ... 52 common frames omitted

Can I use another kind of vector index?

Spring-ai pgvector vector store doesn't work properly when saving a list of Spring AI documents objects

Comment From: kevintanhongann

@tzolov

Also, somebody made a mistake in PGVectorStore class for uuid_generate_v4(). Someone must have spaced or tabbed by accident.

Comment From: kevintanhongann

Created a feature request https://github.com/spring-projects/spring-ai/issues/385

Comment From: tzolov

@kevintanhongann as you can see, this is a PGVector issue, not a Spring AI one. There is an open PGVector issue on the subject: Increase max vectors dimension limit for index #461
So I guess the feature request should go there.
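Until that limit changes, an application can check up front whether an embedding model's output size fits an HNSW index. A minimal sketch, taking the 2000-dimension limit directly from the error message quoted above; `HnswLimit` and `supportsHnsw` are hypothetical names and not part of Spring AI or PGVector:

```java
/** Hypothetical helper for illustration only; not part of Spring AI or PGVector. */
final class HnswLimit {

    /** Limit reported by PGVector: "column cannot have more than 2000 dimensions for hnsw index". */
    static final int MAX_HNSW_DIMENSIONS = 2000;

    private HnswLimit() {
    }

    /** Returns true if a column with this many dimensions can carry an HNSW index. */
    static boolean supportsHnsw(int dimensions) {
        return dimensions > 0 && dimensions <= MAX_HNSW_DIMENSIONS;
    }
}
```

With this check, `HnswLimit.supportsHnsw(1536)` is true and `HnswLimit.supportsHnsw(4096)` is false, which matches the two configurations tried in this thread.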

Regarding

Also, somebody made a mistake in PGVectorStore class for uuid_generate_v4(). Someone must have spaced or tabbed by accident.

I see that `uuid_generate_v4 ()` has an extra space, but this doesn't seem to matter, as our integration tests work as expected:

postgres=# \dt
            List of relations
 Schema |     Name     | Type  |  Owner
--------+--------------+-------+----------
 public | vector_store | table | postgres
(1 row)

postgres=# \d vector_store
                     Table "public.vector_store"
  Column   |     Type     | Collation | Nullable |      Default
-----------+--------------+-----------+----------+--------------------
 id        | uuid         |           | not null | uuid_generate_v4()
 content   | text         |           |          |
 metadata  | json         |           |          |
 embedding | vector(1536) |           |          |
Indexes:
    "vector_store_pkey" PRIMARY KEY, btree (id)
    "vector_store_embedding_idx" hnsw (embedding vector_cosine_ops)

I will remove the extra space, though this apparently is not an issue.

Comment From: tzolov

@kevintanhongann, if you don't want to change the Vector Store, one possible alternative is to disable the OllamaEmbeddingClient (while still using OllamaChatClient) and opt for one of the other EmbeddingClients that have dimensions compatible with PGVector's limitations.

For example you can:

- disable the Ollama Embedding Client with spring.ai.ollama.embedding.enabled=false
- add the local Transformers Embedding boot starter.

<dependency>
   <groupId>org.springframework.ai</groupId>
   <artifactId>spring-ai-transformers-spring-boot-starter</artifactId>
</dependency>

Mind that if you add one of the boot starters that configure both chat and embedding clients (like OpenAI, Azure OpenAI, ...), you have to disable the chat client counterpart using the corresponding property. For example, for OpenAI you have to set spring.ai.openai.chat.enabled=false, which will leave only the OpenAIEmbeddingClient active.
Check the reference docs for the enable/disable property of other Chat Clients.

I hope this is a viable alternative.

Comment From: kevintanhongann

@tzolov I will try that out ASAP. Thanks for the pointers.

Comment From: ThomasVitale

The documentation for Spring AI PGVector has been updated to include the limitation of the HNSW indexes: https://docs.spring.io/spring-ai/reference/api/vectordbs/pgvector.html#_prerequisites

@kevintanhongann Can this issue be closed?

Comment From: kevintanhongann

Yeah sure.