Changed the filter template in PgVercorStore to replace JSONPath expressions with JSONB field access, using standard SQL operators such as '=' instead of '==', 'AND' instead of '&&', and 'OR' instead of '||'. This adjustment addresses compatibility issues with the 'IN' operator, which previously returned parse errors. Also updated PgVectorFilterExpressionConverter and corresponding tests to align with these changes.

  • Replaced metadata::jsonb @@ '$.key == "value"'::jsonpath with metadata::jsonb->>'key' = 'value'
  • Fixed parsing issues with metadata::jsonb @@ '$.key in ["value"]'::jsonpath by using JSONB field access.
  • Updated logical operators IN filter expressions to standard SQL syntax.

These changes improve the clarity, execution reliability, and compatibility of filter expressions in the database.

Comment From: tzolov

Thank you @dperezcabrera, looks like a nice improvement!

Could you please pull rebase to the upstream main (e.g. git pull -r upstream main), resolve any eventual conflicts and (force) merge to you PR?

Comment From: dperezcabrera

Thank you. I've rebased and merged the changes as requested!

Comment From: tzolov

@dperezcabrera have you run the integration tests for the pgvector and for the autoconfiguraiton? the ./mvnw clean install -pl vector-stores/spring-ai-pgvector-store -Pintegration-tests is failing form me

Comment From: dperezcabrera

I have executed them; however, since I don't have an API key for OpenAI, this has been the result: run: 4, Failures: 0, Errors: 0, Skipped: 4.

Can you please send me the error logs by email?

Comment From: tzolov

@dperezcabrera for example

  • searchWithFilters: COSINE_DISTANCE
org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad SQL grammar [SELECT *, embedding [=] ? AS distance FROM vector_store WHERE embedding [=] ? [ ?  AND (metadata::jsonb-]]'country' = 'BG' AND metadata::jsonb-]]'year' = 2020)  ORDER BY distance LIMIT ? ]
 ```
* searchWithFilters: EUCLIDEAN_DISTANCE
 ```
 org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad SQL grammar [SELECT *, embedding [-] ? AS distance FROM vector_store WHERE embedding [-] ? [ ?  AND (metadata::jsonb-]]'country' = 'BG' AND metadata::jsonb-]]'year' = 2020)  ORDER BY distance LIMIT ? ]
 ```
 * searchWithFilters: NEGATIVE_INNER_PRODUCT

org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad SQL grammar [SELECT *, (1 + (embedding [#] ?)) AS distance FROM vector_store WHERE (1 + (embedding [#] ?)) [ ? AND (metadata::jsonb-]]'country' = 'BG' AND metadata::jsonb-]]'year' = 2020) ORDER BY distance LIMIT ? ]

```

Comment From: dperezcabrera

I see another problem, metadata::jsonb->>'year' must be (metadata->>'year')::INTEGER In protected void doKey(Key key, StringBuilder context) { I can't do this change.

Comment From: markpollack

Hi, this is a great contribution to improve the speed of pgvector, you mention In protected void doKey(Key key, StringBuilder context) { I can't do this change. what do you mean?

I can't easily share access keys, I would recommend getting an openai subscription, the cost is minimal.

Comment From: dperezcabrera

Here: (https://github.com/dperezcabrera/spring-ai/blob/bebbfceab6e22a6bdeda3c3e168aceaf52994abd/spring-ai-core/src/main/java/org/springframework/ai/vectorstore/filter/converter/PgVectorFilterExpressionConverter.java#L64C1-L76C1)

I need to know the type of the key. If fieldname is an integer, the method must use this template (metadata->>'fieldname')::INTEGER. If fieldname is a string, then: metadata::jsonb->>'fieldname'. If fieldname is a date, (metadata->>'fieldname')::DATE. This ensures the key is correctly interpreted in the database query. This will require a significant change in the common classes and may have side effects on other implementations, which need to be carefully considered before proposing it.

Now I have and access key, before I only had API keys for Microsoft's Azure, and they didn't work with the integrated tests.

Comment From: markpollack

bumping to RC1, apologies

Comment From: dperezcabrera

I think it is best to close this pull request, and address this functionality in the future. If you agree, close it.

Comment From: tzolov

Thank you for the effort @dperezcabrera. We can re-open it in the future if you have time to work on it.

Causing due to some unsolved issues (check the log above).