closes gh-33

Notes:
* The image Testcontainers uses is 8.8 GB and takes a long time to load.
* The Hugging Face transformer model is loaded for each test, and loading a model for the first time is also slow.
* I chose distilbert-base-uncased as the default model because it is used in the PostgresML docs, but I am not sure it is the right one.

Comment From: tzolov

Thanks @making. Later today I will review the PR more thoroughly. In the meantime, can you also make sure that PostgresMlEmbeddingClient implements EmbeddingClient#dimensions()? The default implementation is inefficient and costly. See the OpenAiEmbeddingClient#dimensions() implementation for reference.

Comment From: making

I postponed implementing dimensions() because it was difficult to decide which models were worth listing. Also, since a Hugging Face model is being used, I thought it would be better to put the list somewhere more general than postgresml.

Comment From: tzolov

@making
* If the Hugging Face models have documented dimensions per model, then (in the same PR) you can add new lines to embedding-model-dimensions.properties. The format is one model-name=dimensions-size entry per line.
* Even if the model dimensions are unknown, you still need to implement the caching as done in OpenAiEmbeddingClient#dimensions(). As you can see there, an AtomicInteger caches the dimension obtained from the EmbeddingUtil.dimensions() utility. That utility first consults embedding-model-dimensions.properties for known sizes and falls back to calling the embedding with a dummy text and counting the size of the returned vector.
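The lookup-then-probe-then-cache flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Spring AI code: `CachingClient`, its constructor parameters, and the fake embedding function are hypothetical stand-ins, and the probe text is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class DimensionCacheSketch {

    /** Hypothetical stand-in for an embedding client; not the Spring AI interface. */
    static class CachingClient {

        private final String modelName;

        // Known sizes, in the "model-name=dimensions-size" format.
        private final Properties knownDimensions;

        // Assumed embedding call: text in, vector out.
        private final Function<String, List<Double>> embedFn;

        // -1 means "not resolved yet".
        private final AtomicInteger cachedDimensions = new AtomicInteger(-1);

        CachingClient(String modelName, Properties knownDimensions,
                Function<String, List<Double>> embedFn) {
            this.modelName = modelName;
            this.knownDimensions = knownDimensions;
            this.embedFn = embedFn;
        }

        int dimensions() {
            if (this.cachedDimensions.get() < 0) {
                this.cachedDimensions.set(resolve());
            }
            return this.cachedDimensions.get();
        }

        private int resolve() {
            // 1. Prefer a documented size from the properties.
            String known = this.knownDimensions.getProperty(this.modelName);
            if (known != null) {
                return Integer.parseInt(known.trim());
            }
            // 2. Otherwise embed a dummy text once and count the vector length.
            return this.embedFn.apply("Test String").size();
        }
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Illustrative entry only; real entries would live in the properties file.
        props.load(new StringReader("distilbert-base-uncased=768\n"));

        AtomicInteger embedCalls = new AtomicInteger();
        Function<String, List<Double>> fakeEmbed = text -> {
            embedCalls.incrementAndGet();
            return List.of(0.1, 0.2, 0.3); // pretend 3-dimensional model
        };

        CachingClient known = new CachingClient("distilbert-base-uncased", props, fakeEmbed);
        CachingClient unknown = new CachingClient("some-unlisted-model", props, fakeEmbed);

        System.out.println(known.dimensions());   // resolved from the properties
        System.out.println(unknown.dimensions()); // resolved by probing
        System.out.println(unknown.dimensions()); // served from the cache
        System.out.println("embed calls: " + embedCalls.get());
    }
}
```

With this shape, the expensive probe call happens at most once per client instance (under single-threaded use), and not at all for models listed in the properties.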

Comment From: making

@tzolov Updated

Comment From: tzolov

@making, when I try to run the tests from the command line I get an error that looks as if some fixture resources are being created twice:

./mvnw clean install -Pintegration-tests -pl embedding-clients/spring-ai-postgresml-embedding-client
[ERROR] org.springframework.ai.embedding.PostgresMlEmbeddingClientIT.embedWithDifferentModel -- Time elapsed: 1.240 s <<< ERROR!
org.springframework.dao.DuplicateKeyException: 
StatementCallback; SQL [CREATE EXTENSION IF NOT EXISTS pgml]; ERROR: duplicate key value violates unique constraint "pg_namespace_nspname_index"
  Detail: Key (nspname)=(pgml) already exists.
        at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:103)
        at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:70)
        at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:79)
        at org.springframework.jdbc.core.JdbcTemplate.translateException(JdbcTemplate.java:1580)
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:398)
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:436)
        at org.springframework.ai.embedding.PostgresMlEmbeddingClient.afterPropertiesSet(PostgresMlEmbeddingClient.java:175)
        at org.springframework.ai.embedding.PostgresMlEmbeddingClientIT.embedWithDifferentModel(PostgresMlEmbeddingClientIT.java:76)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
        at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
        at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
        at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
        at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
        at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
        at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
        at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
        at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
        at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
        at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
        at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
        at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
        at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
        at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:217)
        at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
        at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:213)
        at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:138)
        at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
        at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
        at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
        at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
        at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
        at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
        at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
        at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
        at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
        at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
        at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
        at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35)
        at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
        at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
        at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
        at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
        at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90)
        at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:55)
        at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:102)
        at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:54)
        at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
        at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
        at org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
        at org.apache.maven.surefire.junitplatform.LazyLauncher.execute(LazyLauncher.java:56)
        at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.execute(JUnitPlatformProvider.java:184)
        at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:148)
        at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:122)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385)
        at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
        at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495)
Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "pg_namespace_nspname_index"
  Detail: Key (nspname)=(pgml) already exists.
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2713)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2401)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:368)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:498)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:415)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:335)
        at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:321)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:297)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:292)
        at com.zaxxer.hikari.pool.ProxyStatement.execute(ProxyStatement.java:94)
        at com.zaxxer.hikari.pool.HikariProxyStatement.execute(HikariProxyStatement.java)
        at org.springframework.jdbc.core.JdbcTemplate$1ExecuteStatementCallback.doInStatement(JdbcTemplate.java:427)
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:383)
        ... 74 more

Comment From: making

Hmm, that's weird. It's working in my environment.

$ ./mvnw -V clean install -Pintegration-tests -pl embedding-clients/spring-ai-postgresml-embedding-client
Apache Maven 3.8.6 (84538c9988a25aec085021c365c560670ad80f63)
Maven home: /Users/tmaki/.m2/wrapper/dists/apache-maven-3.8.6-bin/67568434/apache-maven-3.8.6
Java version: 17.0.7, vendor: Oracle Corporation, runtime: /opt/graalvm/graalvm-jdk-17.0.7+8.1/Contents/Home
Default locale: ja_JP, platform encoding: UTF-8
OS name: "mac os x", version: "13.5.2", arch: "aarch64", family: "mac"
[INFO] Scanning for projects...
[WARNING] 
[WARNING] Some problems were encountered while building the effective model for org.springframework.experimental.ai:spring-ai-postgresml-embedding-client:jar:0.2.0-SNAPSHOT
[WARNING] The expression ${parent.version} is deprecated. Please use ${project.parent.version} instead.
[WARNING] 
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING] 
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING] 
[INFO] 
[INFO] --< org.springframework.experimental.ai:spring-ai-postgresml-embedding-client >--
[INFO] Building Spring AI Embedding Client - PostgresML 0.2.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ spring-ai-postgresml-embedding-client ---
[INFO] Deleting /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/target
[INFO] 
[INFO] --- flatten-maven-plugin:1.5.0:clean (clean) @ spring-ai-postgresml-embedding-client ---
[INFO] Deleting /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/.flattened-pom.xml
[INFO] 
[INFO] --- spring-javaformat-maven-plugin:0.0.39:validate (default) @ spring-ai-postgresml-embedding-client ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ spring-ai-postgresml-embedding-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/src/main/resources
[INFO] 
[INFO] --- flatten-maven-plugin:1.5.0:flatten (flatten) @ spring-ai-postgresml-embedding-client ---
[INFO] Generating flattened POM of project org.springframework.experimental.ai:spring-ai-postgresml-embedding-client:jar:0.2.0-SNAPSHOT...
[WARNING] The expression ${parent.version} is deprecated. Please use ${project.parent.version} instead.
[INFO] 
[INFO] --- maven-compiler-plugin:3.11.0:compile (default-compile) @ spring-ai-postgresml-embedding-client ---
[INFO] Changes detected - recompiling the module! :source
[INFO] Compiling 2 source files with javac [debug release 17] to target/classes
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ spring-ai-postgresml-embedding-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/src/test/resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.11.0:testCompile (default-testCompile) @ spring-ai-postgresml-embedding-client ---
[INFO] Changes detected - recompiling the module! :dependency
[INFO] Compiling 1 source file with javac [debug release 17] to target/test-classes
[INFO] 
[INFO] --- maven-surefire-plugin:3.1.2:test (default-test) @ spring-ai-postgresml-embedding-client ---
[INFO] 
[INFO] --- maven-jar-plugin:3.3.0:jar (default-jar) @ spring-ai-postgresml-embedding-client ---
[INFO] Building jar: /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/target/spring-ai-postgresml-embedding-client-0.2.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-javadoc-plugin:3.5.0:jar (generate-javadocs) @ spring-ai-postgresml-embedding-client ---
[INFO] No previous run data found, generating javadoc.
[INFO] Building jar: /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/target/spring-ai-postgresml-embedding-client-0.2.0-SNAPSHOT-javadoc.jar
[INFO] 
[INFO] >>> maven-javadoc-plugin:3.5.0:aggregate (generate-aggregate-javadocs) > compile @ spring-ai-postgresml-embedding-client >>>
[INFO] 
[INFO] --- spring-javaformat-maven-plugin:0.0.39:validate (default) @ spring-ai-postgresml-embedding-client ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ spring-ai-postgresml-embedding-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/src/main/resources
[INFO] 
[INFO] --- flatten-maven-plugin:1.5.0:flatten (flatten) @ spring-ai-postgresml-embedding-client ---
[INFO] Generating flattened POM of project org.springframework.experimental.ai:spring-ai-postgresml-embedding-client:jar:0.2.0-SNAPSHOT...
[INFO] 
[INFO] --- maven-compiler-plugin:3.11.0:compile (default-compile) @ spring-ai-postgresml-embedding-client ---
[INFO] Nothing to compile - all classes are up to date
[INFO] 
[INFO] <<< maven-javadoc-plugin:3.5.0:aggregate (generate-aggregate-javadocs) < compile @ spring-ai-postgresml-embedding-client <<<
[INFO] 
[INFO] 
[INFO] --- maven-javadoc-plugin:3.5.0:aggregate (generate-aggregate-javadocs) @ spring-ai-postgresml-embedding-client ---
[INFO] Configuration changed, re-generating javadoc.
[INFO] 
[INFO] >>> maven-source-plugin:3.3.0:jar (generate-sources) > generate-sources @ spring-ai-postgresml-embedding-client >>>
[INFO] 
[INFO] --- spring-javaformat-maven-plugin:0.0.39:validate (default) @ spring-ai-postgresml-embedding-client ---
[INFO] 
[INFO] <<< maven-source-plugin:3.3.0:jar (generate-sources) < generate-sources @ spring-ai-postgresml-embedding-client <<<
[INFO] 
[INFO] 
[INFO] --- maven-source-plugin:3.3.0:jar (generate-sources) @ spring-ai-postgresml-embedding-client ---
[INFO] Building jar: /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/target/spring-ai-postgresml-embedding-client-0.2.0-SNAPSHOT-sources.jar
[INFO] 
[INFO] --- maven-failsafe-plugin:3.1.2:integration-test (default) @ spring-ai-postgresml-embedding-client ---
[INFO] Using auto detected provider org.apache.maven.surefire.junitplatform.JUnitPlatformProvider
[INFO] 
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.springframework.ai.embedding.PostgresMlEmbeddingClientIT
21:22:02.849 [main] INFO org.testcontainers.utility.ImageNameSubstitutor -- Image name substitution will be performed by: DefaultImageNameSubstitutor (composite of 'ConfigurationFileImageNameSubstitutor' and 'PrefixingImageNameSubstitutor')
21:22:02.947 [main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy -- Loaded org.testcontainers.dockerclient.UnixSocketClientProviderStrategy from ~/.testcontainers.properties, will try it first
21:22:03.091 [main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy -- Found Docker environment with local Unix socket (unix:///var/run/docker.sock)
21:22:03.092 [main] INFO org.testcontainers.DockerClientFactory -- Docker host IP address is localhost
21:22:03.104 [main] INFO org.testcontainers.DockerClientFactory -- Connected to docker: 
  Server Version: 24.0.6
  API Version: 1.43
  Operating System: OrbStack
  Total Memory: 19970 MB
21:22:03.184 [main] INFO tc.testcontainers/ryuk:0.5.1 -- Creating container for image: testcontainers/ryuk:0.5.1
21:22:03.238 [main] INFO org.testcontainers.utility.RegistryAuthLocator -- Credential helper/store (docker-credential-osxkeychain) does not have credentials for https://index.docker.io/v1/
21:22:03.300 [main] INFO tc.testcontainers/ryuk:0.5.1 -- Container testcontainers/ryuk:0.5.1 is starting: c191737e29a451630e434a0b9e39d1b9da0e9789f14f6a7e01d4e28b6f9c44cb
21:22:03.524 [main] INFO tc.testcontainers/ryuk:0.5.1 -- Container testcontainers/ryuk:0.5.1 started in PT0.416021S
21:22:03.528 [main] INFO org.testcontainers.utility.RyukResourceReaper -- Ryuk started - will monitor and terminate Testcontainers containers on JVM exit
21:22:03.528 [main] INFO org.testcontainers.DockerClientFactory -- Checking the system...
21:22:03.529 [main] INFO org.testcontainers.DockerClientFactory -- ✔︎ Docker server version should be at least 1.6.0
21:22:03.551 [main] INFO tc.ghcr.io/postgresml/postgresml:2.7.3 -- Creating container for image: ghcr.io/postgresml/postgresml:2.7.3
21:22:03.614 [main] INFO tc.ghcr.io/postgresml/postgresml:2.7.3 -- Container ghcr.io/postgresml/postgresml:2.7.3 is starting: 8dca1b4dfab8a2b920d68bec5f58d1db85ae4275e2b19d662cac2270969974bc
21:22:06.683 [main] INFO tc.ghcr.io/postgresml/postgresml:2.7.3 -- Container ghcr.io/postgresml/postgresml:2.7.3 started in PT3.131817S
21:22:06.684 [main] INFO tc.ghcr.io/postgresml/postgresml:2.7.3 -- Container is started (JDBC URL: jdbc:postgresql://localhost:32906/postgresml?loggerLevel=OFF)

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::                (v3.1.2)

2023-10-04T21:22:06.878+09:00  INFO 15939 --- [           main] o.s.a.e.PostgresMlEmbeddingClientIT      : Starting PostgresMlEmbeddingClientIT using Java 17.0.7 with PID 15939 (started by tmaki in /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client)
2023-10-04T21:22:06.879+09:00  INFO 15939 --- [           main] o.s.a.e.PostgresMlEmbeddingClientIT      : No active profile set, falling back to 1 default profile: "default"
2023-10-04T21:22:07.143+09:00  INFO 15939 --- [           main] o.s.a.e.PostgresMlEmbeddingClientIT      : Started PostgresMlEmbeddingClientIT in 0.436 seconds (process running for 4.7)
2023-10-04T21:22:07.149+09:00  INFO 15939 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
2023-10-04T21:22:07.294+09:00  INFO 15939 --- [           main] com.zaxxer.hikari.pool.HikariPool        : HikariPool-1 - Added connection org.postgresql.jdbc.PgConnection@2e952845
2023-10-04T21:22:07.295+09:00  INFO 15939 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Start completed.
Java HotSpot(TM) 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
2023-10-04T21:22:07.698+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing SQL statement [CREATE EXTENSION IF NOT EXISTS pgml]
2023-10-04T21:22:07.701+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : SQLWarning ignored: SQL state '42710', error code '0', message [extension "pgml" already exists, skipping]
2023-10-04T21:22:07.703+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL query
2023-10-04T21:22:07.703+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL statement [SELECT pgml.embed(?, ?, ?::JSONB) AS embedding]
2023-10-04T21:22:07.705+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 1, parameter value [intfloat/e5-small], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:07.706+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 2, parameter value [Hello World!], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:07.706+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 3, parameter value [{}], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:21.819+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing SQL statement [CREATE EXTENSION IF NOT EXISTS pgml]
2023-10-04T21:22:21.822+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : SQLWarning ignored: SQL state '42710', error code '0', message [extension "pgml" already exists, skipping]
2023-10-04T21:22:21.822+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL query
2023-10-04T21:22:21.822+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL statement [SELECT pgml.embed(?, ?, ?::JSONB) AS embedding]
2023-10-04T21:22:21.823+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 1, parameter value [distilbert-base-uncased], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:21.823+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 2, parameter value [Hello World!], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:21.823+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 3, parameter value [{}], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:32.242+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing SQL statement [CREATE EXTENSION IF NOT EXISTS pgml]
2023-10-04T21:22:32.248+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : SQLWarning ignored: SQL state '42710', error code '0', message [extension "pgml" already exists, skipping]
2023-10-04T21:22:32.248+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL query
2023-10-04T21:22:32.248+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL statement [SELECT pgml.embed(?, ?, ?::JSONB) AS embedding]
2023-10-04T21:22:32.248+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 1, parameter value [distilbert-base-uncased], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:32.248+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 2, parameter value [Hello World!], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:32.248+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 3, parameter value [{"device":"cpu"}], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:32.355+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing SQL statement [CREATE EXTENSION IF NOT EXISTS pgml]
2023-10-04T21:22:32.373+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : SQLWarning ignored: SQL state '42710', error code '0', message [extension "pgml" already exists, skipping]
2023-10-04T21:22:32.375+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL query
2023-10-04T21:22:32.375+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL statement [SELECT pgml.embed(?, ?, ?::JSONB) AS embedding]
2023-10-04T21:22:32.375+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 1, parameter value [distilbert-base-uncased], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:32.375+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 2, parameter value [Test String], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:32.375+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 3, parameter value [{}], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:32.526+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing SQL statement [CREATE EXTENSION IF NOT EXISTS pgml]
2023-10-04T21:22:32.536+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : SQLWarning ignored: SQL state '42710', error code '0', message [extension "pgml" already exists, skipping]
2023-10-04T21:22:32.536+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL query
2023-10-04T21:22:32.536+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL statement
2023-10-04T21:22:32.978+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing SQL statement [CREATE EXTENSION IF NOT EXISTS pgml]
2023-10-04T21:22:32.988+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : SQLWarning ignored: SQL state '42710', error code '0', message [extension "pgml" already exists, skipping]
2023-10-04T21:22:32.988+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing SQL statement [CREATE EXTENSION IF NOT EXISTS vector]
2023-10-04T21:22:33.000+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL query
2023-10-04T21:22:33.000+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL statement
2023-10-04T21:22:33.269+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing SQL statement [CREATE EXTENSION IF NOT EXISTS pgml]
2023-10-04T21:22:33.280+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : SQLWarning ignored: SQL state '42710', error code '0', message [extension "pgml" already exists, skipping]
2023-10-04T21:22:33.280+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing SQL statement [CREATE EXTENSION IF NOT EXISTS vector]
2023-10-04T21:22:33.297+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL query
2023-10-04T21:22:33.297+09:00 DEBUG 15939 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL statement [SELECT pgml.embed(?, ?, ?::JSONB)::vector AS embedding]
2023-10-04T21:22:33.298+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 1, parameter value [distilbert-base-uncased], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:33.298+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 2, parameter value [Hello World!], value class [java.lang.String], SQL type unknown
2023-10-04T21:22:33.298+09:00 TRACE 15939 --- [           main] o.s.jdbc.core.StatementCreatorUtils      : Setting SQL statement parameter value: column index 3, parameter value [{}], value class [java.lang.String], SQL type unknown
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 31.75 s -- in org.springframework.ai.embedding.PostgresMlEmbeddingClientIT
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
[INFO] 
[INFO] --- maven-failsafe-plugin:3.1.2:verify (default) @ spring-ai-postgresml-embedding-client ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ spring-ai-postgresml-embedding-client ---
[INFO] Installing /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/target/spring-ai-postgresml-embedding-client-0.2.0-SNAPSHOT.jar to /Users/tmaki/.m2/repository/org/springframework/experimental/ai/spring-ai-postgresml-embedding-client/0.2.0-SNAPSHOT/spring-ai-postgresml-embedding-client-0.2.0-SNAPSHOT.jar
[INFO] Installing /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/.flattened-pom.xml to /Users/tmaki/.m2/repository/org/springframework/experimental/ai/spring-ai-postgresml-embedding-client/0.2.0-SNAPSHOT/spring-ai-postgresml-embedding-client-0.2.0-SNAPSHOT.pom
[INFO] Installing /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/target/spring-ai-postgresml-embedding-client-0.2.0-SNAPSHOT-javadoc.jar to /Users/tmaki/.m2/repository/org/springframework/experimental/ai/spring-ai-postgresml-embedding-client/0.2.0-SNAPSHOT/spring-ai-postgresml-embedding-client-0.2.0-SNAPSHOT-javadoc.jar
[INFO] Installing /Users/tmaki/git/spring-ai/embedding-clients/spring-ai-postgresml-embedding-client/target/spring-ai-postgresml-embedding-client-0.2.0-SNAPSHOT-sources.jar to /Users/tmaki/.m2/repository/org/springframework/experimental/ai/spring-ai-postgresml-embedding-client/0.2.0-SNAPSHOT/spring-ai-postgresml-embedding-client-0.2.0-SNAPSHOT-sources.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  37.427 s
[INFO] Finished at: 2023-10-04T21:22:34+09:00
[INFO] ------------------------------------------------------------------------

Comment From: tzolov

Apparently there is some timing/dirty-state issue. It occasionally fails when running the test from the IDE but consistently fails when run on the command line. The following helped me to get around it:

    @AfterEach
    void dropPgmlExtension() {
        this.jdbcTemplate.execute("DROP EXTENSION IF EXISTS pgml");
    }

I would expect the CREATE EXTENSION IF NOT EXISTS pgml to detect the pgml and skip the operation instead of throwing an error.
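As an alternative to dropping the extension after each test, the duplicate-key failure could be tolerated at creation time. The sketch below is only an illustration under assumptions: `SqlRunner` and `createExtensionTolerant` are hypothetical names, and it relies on PostgreSQL reporting unique-constraint violations with SQLSTATE 23505. With JdbcTemplate, the equivalent would be catching the DuplicateKeyException seen in the stack trace above. It does not explain why IF NOT EXISTS fails here; it only treats "already exists" as success.

```java
import java.sql.SQLException;

public class CreateExtensionGuardSketch {

    /** Hypothetical minimal abstraction over "execute one SQL statement". */
    @FunctionalInterface
    interface SqlRunner {
        void run(String sql) throws SQLException;
    }

    // PostgreSQL SQLSTATE for unique_violation.
    static final String UNIQUE_VIOLATION = "23505";

    /**
     * Runs CREATE EXTENSION IF NOT EXISTS and treats a duplicate-key error as
     * "the extension is already there".
     * @return true if the statement ran cleanly, false if a duplicate-key
     * error was tolerated
     */
    static boolean createExtensionTolerant(SqlRunner runner, String extension) {
        try {
            runner.run("CREATE EXTENSION IF NOT EXISTS " + extension);
            return true;
        }
        catch (SQLException ex) {
            if (UNIQUE_VIOLATION.equals(ex.getSQLState())) {
                return false; // already created, nothing to do
            }
            // Anything else is a real failure.
            throw new IllegalStateException("Could not create extension " + extension, ex);
        }
    }

    public static void main(String[] args) {
        // Fake runners stand in for a real database connection.
        boolean clean = createExtensionTolerant(sql -> { }, "pgml");
        boolean tolerated = createExtensionTolerant(sql -> {
            throw new SQLException("duplicate key value violates unique constraint", "23505");
        }, "pgml");
        System.out.println("clean run: " + clean);
        System.out.println("ran cleanly despite duplicate key: " + tolerated);
    }
}
```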

Anyway, I will disable PostgresMlEmbeddingClientIT for automatic execution, as it demands an excessive amount of memory. One will still be able to run it locally. Perhaps we need to create a new category of tests that run only locally.

Comment From: tzolov

@making, I will go ahead and merge the PR, but could you please submit, in another PR, a short README.md (under the spring-ai-postgresml-embedding-client folder) that briefly explains PostgresML and how to use the PostgresMlEmbeddingClient?

Comment From: tzolov

Rebased, squashed and merged at: caa8cb2021c0bfca267e1d9d056de49fb0cfd703