java.util.concurrent.TimeoutException: Channel response timed out after 60000 milliseconds.
    at com.azure.core.http.netty.implementation.AzureSdkHandler.responseTimedOut(AzureSdkHandler.java:202) ~[azure-core-http-netty-1.15.1.jar:1.15.1]
    at com.azure.core.http.netty.implementation.AzureSdkHandler.lambda$startResponseTracking$2(AzureSdkHandler.java:187) ~[azure-core-http-netty-1.15.1.jar:1.15.1]
    at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
    at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.runTask$$$capture(AbstractEventExecutor.java:173) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:166) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:566) ~[netty-transport-4.1.101.Final.jar:4.1.101.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
    at java.base/java.lang.Thread.run(Thread.java:833) ~[na:na]

2024-08-09T17:36:36.604+08:00 ERROR 4252 --- [nio-8083-exec-4] c.a.c.http.netty.NettyAsyncHttpClient    : java.util.concurrent.TimeoutException: Channel response timed out after 60000 milliseconds.

When I inserted 800 documents, the request timed out immediately and no data ended up in the pg database.

Generally speaking, each model has its own timeout, but the timeout cannot be raised indefinitely. Looking at PgVectorStore.add, I saw that the embedding processing for all documents is done in a single pass, and the caller of this method has no control over the size of that batch.

Once a timeout occurs, none of the data gets inserted. So I think the processing logic here should insert in segments, for example every 10 items, to avoid problems like this.

Comment From: csterwa

@sobychacko do you have an approach for resolving the timeout question here? What is intended to be in the M2 & RC1 releases for this?

Comment From: impactCn

@csterwa If this code is not merged, I suggest that callers work around it by splitting the list into chunks before handing it to the vector store. For example:

```
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-collections4</artifactId>
    <version>4.5.0-M2</version>
</dependency>
```

```
int segmentNum = 100;
// all documents to be inserted
List<Document> documents;
List<List<Document>> segments = org.apache.commons.collections4.ListUtils.partition(documents, segmentNum);
for (List<Document> segment : segments) {
    vectorStore.accept(segment);
}
```

Comment From: sobychacko

@impactCn We recently introduced a batching strategy for embedding documents, which has already been implemented for the PG vector store. I pinged you on that pull request. When a list of documents is presented for embedding, we batch them based on the token threshold from the model, then embed each sub-batch outside of the loop that adds the embeddings. I think this should address the use case you are trying to solve. If not, please let us know, and we can make improvements.
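For reference, a rough sketch of what the batching does on the embedding side; the class and method names (`BatchingStrategy`, `TokenCountBatchingStrategy`, `Document#getContent`) follow my understanding of the Spring AI API, so treat this as an illustration rather than the exact implementation:

```
// Sketch, assuming Spring AI's BatchingStrategy / TokenCountBatchingStrategy API:
// split the documents into token-bounded sub-batches and embed each one separately.
BatchingStrategy strategy = new TokenCountBatchingStrategy();
for (List<Document> subBatch : strategy.batch(documents)) {
    // one embedding request per sub-batch, so a single call never exceeds
    // the model's maximum input token count
    List<float[]> embeddings = embeddingModel.embed(
            subBatch.stream().map(Document::getContent).toList());
    // ... assign each returned vector to the matching document in subBatch
}
```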

Comment From: impactCn

@sobychacko Thank you very much, let me try it.

Comment From: markpollack

I don't think the embedding batching is going to solve the issue here, because there is another batch size involved, e.g.:

```
public void doAdd(List<Document> documents) {

    int size = documents.size();
    this.embeddingModel.embed(documents, EmbeddingOptionsBuilder.builder().build(), this.batchingStrategy);

    this.jdbcTemplate.batchUpdate(
            "INSERT INTO " + getFullyQualifiedTableName()
                    + " (id, content, metadata, embedding) VALUES (?, ?, ?::jsonb, ?) " + "ON CONFLICT (id) DO "
                    + "UPDATE SET content = ? , metadata = ?::jsonb , embedding = ? ",
            new BatchPreparedStatementSetter() {
                @Override
                public void setValues(PreparedStatement ps, int i) throws SQLException {

                    var document = documents.get(i);

                    // impl here

                }

                @Override
                public int getBatchSize() {
                    return size;
                }
            });
}

// ...
```

The overridden `public int getBatchSize()` returns the size of the whole document list, which could be, say, 100,000. The batch size should be configurable so that the documents are written to the pgvector database in multiple batch inserts rather than one giant insert that could time out.
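
A minimal sketch of what that could look like inside `doAdd`; the `maxDocumentBatchSize` value and the chunking loop are hypothetical rather than the current implementation, and `sql` stands for the upsert statement shown above:

```
// Hypothetical chunked insert: issue several bounded batchUpdate calls instead
// of one batchUpdate covering the entire document list.
int maxDocumentBatchSize = 10_000; // would come from PgVectorStore configuration

for (int from = 0; from < documents.size(); from += maxDocumentBatchSize) {
    List<Document> chunk = documents.subList(from, Math.min(from + maxDocumentBatchSize, documents.size()));
    this.jdbcTemplate.batchUpdate(sql, new BatchPreparedStatementSetter() {
        @Override
        public void setValues(PreparedStatement ps, int i) throws SQLException {
            var document = chunk.get(i);
            // ... set id, content, metadata and embedding as before
        }

        @Override
        public int getBatchSize() {
            return chunk.size(); // bounded per chunk instead of documents.size()
        }
    });
}
```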