Expected Behavior
Spring properties are exposed to control timeouts used by the clients.
Current Behavior
There's no documentation or properties to control things like connect, read, and write timeouts.
Context
The default timeout configuration results in some requests timing out. Here's an example:
org.springframework.web.client.ResourceAccessException: I/O error on POST request for "https://api.openai.com/v1/chat/completions": timeout
at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.createResourceAccessException(DefaultRestClient.java:549) ~[spring-web-6.1.3.jar:6.1.3]
at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:474) ~[spring-web-6.1.3.jar:6.1.3]
at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.retrieve(DefaultRestClient.java:439) ~[spring-web-6.1.3.jar:6.1.3]
at org.springframework.ai.openai.api.OpenAiApi.chatCompletionEntity(OpenAiApi.java:655) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
at org.springframework.ai.openai.OpenAiChatClient.chatCompletionWithTools(OpenAiChatClient.java:337) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
at org.springframework.ai.openai.OpenAiChatClient.lambda$call$1(OpenAiChatClient.java:151) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:335) ~[spring-retry-2.0.5.jar:na]
at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:211) ~[spring-retry-2.0.5.jar:na]
at org.springframework.ai.openai.OpenAiChatClient.call(OpenAiChatClient.java:147) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
...
Caused by: java.net.SocketTimeoutException: timeout
at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.kt:675) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.kt:684) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http2.Http2Stream.takeHeaders(Http2Stream.kt:143) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http2.Http2ExchangeCodec.readResponseHeaders(Http2ExchangeCodec.kt:97) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:110) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:93) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201) ~[okhttp-4.12.0.jar:na]
at okhttp3.internal.connection.RealCall.execute(RealCall.kt:154) ~[okhttp-4.12.0.jar:na]
at org.springframework.http.client.OkHttp3ClientHttpRequest.executeInternal(OkHttp3ClientHttpRequest.java:95) ~[spring-web-6.1.3.jar:6.1.3]
at org.springframework.http.client.AbstractStreamingClientHttpRequest.executeInternal(AbstractStreamingClientHttpRequest.java:70) ~[spring-web-6.1.3.jar:6.1.3]
at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:66) ~[spring-web-6.1.3.jar:6.1.3]
at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:468) ~[spring-web-6.1.3.jar:6.1.3]
Comment From: thingersoft
Quick temporary workaround:
ClientHttpRequestFactorySettings requestFactorySettings = new ClientHttpRequestFactorySettings(
connectTimeout , readTimeout , SslBundle.of(null));
ClientHttpRequestFactory requestFactory = ClientHttpRequestFactories.get(requestFactorySettings);
Builder clientBuilder = RestClient.builder().requestFactory(requestFactory);
OpenAiApi openAiApi = new OpenAiApi("https://api.openai.com", openAiToken, clientBuilder);
openAiChatClient = new OpenAiChatClient(openAiApi);
Comment From: making
you can configure timeout settings easier with RestClientCustomizer
.
@Bean
public RestClientCustomizer restClientCustomizer() {
return restClientBuilder -> restClientBuilder
.requestFactory(ClientHttpRequestFactories.get(ClientHttpRequestFactorySettings.DEFAULTS
.withConnectTimeout(Duration.ofSeconds(1))
.withReadTimeout(Duration.ofSeconds(2))));
}
This works not only for OpenAI but also for clients that use RestClient
under the hood.
Comment From: thingersoft
@making Acting on client at configuration level would affect the context-provided RestClientBuilder. So customization would apply wherever you injected a RestClientBuilder in your application and for every spring boot starter using it. That's why I proposed a programmatic approach as a workaround.
With bigger models like GPT-4 response to non-streamed requests often takes more than 1 minute to come. So I think that read timeout deserves a dedicated configuration property, avoiding the need for workarounds.
Anyway any comments or criticism is appreciated.
Comment From: making
@thingersoft Kind of agree. In the app side, timeouts can also be overridden via the builder. Timeouts are a concern not only for OpenAI but also across clients, so if the dedicated property is created at Spring AI level, I would expect that the design be common to all clients.
Comment From: thingersoft
@making Indeeed my first thought was to introduce the new property at an higher level.
Unfortunately the current codebase lacks infrastructure to define properties commons to all ModelClients. A few architectural changes was needed but being my first contribution to the project I didn't want to be too intrusive, so I falled back to an OpenAI dedicated property.
Anyway if a reviewer tells me it's ok I could open a new pull request with a more generic approach.
Comment From: markpollack
We are looking into a more generic approach, perhaps based on the generic model, perhaps based on aspects.
0.9 is when I thought would be a good point to tackle this problem. In the meantime, we can expose a few retry options for what we have now as an initial step. spring.ai.openai.chat.retry.XYZ
property, where the @ConfigurationProperty for retry would be reusable across multiple implementation eventually.
Also, in the meantime we can provide an additional constructor that takes a RetryTemplate
and in the autoconfiguraiton we would pass in a default implementation if one wasn't provided in the application context already.
Comment From: thingersoft
@markpollack
Hello Mark,
I may be missing something here since I didn't use Spring Retry before but I can't see how retry options and RetryTemplate
in particular relates to client read timeout configuration.
Comment From: thingersoft
@markpollack Hello Mark, I may be missing something here since I didn't use Spring Retry before but I can't see how retry options and
RetryTemplate
in particular relates to client read timeout configuration.
@markpollack I would be happy to contribute if you can tell me something about my quoted remark.
Comment From: markpollack
Getting back to these issues now. Just saying that retry is in the same family of concerns around connectivty.
Comment From: markpollack
Closing in favor of issue #512
I'd like to think that there is a general solution for this not related to AI. For example, an app is supposed to be able to talk to say 10 microservices, each with different timeouts.
Comment From: asoftake
you can configure timeout settings easier with
RestClientCustomizer
.
java @Bean public RestClientCustomizer restClientCustomizer() { return restClientBuilder -> restClientBuilder .requestFactory(ClientHttpRequestFactories.get(ClientHttpRequestFactorySettings.DEFAULTS .withConnectTimeout(Duration.ofSeconds(1)) .withReadTimeout(Duration.ofSeconds(2)))); }
This works not only for OpenAI but also for clients that use
RestClient
under the hood.
update Springboot3.4 remove okhttp3
Comment From: magware-dev
If you use Azure OpenAI then adding RestClientCustomizer
or Spring Boot 3.4 http.client.read-timeout
doesn't work because it uses the Azure Java OpenAIClient and not the rest client
I could make it work by providing my own OpenAIClientBuilder
bean that has a httpClient with a custom response timeout:
@Bean
fun openAiClient(connectionProperties: AzureOpenAiConnectionProperties): OpenAIClientBuilder {
val clientOptions = (ClientOptions()).setApplicationId("spring-ai")
val httpClientOptions = HttpClientOptions()
httpClientOptions.responseTimeout = Duration.ofMinutes(5)
return OpenAIClientBuilder()
.endpoint(connectionProperties.endpoint)
.credential(AzureKeyCredential(connectionProperties.apiKey))
.httpClient(HttpClient.createDefault(httpClientOptions))
.clientOptions(clientOptions)
}
NOTE: this only works if you use API key with a special endpoint (enterprise accounts I think) and I removed the custom headers stuff since I don't use it. You can copy the entire function from the autoconfiguration class and add the custom httpClient with the timeout or adjust this code to your config
I hope they add a customizer or something for that
This issue is the first organic google result for "spring ai timeout", so I comment here