Expected Behavior
Spring AI should support Groq with nothing but configuration changes.
Current Behavior
Although Groq's documentation states that it is compatible with the OpenAI API, changing only the Spring AI configuration is not enough to call Groq successfully.
Context
I'd like to be able to call Groq's API using Spring AI. I tried with the following configuration:
spring.ai.openai.api-key=${GROQ_API_KEY}
spring.ai.openai.chat.options.model=llama3-70b
spring.ai.openai.chat.base-url=https://api.groq.com/openai
spring.ai.openai.chat.options.n=1
I get the following exception when I attempt to make a chat completion call:
org.springframework.web.client.RestClientException: Error while extracting response for type [org.springframework.ai.openai.api.OpenAiApi$ChatCompletion] and content type [application/json]
at org.springframework.web.client.DefaultRestClient.readWithMessageConverters(DefaultRestClient.java:236) ~[spring-web-6.1.6.jar:6.1.6]
at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.readBody(DefaultRestClient.java:667) ~[spring-web-6.1.6.jar:6.1.6]
at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.toEntityInternal(DefaultRestClient.java:637) ~[spring-web-6.1.6.jar:6.1.6]
at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.toEntity(DefaultRestClient.java:626) ~[spring-web-6.1.6.jar:6.1.6]
at org.springframework.ai.openai.api.OpenAiApi.chatCompletionEntity(OpenAiApi.java:751) ~[spring-ai-openai-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
at org.springframework.ai.openai.OpenAiChatClient.doChatCompletion(OpenAiChatClient.java:368) ~[spring-ai-openai-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
at org.springframework.ai.openai.OpenAiChatClient.doChatCompletion(OpenAiChatClient.java:75) ~[spring-ai-openai-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
at org.springframework.ai.model.function.AbstractFunctionCallSupport.callWithFunctionSupport(AbstractFunctionCallSupport.java:124) ~[spring-ai-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
at org.springframework.ai.openai.OpenAiChatClient.lambda$call$1(OpenAiChatClient.java:143) ~[spring-ai-openai-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:335) ~[spring-retry-2.0.5.jar:na]
at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:211) ~[spring-retry-2.0.5.jar:na]
at org.springframework.ai.openai.OpenAiChatClient.call(OpenAiChatClient.java:141) ~[spring-ai-openai-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
...
Caused by: java.net.HttpRetryException: cannot retry due to server authentication, in streaming mode
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1796) ~[na:na]
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1599) ~[na:na]
at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:531) ~[na:na]
at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:307) ~[na:na]
at org.springframework.http.client.SimpleClientHttpResponse.getStatusCode(SimpleClientHttpResponse.java:55) ~[spring-web-6.1.6.jar:6.1.6]
at org.springframework.http.client.observation.DefaultClientRequestObservationConvention.outcome(DefaultClientRequestObservationConvention.java:155) ~[spring-web-6.1.6.jar:6.1.6]
at org.springframework.http.client.observation.DefaultClientRequestObservationConvention.getLowCardinalityKeyValues(DefaultClientRequestObservationConvention.java:98) ~[spring-web-6.1.6.jar:6.1.6]
at org.springframework.http.client.observation.DefaultClientRequestObservationConvention.getLowCardinalityKeyValues(DefaultClientRequestObservationConvention.java:41) ~[spring-web-6.1.6.jar:6.1.6]
at io.micrometer.observation.SimpleObservation.stop(SimpleObservation.java:174) ~[micrometer-observation-1.12.5.jar:1.12.5]
at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:499) ~[spring-web-6.1.6.jar:6.1.6]
at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.retrieve(DefaultRestClient.java:444) ~[spring-web-6.1.6.jar:6.1.6]
at org.springframework.ai.openai.api.OpenAiApi.chatCompletionEntity(OpenAiApi.java:750) ~[spring-ai-openai-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
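A note on reading the `Caused by` line: `HttpURLConnection` raises `HttpRetryException` with this message when it receives a 401 while the request body is being sent in streaming mode, because it cannot replay the body to answer the authentication challenge. The following is a minimal sketch that reproduces the pattern against a local stub using only the JDK. The class name, method names, and stub path are hypothetical, and the stub says nothing about what Groq actually returned in the reporter's environment.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.HttpRetryException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Spring AI or of the original report.
final class StreamingAuthFailureDemo {

    /** Posts a small JSON body in streaming mode and describes what the client observes. */
    static String postInStreamingMode(URI uri) throws Exception {
        byte[] body = "{}".getBytes(StandardCharsets.UTF_8);
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        // Streaming mode: the body is written straight to the socket and is not buffered for replay.
        conn.setFixedLengthStreamingMode(body.length);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        try {
            return "status " + conn.getResponseCode();
        } catch (HttpRetryException e) {
            // responseCode() carries the HTTP status that triggered the failure.
            return "HttpRetryException " + e.responseCode() + ": " + e.getReason();
        } finally {
            conn.disconnect();
        }
    }

    /** Starts a local stub that always answers 401, then calls it once. */
    static String runAgainstLocalStub() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            exchange.getRequestBody().readAllBytes(); // drain the request body first
            exchange.sendResponseHeaders(401, -1);    // 401 with no response body
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            // The path is illustrative only; the stub answers 401 for every path.
            return postInStreamingMode(URI.create("http://127.0.0.1:" + port + "/v1/chat/completions"));
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAgainstLocalStub());
    }
}
```

If the sketch reports an `HttpRetryException` carrying 401, the same pattern in the trace above would suggest checking the API key and the effective request URL before looking at response deserialization.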
Comment From: thesurlydev
Taking a quick look at a response from Groq using curl, it appears the response object differs slightly from the ChatCompletion response. Here's an example response from Groq:
```
{
"id": "chatcmpl-ee2adcbd-3697-47e9-88b9-d4a5c4ba2032",
"object": "chat.completion",
"created": 1713808705,
"model": "mixtral-8x7b-32768",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Fast language models are important for a variety of reasons, including:\n\n1. Real-time applications: Fast language models can process and generate text in real-time, making them well-suited for applications such as chatbots, virtual assistants, and real-time translation.\n2. Large-scale processing: Fast language models can handle large volumes of text data quickly and efficiently, making them useful for tasks such as indexing and searching large corpora of text.\n3. Low-resource environments: Fast language models can run on devices with limited computational resources, such as smartphones and embedded devices, making natural language processing (NLP) capabilities accessible to a wider range of users and applications.\n4. Interactive exploration: Fast language models can be used to interactively explore and manipulate text data, allowing users to quickly and easily experiment with different text generation prompts and settings.\n5. Cost-effective: Fast language models can be less computationally intensive, which can result in lower costs for training and deployment, as well as reduced energy consumption.\n\nOverall, fast language models are important for enabling a wide range of NLP applications, from real-time chatbots to large-scale text processing, that can run efficiently and effectively in a variety of environments."
},
"logprobs": null,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 18,
"prompt_time": 0.006,
"completion_tokens": 267,
"completion_time": 0.472,
"total_tokens": 285,
"total_time": 0.478
},
"system_fingerprint": "fp_7b44c65f25",
"x_groq": {
"id": "req_01hw3fb11dft4t8ny3htzw8ga3"
}
}
```
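To make the difference concrete, here is a small sketch that compares the field names in the sample above with the fields of an OpenAI chat completion. The OpenAI field lists are an assumption based on the OpenAI API at the time of this thread, and the class and method names are hypothetical. Whether the extra fields actually break deserialization depends on how the JSON mapper treats unknown properties, which this thread does not establish.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; not part of Spring AI.
final class GroqFieldDiff {

    // Assumption: field names of an OpenAI chat completion and its usage block.
    static final Set<String> OPENAI_TOP_LEVEL = Set.of(
            "id", "object", "created", "model", "choices", "usage", "system_fingerprint");
    static final Set<String> OPENAI_USAGE = Set.of(
            "prompt_tokens", "completion_tokens", "total_tokens");

    // Field names copied from the Groq sample response above.
    static final Set<String> GROQ_TOP_LEVEL = Set.of(
            "id", "object", "created", "model", "choices", "usage", "system_fingerprint", "x_groq");
    static final Set<String> GROQ_USAGE = Set.of(
            "prompt_tokens", "prompt_time", "completion_tokens", "completion_time",
            "total_tokens", "total_time");

    /** Returns the names present in {@code actual} but not in {@code expected}, sorted. */
    static Set<String> extras(Set<String> actual, Set<String> expected) {
        Set<String> diff = new TreeSet<>(actual);
        diff.removeAll(expected);
        return diff;
    }

    public static void main(String[] args) {
        System.out.println("extra top-level fields: " + extras(GROQ_TOP_LEVEL, OPENAI_TOP_LEVEL));
        System.out.println("extra usage fields: " + extras(GROQ_USAGE, OPENAI_USAGE));
    }
}
```

Under these assumptions the extras are `x_groq` at the top level and `prompt_time`, `completion_time`, and `total_time` inside `usage`. A mapper that rejects unknown properties would fail on them, while one that ignores unknown properties would not.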
Comment From: Mikl38400
Did you find a way to make it work?
Comment From: thesurlydev
No, although I didn't check how easy it would be to override the response handling. If that isn't feasible, it may be necessary to add explicit support for Groq.
Comment From: tzolov
Resolved by a6bed95358ab2cd5a3b3ec9a1a614a1b7ae610aa