Bug Report With the update to Spring AI 0.8.0, the ChatClient from the Ollama starter returns a 404 status code. It seems that instead of 'http://localhost:11434/api/generate' the endpoint 'http://localhost:11434/api/chat' is called. The posted request content is for a chat endpoint too. The OpenAI endpoint works fine.
Environment OS: Ubuntu. Spring AI with 'spring-ai-ollama-spring-boot-starter', 'spring-ai-transformers-spring-boot-starter', 'spring-ai-pgvector-store', 'spring-ai-tika-document-reader', 'spring-boot-starter-web'
Steps to reproduce
- Create an Ollama ChatClient
- Call 'ChatClient.generate(prompt);' (a minimal sketch follows below)
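For reference, a minimal sketch of the reproduction, assuming the auto-configured ChatClient bean from the Ollama starter and the generate(String) method used above (the class name and package are illustrative, matching the 0.8.0 snapshot referenced in this report):

```java
// Sketch of the reproduction: the Ollama starter auto-configures a ChatClient bean,
// and calling generate(...) issues the HTTP request to the Ollama server.
import org.springframework.ai.chat.ChatClient;
import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class OllamaRepro {

    @Bean
    CommandLineRunner repro(ChatClient chatClient) {
        // With Spring AI 0.8.0 this call is sent to /api/chat rather than /api/generate.
        return args -> System.out.println(chatClient.generate("Why is the sky blue?"));
    }
}
```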
Expected behavior The Ollama ChatClient should call 'http://localhost:11434/api/generate' with the request content for the generate endpoint.
Minimal Complete Reproducible example
- Clone the AIDocumentLibraryChat project
- Build it with the 'useOllama' build property
- Start the PostgreSQL vector store with the commands in 'runPostgresql.sh'
- Start the Ollama model with the commands in 'runOllama.sh'
- Start the application with the 'ollama' profile
- Open the UI in 'localhost:8080'
- Import one PDF document
- Click on 'Search' and type in a question and hit 'Search'
Comment From: markpollack
Yikes! Thanks for reporting this. We don't yet have our CI environment running across all the model providers.
Comment From: tzolov
@Angular2Guy , I'm not sure I understand the issue.
At the lower (model API) level, the OllamaApi class implements both the /api/generate and the superior /api/chat endpoints.
The higher-level OllamaChatClient, as its name suggests, deliberately leverages the /api/chat endpoint. Unlike /api/generate, /api/chat supports message conversation state!
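To illustrate the difference in request content, here is a sketch of the two payload shapes from the Ollama API documentation, sent with Spring's RestClient rather than the OllamaApi class (model name and prompt are just examples from this thread):

```java
// Sketch: the two Ollama request shapes.
// /api/generate takes a single prompt string; /api/chat takes a list of role/content
// messages, which is what carries conversation state across turns.
import org.springframework.http.MediaType;
import org.springframework.web.client.RestClient;

public class OllamaEndpointsSketch {
    public static void main(String[] args) {
        RestClient client = RestClient.create("http://localhost:11434");

        String generateBody = """
                {"model": "falcon", "prompt": "Why is the sky blue?", "stream": false}""";
        String chatBody = """
                {"model": "falcon", "stream": false,
                 "messages": [{"role": "user", "content": "Why is the sky blue?"}]}""";

        String generate = client.post().uri("/api/generate")
                .contentType(MediaType.APPLICATION_JSON).body(generateBody)
                .retrieve().body(String.class);
        String chat = client.post().uri("/api/chat")
                .contentType(MediaType.APPLICATION_JSON).body(chatBody)
                .retrieve().body(String.class);

        System.out.println(generate);
        System.out.println(chat);
    }
}
```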
The Ollama README provides a brief description of both the low-level API and the OllamaChatClient.
Furthermore, you can consult the integration tests in:
* https://github.com/spring-projects/spring-ai/tree/main/models/spring-ai-ollama/src/test/java/org/springframework/ai/ollama and
* https://github.com/spring-projects/spring-ai/tree/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/ollama
Comment From: Angular2Guy
@tzolov Hello, in my project I try to support the OpenAI endpoints and Ollama (stable-beluga, falcon) models. With OpenAI the ChatClient works fine. With Ollama and its models the ChatClient tries to access the endpoint '/api/chat', which does not currently exist. The Ollama documentation shows sample requests like:
curl -X POST http://localhost:11434/api/generate -d '{ "model": "falcon", "prompt": "Why is the sky blue?" }'
That works.
I would like to use the ChatClient interface to access the OpenAI endpoints and the Ollama endpoints based on the build configuration (included starters). That means the OllamaChatClient needs to use an endpoint that Ollama provides.
A configuration property could switch between the endpoints.
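For illustration, a sketch of coding only against the ChatClient interface so that whichever starter is on the classpath supplies the implementation (the service class name is hypothetical, and the generate(String) method is the one used earlier in this issue):

```java
// Sketch: depend only on the ChatClient interface; the included starter
// (OpenAI or Ollama) contributes the auto-configured implementation bean.
import org.springframework.ai.chat.ChatClient;
import org.springframework.stereotype.Service;

@Service
public class DocumentSearchService {

    private final ChatClient chatClient;

    public DocumentSearchService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String answer(String question) {
        // Same call for OpenAI and Ollama; only the backing endpoint differs.
        return chatClient.generate(question);
    }
}
```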
Comment From: tzolov
@Angular2Guy , the /api/chat endpoint exists and is well documented in the Ollama documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-chat-completion
Furthermore, as I pointed out in the previous comment, our integration tests (running Ollama in Docker) successfully exercise the /api/chat endpoint, including with the falcon model. It runs fine in both sync and streaming modes. You can give it a try yourself: run the OllamaChatAutoConfigurationIT, replace orca-mini with falcon, and remove the @Disabled annotation.
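If you want to try this outside of the Spring AI build, here is a rough sketch of running Ollama in Docker via Testcontainers and pulling the falcon model. This is not the actual OllamaChatAutoConfigurationIT; the image name and setup are assumptions:

```java
// Sketch: start an Ollama container, pull the model, and print the base URL
// so a ChatClient or the OllamaApi can be pointed at it.
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.utility.DockerImageName;

public class OllamaContainerSketch {
    public static void main(String[] args) throws Exception {
        try (GenericContainer<?> ollama =
                     new GenericContainer<>(DockerImageName.parse("ollama/ollama"))
                             .withExposedPorts(11434)) {
            ollama.start();
            // Pull the model inside the container before calling /api/chat against it.
            ollama.execInContainer("ollama", "pull", "falcon");
            String baseUrl = "http://" + ollama.getHost() + ":" + ollama.getMappedPort(11434);
            System.out.println("Ollama reachable at " + baseUrl);
        }
    }
}
```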
And lastly, /api/generate is inferior compared to /api/chat, as it doesn't properly support message conversations (e.g. a list of messages). Therefore it is not appropriate for use by the ChatClient.
You can still access /api/generate using the OllamaApi.
Comment From: Angular2Guy
@tzolov You are right. Ollama has added the endpoint. I have updated the Ollama Docker image and now it works. Your responses have pushed me in the right direction.