Bug Report With the update to Spring AI 0.8.0, the ChatClient from the Ollama starter returns a 404 status code. It seems that instead of 'http://localhost:11434/api/generate' the endpoint 'http://localhost:11434/api/chat' is called. The posted request content is for a chat endpoint too. The OpenAI endpoint works fine.
Environment OS: Ubuntu. Spring AI with 'spring-ai-ollama-spring-boot-starter', 'spring-ai-transformers-spring-boot-starter', 'spring-ai-pgvector-store', 'spring-ai-tika-document-reader', 'spring-boot-starter-web'
Steps to reproduce
- Create an Ollama ChatClient
- Call 'ChatClient.generate(prompt);' (a minimal sketch follows below)
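For reference, a minimal sketch of the reproduction, assuming the auto-configured ChatClient bean from the Ollama starter and the generate(String) method used above (the class name and package are illustrative, matching the 0.8.0 snapshot referenced in this report):

```java
// Sketch of the reproduction: the Ollama starter auto-configures a ChatClient bean,
// and calling generate(...) issues the HTTP request to the Ollama server.
import org.springframework.ai.chat.ChatClient;
import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class OllamaRepro {

    @Bean
    CommandLineRunner repro(ChatClient chatClient) {
        // With Spring AI 0.8.0 this call is sent to /api/chat rather than /api/generate.
        return args -> System.out.println(chatClient.generate("Why is the sky blue?"));
    }
}
```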
Expected behavior The Ollama ChatClient should call 'http://localhost:11434/api/generate' with the request content for the generate endpoint.
Minimal Complete Reproducible example
- Clone the AIDocumentLibraryChat project
- Build it with the 'useOllama' build property
- Start the PostgreSQL vector store with the commands in 'runPostgresql.sh'
- Start the Ollama model with the commands in 'runOllama.sh'
- Start the application with the 'ollama' profile
- Open the UI in 'localhost:8080'
- Import one PDF document
- Click on 'Search' and type in a question and hit 'Search'
Comment From: markpollack
Yikes! Thanks for reporting this. We don't yet have our CI environment running across all the model providers.
Comment From: tzolov
@Angular2Guy , I'm not sure I understand the issue.
At the lower (model API) level, the OllamaApi class implements both the /api/generate and the superior /api/chat endpoints.
The higher-level OllamaChatClient, as its name suggests, deliberately leverages the /api/chat endpoint. Unlike /api/generate, /api/chat supports message conversation state!
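To illustrate the difference in request content, here is a sketch of the two payload shapes from the Ollama API documentation, sent with Spring's RestClient rather than the OllamaApi class (model name and prompt are just examples from this thread):

```java
// Sketch: the two Ollama request shapes.
// /api/generate takes a single prompt string; /api/chat takes a list of role/content
// messages, which is what carries conversation state across turns.
import org.springframework.http.MediaType;
import org.springframework.web.client.RestClient;

public class OllamaEndpointsSketch {
    public static void main(String[] args) {
        RestClient client = RestClient.create("http://localhost:11434");

        String generateBody = """
                {"model": "falcon", "prompt": "Why is the sky blue?", "stream": false}""";
        String chatBody = """
                {"model": "falcon", "stream": false,
                 "messages": [{"role": "user", "content": "Why is the sky blue?"}]}""";

        String generate = client.post().uri("/api/generate")
                .contentType(MediaType.APPLICATION_JSON).body(generateBody)
                .retrieve().body(String.class);
        String chat = client.post().uri("/api/chat")
                .contentType(MediaType.APPLICATION_JSON).body(chatBody)
                .retrieve().body(String.class);

        System.out.println(generate);
        System.out.println(chat);
    }
}
```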
The Ollama README provides a brief description of both the low-level API and the OllamaChatClient.
Furthermore, you can consult the integration tests in:
* https://github.com/spring-projects/spring-ai/tree/main/models/spring-ai-ollama/src/test/java/org/springframework/ai/ollama and
* https://github.com/spring-projects/spring-ai/tree/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/ollama
Comment From: Angular2Guy
@tzolov Hello, in my project I try to support the OpenAI endpoints and Ollama (stable-beluga, falcon) models. With OpenAI the ChatClient works fine. With Ollama and its models the ChatClient tries to access the endpoint '/api/chat', which does not currently exist. The Ollama documentation shows sample requests like:
curl -X POST http://localhost:11434/api/generate -d '{ "model": "falcon", "prompt": "Why is the sky blue?" }'
That works.
I would like to use the ChatClient interface to access the OpenAI endpoints and the Ollama endpoints based on the build configuration (included starters). That means the OllamaChatClient needs to use an endpoint that Ollama provides.
A configuration property could switch between the endpoints.
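For illustration, a sketch of coding only against the ChatClient interface so that whichever starter is on the classpath supplies the implementation (the service class name is hypothetical, and the generate(String) method is the one used earlier in this issue):

```java
// Sketch: depend only on the ChatClient interface; the included starter
// (OpenAI or Ollama) contributes the auto-configured implementation bean.
import org.springframework.ai.chat.ChatClient;
import org.springframework.stereotype.Service;

@Service
public class DocumentSearchService {

    private final ChatClient chatClient;

    public DocumentSearchService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String answer(String question) {
        // Same call for OpenAI and Ollama; only the backing endpoint differs.
        return chatClient.generate(question);
    }
}
```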
Comment From: tzolov
@Angular2Guy , the /api/chat endpoint exists and is well documented in the Ollama documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-chat-completion
Furthermore, as I pointed out in the previous comment, our integration tests (running Ollama in Docker) successfully exercise the /api/chat endpoint, including with the falcon model. It runs fine in both sync and streaming modes. You can give it a try yourself: run the OllamaChatAutoConfigurationIT, replace orca-mini with falcon, and remove the @Disabled annotation.
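If you want to try this outside of the Spring AI build, here is a rough sketch of running Ollama in Docker via Testcontainers and pulling the falcon model. This is not the actual OllamaChatAutoConfigurationIT; the image name and setup are assumptions:

```java
// Sketch: start an Ollama container, pull the model, and print the base URL
// so a ChatClient or the OllamaApi can be pointed at it.
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.utility.DockerImageName;

public class OllamaContainerSketch {
    public static void main(String[] args) throws Exception {
        try (GenericContainer<?> ollama =
                     new GenericContainer<>(DockerImageName.parse("ollama/ollama"))
                             .withExposedPorts(11434)) {
            ollama.start();
            // Pull the model inside the container before calling /api/chat against it.
            ollama.execInContainer("ollama", "pull", "falcon");
            String baseUrl = "http://" + ollama.getHost() + ":" + ollama.getMappedPort(11434);
            System.out.println("Ollama reachable at " + baseUrl);
        }
    }
}
```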
And lastly, /api/generate is inferior compared to /api/chat, as it doesn't properly support message conversations (e.g. a list of messages). Therefore it is not appropriate for use by the ChatClient.
You can still access /api/generate using the OllamaApi.
Comment From: Angular2Guy
@tzolov You are right. Ollama has added the endpoint. I have updated the Ollama Docker image and now it works. Your responses have pushed me in the right direction.