The list of options is long, and the same list is used for both chat and embedding. It seems that some options apply only to chat and others only to embedding. Refine the list of options for each case.

Comment From: alvaroblazmon

Hi @markpollack,

I've reviewed this issue against the Ollama API documentation, and it seems there is no difference between the chat and embedding options.

I checked these links:

https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-embeddings
https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-chat-completion

Both documentation pages point to the same Modelfile parameter list:

https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values

So I'm wondering whether you have more information on where I could find the differences between the two sets of options.

Thank you!

Comment From: tzolov

Hi @alvarozizou, I had the same observation while implementing OllamaApi, OllamaChatClient, and OllamaEmbeddingClient, so I left the same options parameter in place for both chat and embedding. I don't think parameters like temperature, topP, and the like make sense for embedding. There are also many GPU, memory, and other OS-related configuration options that aren't even documented.
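
For illustration, here is a minimal sketch using plain java.net.http against a local Ollama instance (localhost:11434 is the default port; llama2 and the option values are just examples). It shows that both endpoints accept the identical options block, even though sampling options like temperature have no obvious meaning for an embedding:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class OllamaOptionsProbe {

    static final HttpClient CLIENT = HttpClient.newHttpClient();

    // The identical "options" fragment is used in both request bodies; only the
    // endpoint and the payload shape (messages vs. prompt) differ.
    static final String OPTIONS = "\"options\": {\"temperature\": 0.7, \"num_ctx\": 2048}";

    static String post(String path, String body) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:11434" + path))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        String chat = """
                {"model": "llama2", "stream": false,
                 "messages": [{"role": "user", "content": "Hello"}],
                 %s}""".formatted(OPTIONS);
        String embed = """
                {"model": "llama2", "prompt": "Hello",
                 %s}""".formatted(OPTIONS);

        // Ollama accepts the same options block on both endpoints without complaint.
        System.out.println(post("/api/chat", chat));
        System.out.println(post("/api/embeddings", embed));
    }
}
```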

To be on the safe side, Spring AI doesn't try to impose any opinion on the options. Purely for consistency, it adds an additional, artificial model parameter that is used to configure the request but is removed from the options actually passed to the Ollama API.
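
Sketched as plain maps (buildRequest is a hypothetical helper for illustration, not Spring AI code), the pattern looks roughly like this:

```java
import java.util.HashMap;
import java.util.Map;

public class ModelOptionHandling {

    // Hypothetical sketch of the pattern described above: "model" rides along in
    // the options map for configuration convenience, but is lifted out into the
    // top-level request field and never sent inside "options" to Ollama.
    static Map<String, Object> buildRequest(Map<String, Object> options) {
        Map<String, Object> ollamaOptions = new HashMap<>(options);
        Object model = ollamaOptions.remove("model"); // strip the artificial parameter

        Map<String, Object> request = new HashMap<>();
        request.put("model", model);
        request.put("options", ollamaOptions); // only genuine Ollama options remain
        return request;
    }

    public static void main(String[] args) {
        Map<String, Object> request = buildRequest(
                Map.of("model", "llama2", "temperature", 0.7, "num_ctx", 2048));
        // Prints something like {model=llama2, options={temperature=0.7, num_ctx=2048}}
        // (map iteration order may vary).
        System.out.println(request);
    }
}
```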

I guess any questions about the right options for the chat or embedding endpoints should be directed to the Ollama GitHub repo instead.

Comment From: tzolov

As a reference: https://github.com/ollama/ollama/issues/2349

Comment From: markpollack

Seems like there isn't a need to make a distinction; you just need to use the options required for each model and sort it out via documentation or experimentation.
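
As a rough sketch of what that looks like in practice, assuming the 0.8-era fluent client API (method names are from memory and may differ between milestones; nomic-embed-text is just an example embedding model):

```java
import org.springframework.ai.ollama.OllamaChatClient;
import org.springframework.ai.ollama.OllamaEmbeddingClient;
import org.springframework.ai.ollama.api.OllamaApi;
import org.springframework.ai.ollama.api.OllamaOptions;

public class PerModelOptions {

    public static void main(String[] args) {
        OllamaApi api = new OllamaApi(); // defaults to http://localhost:11434

        // Chat: sampling options such as temperature/topP are meaningful here.
        OllamaChatClient chat = new OllamaChatClient(api)
                .withDefaultOptions(OllamaOptions.create()
                        .withModel("llama2")
                        .withTemperature(0.7f)
                        .withTopP(0.9f));

        // Embedding: the same options type is accepted, but per the discussion
        // above, likely only the model (and resource-level settings) matter.
        OllamaEmbeddingClient embedding = new OllamaEmbeddingClient(api)
                .withDefaultOptions(OllamaOptions.create()
                        .withModel("nomic-embed-text"));

        System.out.println(chat.call("Hello"));
        System.out.println(embedding.embed("Hello").size());
    }
}
```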