Does Spring AI support vLLM + Qwen? Which starter can I use, and how do I connect Spring AI to vLLM? Thanks.

Comment From: codespearhead

vLLM currently claims to have an OpenAI-compatible server [1], so you just have to use spring-ai-openai-spring-boot-starter and set the spring.ai.openai.base-url property [2] accordingly.

[1] https://docs.vllm.ai/en/stable/getting_started/quickstart.html#openai-compatible-server
[2] https://docs.spring.io/spring-ai/reference/api/chat/openai-chat.html#_connection_properties
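
For example, a minimal application.properties sketch (assumptions: vLLM is serving its OpenAI-compatible API on its default port 8000, and the api-key is a dummy value, since the starter expects the property to be set even when the server does not check it; the model value must match the model name the vLLM server was started with):

spring.ai.openai.base-url=http://localhost:8000
spring.ai.openai.api-key=dummy-key
spring.ai.openai.chat.options.model=/data1/pretrained_models/Qwen/Qwen1.5-32B-Chat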

Comment From: zzllkk2003

> vLLM currently claims to have an OpenAI-compatible server [1], so you just have to use spring-ai-openai-spring-boot-starter and set the spring.ai.openai.base-url property [2] accordingly.
>
> [1] https://docs.vllm.ai/en/stable/getting_started/quickstart.html#openai-compatible-server
> [2] https://docs.spring.io/spring-ai/reference/api/chat/openai-chat.html#_connection_properties

It seems that a recent change has altered the structure of the request body; see the MediaContent change [1]. With spring-ai-openai-spring-boot-starter 1.0.0.M1, it is impossible to call the vLLM API.

[1] https://github.com/spring-projects/spring-ai/commit/834d2d04879c080da208bdde7cda5aea7c48f585

Request body sent by spring-ai-openai-spring-boot-starter 1.0.0.M1:

{
    "messages": [
        {
            "content": [
                {
                    "text": "Tell me a joke",
                    "type": "text"
                }
            ],
            "role": "user"
        }
    ],
    "model": "/data1/pretrained_models/Qwen/Qwen1.5-32B-Chat",
    "stream": false,
    "temperature": 0.7
}

Request body expected by the vLLM API:

{
    "max_tokens": 4000,
    "messages": [
        {
            "content": "Tell me a joke",
            "role": "user"
        }
    ],
    "model": "/data1/pretrained_models/Qwen/Qwen1.5-32B-Chat",
    "stream": false,
    "temperature": 0.7
}

So how can I use spring-ai-openai-spring-boot-starter with vLLM?

Comment From: tzolov

@zzllkk2003 have you tried 1.0.0-SNAPSHOT? We recently made a change to relax the OpenAI Chat Model implementation so it supports backends that are not fully API-compliant, such as Groq and the like.
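
For anyone who wants to try it, a sketch of the Maven changes needed to pull the snapshot build (assumes a Maven build; Gradle users would add the same repository and coordinates in their build script):

<repositories>
    <repository>
        <id>spring-snapshots</id>
        <url>https://repo.spring.io/snapshot</url>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
</repositories>

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>1.0.0-SNAPSHOT</version>
</dependency>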

Comment From: asaikali

I had the same issue with 1.0.0.M1; upgrading to 1.0.0-SNAPSHOT fixed it.

Comment From: tzolov

@zzllkk2003, as @asaikali pointed out, in 1.0.0-SNAPSHOT we applied the fix from https://github.com/spring-projects/spring-ai/pull/863 to provide support for OpenAI API proxies that are not fully compliant, such as Groq and vLLM.

Comment From: tzolov

Let me know how it goes; we will provide dedicated documentation on how to use vLLM with the existing OpenAI Chat Model. Also, when you set the base URL, please DROP the /v1 section.
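
Once the base URL is set (without /v1), a quick way to verify the connection is a plain ChatClient call. This is only a sketch: the controller class and endpoint path are illustrative, and it assumes the auto-configured ChatClient.Builder provided by the starter:

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
class JokeController {

    private final ChatClient chatClient;

    JokeController(ChatClient.Builder builder) {
        // ChatClient.Builder is auto-configured by spring-ai-openai-spring-boot-starter
        this.chatClient = builder.build();
    }

    @GetMapping("/joke")
    String joke() {
        // Sends a chat completion request to the configured vLLM OpenAI-compatible endpoint
        return this.chatClient.prompt()
                .user("Tell me a joke")
                .call()
                .content();
    }
}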

Comment From: qianwch

> I had the same issue with 1.0.0.M1; upgrading to 1.0.0-SNAPSHOT fixed it.

I have tried 1.0.0-SNAPSHOT, but still no luck with vLLM. Do I need to add any option settings?

Comment From: renwayle

I had the same issue with 1.0.0.M2. Do I need to add any option settings?

Comment From: renwayle

When will support for not fully compliant OpenAI API proxies such as Groq and vLLM be provided?