fix issue #1727 when using text-generation-inference models
I think the openapi.json needs to be fixed. You can see that the response type is an array here:
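For illustration, the relevant part of openapi.json would declare the success response as an array of generation objects rather than a bare object. This is a hypothetical excerpt based on the description in this thread; the exact path and schema name (`GenerateResponse`) are assumptions:

```json
{
  "paths": {
    "/": {
      "post": {
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "type": "array",
                  "items": { "$ref": "#/components/schemas/GenerateResponse" }
                }
              }
            }
          }
        }
      }
    }
  }
}
```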
Test models:
microsoft/Phi-3-mini-4k-instruct
mistralai/Mistral-7B-Instruct-v0.3
Qwen/Qwen2.5-Coder-32B-Instruct
Comment From: jitokim
I found that the response type is an array here:
// wrap generation inside a Vec to match api-inference
Ok((headers, Json(vec![generation])).into_response())
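The server-side wrapping above means a client receives a one-element JSON array rather than a bare object. A minimal Python sketch of the unwrapping a client would need to do; the field name `generated_text` is an assumption based on the api-inference response shape, not taken from this thread:

```python
import json


def parse_tgi_response(body: str) -> str:
    """Unwrap the array-shaped text-generation-inference response.

    The server wraps the single generation in a Vec to match
    api-inference, so index into the first element before reading fields.
    """
    payload = json.loads(body)
    # Defensive: accept both the wrapped (list) and bare (dict) shapes.
    if isinstance(payload, list):
        payload = payload[0]
    # Field name assumed for illustration.
    return payload["generated_text"]


# Example with the array-wrapped shape the server emits:
body = '[{"generated_text": "Hello, world!"}]'
print(parse_tgi_response(body))  # Hello, world!
```

Handling both shapes keeps the client compatible with servers built before and after this change.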
Comment From: ilayaperumalg
Hi @jitokim, thanks for the fix! LGTM; merged as 3c14fa63 after updating ClientIT's prompt message to insist on strict JSON output, so that the test assertion is more likely to pass.
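The point of prompting for strict JSON is that the test can then compare parsed structures instead of brittle Markdown strings. A hedged sketch of such an assertion (not the actual ClientIT code, which is a Java integration test):

```python
import json


def assert_json_equivalent(model_output: str, expected: dict) -> None:
    """Parse the model's reply as JSON and compare structurally.

    Comparing parsed objects ignores whitespace and key order, so the
    assertion tolerates harmless formatting variation in the reply.
    """
    actual = json.loads(model_output)
    assert actual == expected, f"{actual!r} != {expected!r}"


# Key order and whitespace differ, but the parsed objects match:
assert_json_equivalent('{"capital": "Paris", "country": "France"}',
                       {"country": "France", "capital": "Paris"})
```

A string comparison would fail here; the structural comparison passes, which is why steering the model toward strict JSON makes the assertion more robust.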
Comment From: jitokim
@ilayaperumalg Hi! I was wondering how to compare the Markdown-formatted output against the expected value, and you solved it in a very smart way. I've learned a great approach from you. Thank you!
Comment From: ilayaperumalg
@jitokim Thank you for the kind words! It was the suggestion from @markpollack!