Expected Behavior
Support a reactive (non-blocking) model with OpenAI, such as the Azure OpenAIAsyncClient: https://learn.microsoft.com/en-us/java/api/com.azure.ai.openai.openaiasyncclient?view=azure-java-preview
Current Behavior
Currently, OpenAI calls are blocking.
Context
We currently have a WebFlux application using Azure OpenAI; mixing async and sync code is not ideal.
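For reference, a minimal sketch of the kind of non-blocking call the Azure SDK already exposes (class and method names follow the linked OpenAIAsyncClient docs for a recent preview; the endpoint, key, and deployment name are placeholders):

```java
import com.azure.ai.openai.OpenAIAsyncClient;
import com.azure.ai.openai.OpenAIClientBuilder;
import com.azure.ai.openai.models.ChatCompletions;
import com.azure.ai.openai.models.ChatCompletionsOptions;
import com.azure.ai.openai.models.ChatRequestUserMessage;
import com.azure.core.credential.AzureKeyCredential;
import reactor.core.publisher.Mono;

import java.util.List;

public class AzureAsyncExample {

    public static void main(String[] args) {
        // Build the async (reactive) variant of the Azure OpenAI client.
        OpenAIAsyncClient client = new OpenAIClientBuilder()
                .endpoint("https://<your-resource>.openai.azure.com") // placeholder
                .credential(new AzureKeyCredential("<your-api-key>")) // placeholder
                .buildAsyncClient();

        // getChatCompletions returns a Mono: one complete, non-streaming
        // response, delivered without blocking the calling thread.
        Mono<ChatCompletions> completions = client.getChatCompletions(
                "<deployment-name>", // placeholder
                new ChatCompletionsOptions(List.of(new ChatRequestUserMessage("Hello"))));

        completions.subscribe(c ->
                System.out.println(c.getChoices().get(0).getMessage().getContent()));
    }
}
```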
Comment From: tzolov
@liemng have you looked at the StreamingModelClient.java chat-client implementations?
OpenAiChatClient#stream(Prompt prompt): Flux<ChatResponse>
OllamaChatClient#stream(Prompt prompt): Flux<ChatResponse>
BedrockTitanChatClient#stream(Prompt prompt): Flux<ChatResponse>
BedrockLlama2ChatClient#stream(Prompt prompt): Flux<ChatResponse>
BedrockCohereChatClient#stream(Prompt prompt): Flux<ChatResponse>
BedrockAnthropicChatClient#stream(Prompt prompt): Flux<ChatResponse>
AzureOpenAiChatClient#stream(Prompt prompt): Flux<ChatResponse>
They implement the streaming chat responses provided by the underlying models and return a reactive Flux<ChatResponse>.
You can check the various tests for examples of how to use them.
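For example, a minimal usage sketch, assuming a configured OpenAiChatClient is injected as a bean:

```java
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.openai.OpenAiChatClient;
import reactor.core.publisher.Flux;

public class StreamingExample {

    private final OpenAiChatClient chatClient;

    public StreamingExample(OpenAiChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public Flux<String> streamAnswer(String question) {
        // stream(Prompt) emits partial ChatResponse chunks as tokens arrive,
        // rather than blocking until the full completion is available.
        return chatClient.stream(new Prompt(question))
                .map(response -> response.getResult().getOutput().getContent());
    }
}
```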
Comment From: markpollack
Closing as the classes listed support async functionality.
Comment From: liemng
@markpollack , Sorry for responding late... California outage :(
Streaming is different from non-blocking I/O. What I am referring to are non-streaming, non-blocking requests, like: https://learn.microsoft.com/en-us/java/api/com.azure.ai.openai.openaiasyncclient?view=azure-java-preview#com-azure-ai-openai-openaiasyncclient-getchatcompletions(java-lang-string-com-azure-ai-openai-models-chatcompletionsoptions).
Is there a plan to support non-streaming, non-blocking I/O?
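Until that is supported, a possible stopgap (just a sketch, not a real fix) is to wrap the blocking call in a Mono on Reactor's bounded-elastic scheduler. A worker thread still blocks, which is exactly why a native non-blocking API is needed:

```java
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.openai.OpenAiChatClient;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public class BlockingCallAdapter {

    private final OpenAiChatClient chatClient;

    public BlockingCallAdapter(OpenAiChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public Mono<ChatResponse> callAsync(Prompt prompt) {
        // Offload the blocking call() to the bounded-elastic scheduler so the
        // WebFlux event loop is never blocked. A worker thread still blocks,
        // so this is a stopgap, not true non-blocking I/O.
        return Mono.fromCallable(() -> chatClient.call(prompt))
                .subscribeOn(Schedulers.boundedElastic());
    }
}
```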
Comment From: XhstormR
Yes, we need support for async APIs that can return Mono<ChatResponse>.
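For illustration only, a hypothetical shape of such an API (the AsyncChatClient name is invented for this comment, not an existing Spring AI type):

```java
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import reactor.core.publisher.Mono;

// Hypothetical interface sketched for this discussion.
public interface AsyncChatClient {

    // A single, non-streaming completion delivered without blocking the caller.
    Mono<ChatResponse> call(Prompt prompt);
}
```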