Expected Behavior

Support a reactive model with OpenAI, such as https://learn.microsoft.com/en-us/java/api/com.azure.ai.openai.openaiasyncclient?view=azure-java-preview

Current Behavior

Currently, OpenAI calls are blocking.

Context

We currently have a WebFlux application using Azure OpenAI; mixing async and sync code is not ideal.
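
To illustrate the mismatch, here is a minimal sketch of a WebFlux handler wrapping the current blocking client. It assumes the Spring AI `ChatClient` with its blocking `call(String)` method; class and package names follow the current milestone and may differ in other versions:

```java
import org.springframework.ai.chat.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

@RestController
class ChatController {

    private final ChatClient chatClient; // blocking Spring AI client

    ChatController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @GetMapping("/ask")
    Mono<String> ask(@RequestParam String question) {
        // The call blocks, so it has to be wrapped and pushed off the event
        // loop onto a bounded-elastic thread; a truly non-blocking client
        // would make this workaround unnecessary.
        return Mono.fromCallable(() -> chatClient.call(question))
                .subscribeOn(Schedulers.boundedElastic());
    }
}
```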

Comment From: tzolov

@liemng have you looked at the StreamingModelClient.java chat-client implementations?

  • OpenAiChatClient#stream(Prompt prompt): Flux<ChatResponse>
  • OllamaChatClient#stream(Prompt prompt): Flux<ChatResponse>
  • BedrockTitanChatClient#stream(Prompt prompt): Flux<ChatResponse>
  • BedrockLlama2ChatClient#stream(Prompt prompt): Flux<ChatResponse>
  • BedrockCohereChatClient#stream(Prompt prompt): Flux<ChatResponse>
  • BedrockAnthropicChatClient#stream(Prompt prompt): Flux<ChatResponse>
  • AzureOpenAiChatClient#stream(Prompt prompt): Flux<ChatResponse>

They implement the streaming chat responses provided by the underlying models and return a reactive Flux<ChatResponse>.

You can check the various tests for examples of how to use it.
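
As a rough illustration (not copied from the tests), a usage sketch with the OpenAI implementation; the package names and the getResult().getOutput().getContent() accessors reflect the current milestone and may change:

```java
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.openai.OpenAiChatClient;
import reactor.core.publisher.Flux;

class StreamingExample {

    void streamJoke(OpenAiChatClient openAiChatClient) {
        // stream(Prompt) returns a Flux that emits chat responses reactively
        Flux<ChatResponse> responses = openAiChatClient.stream(new Prompt("Tell me a joke"));

        responses
                .map(response -> response.getResult().getOutput().getContent())
                .subscribe(System.out::print);
    }
}
```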

Comment From: markpollack

Closing as the classes listed support async functionality.

Comment From: liemng

@markpollack, sorry for responding late... California outage :(

Streaming is different from non-blocking I/O. What I am referring to are non-streaming, non-blocking requests like: https://learn.microsoft.com/en-us/java/api/com.azure.ai.openai.openaiasyncclient?view=azure-java-preview#com-azure-ai-openai-openaiasyncclient-getchatcompletions(java-lang-string-com-azure-ai-openai-models-chatcompletionsoptions).
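
For context, a minimal sketch of such a call using the Azure SDK's OpenAIAsyncClient; the endpoint, key, and deployment name are placeholders, and the request message classes vary slightly between azure-ai-openai beta versions:

```java
import com.azure.ai.openai.OpenAIAsyncClient;
import com.azure.ai.openai.OpenAIClientBuilder;
import com.azure.ai.openai.models.ChatCompletions;
import com.azure.ai.openai.models.ChatCompletionsOptions;
import com.azure.ai.openai.models.ChatRequestUserMessage;
import com.azure.core.credential.AzureKeyCredential;
import reactor.core.publisher.Mono;

import java.util.List;

class AsyncCompletionExample {

    Mono<ChatCompletions> complete() {
        OpenAIAsyncClient asyncClient = new OpenAIClientBuilder()
                .endpoint("https://my-resource.openai.azure.com") // placeholder
                .credential(new AzureKeyCredential("my-key"))      // placeholder
                .buildAsyncClient();

        // A single, non-streaming completion delivered as a Mono: no thread
        // is blocked while waiting for the model's response.
        return asyncClient.getChatCompletions(
                "my-deployment", // placeholder deployment name
                new ChatCompletionsOptions(List.of(new ChatRequestUserMessage("Tell me a joke"))));
    }
}
```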

Is there a plan to support non-streaming, non-blocking I/O?

Comment From: XhstormR

Yes, we need support for async APIs that can return Mono<ChatResponse>.
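
Purely as an illustration, a hypothetical shape for such a contract; this interface does not exist in Spring AI today and the name is made up:

```java
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import reactor.core.publisher.Mono;

// Hypothetical: a non-streaming, non-blocking counterpart to the existing
// blocking call(Prompt) and streaming stream(Prompt) methods.
public interface ReactiveChatClient {

    Mono<ChatResponse> call(Prompt prompt);
}
```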