Fix improper streaming behavior in data handling

Resolved an issue where data was not streamed correctly due to improper use of the `reduce` operator in the `stream(Prompt prompt)` method. This caused data to be processed in a blocking manner, contrary to the expected reactive streaming behavior.

Changes made:
- Removed the `reduce` operation, which aggregated all data into one chunk and thereby blocked streaming.
- Replaced the `reduce` with direct handling of each individual item within the window to maintain continuous stream flow.
- Added comprehensive tests to ensure streaming is performed as intended and to verify non-blocking behavior with multiple emitted items.
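To illustrate the effect described above, here is a minimal, self-contained sketch (plain Java, not the actual Reactor pipeline; `StreamingSketch`, `aggregated`, and `streamed` are hypothetical names) contrasting reduce-style aggregation, which emits nothing until the source completes, with per-item forwarding, which preserves incremental delivery:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Simplified stand-in for the streaming pipeline (not the real Reactor code):
// shows why reducing a stream into a single chunk blocks emission until the
// source completes, while forwarding each item preserves incremental delivery.
public class StreamingSketch {

    // Aggregating approach: nothing reaches the consumer until every chunk
    // has arrived -- a single large emission at the end (the reduce behavior).
    static void aggregated(List<String> chunks, Consumer<String> downstream) {
        StringBuilder all = new StringBuilder();
        for (String chunk : chunks) {
            all.append(chunk); // accumulate everything before emitting
        }
        downstream.accept(all.toString()); // one emission, after completion
    }

    // Streaming approach: each chunk is forwarded as soon as it arrives.
    static void streamed(List<String> chunks, Consumer<String> downstream) {
        for (String chunk : chunks) {
            downstream.accept(chunk); // emit immediately, no accumulation
        }
    }

    public static void main(String[] args) {
        List<String> chunks = List.of("Hel", "lo, ", "world");

        List<String> aggregatedEmissions = new ArrayList<>();
        aggregated(chunks, aggregatedEmissions::add);
        System.out.println("aggregated emissions: " + aggregatedEmissions.size()); // 1

        List<String> streamedEmissions = new ArrayList<>();
        streamed(chunks, streamedEmissions::add);
        System.out.println("streamed emissions: " + streamedEmissions.size()); // 3
    }
}
```

The tests added in this PR check the same property: that multiple items are emitted individually rather than as one combined chunk.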

Resolves: #764

Comment From: bruno-oliveira

@joshlong @markpollack I have taken the liberty of opening a PR that attempts to address a problem I've been seeing in production: in streaming mode, the Azure OpenAI completions would return the entire response at once instead of streaming it, which was not the desired behavior.

I will be happy to fix any potential issues and/or address any comments you might have!

Comment From: didalgolab

I could be wrong, but wasn't the removed `concatMapIterable` code with the `window.reduce` part needed for function calling?

Also, I'm under the impression that the only change needed to fix the streaming issue in the Azure OpenAI model was in this code from `AzureOpenAiChatModel`:

            .windowUntil(chatCompletions -> {
                if (isFunctionCall.get() && chatCompletions.getChoices()
                    .get(0)
                    .getFinishReason() == CompletionsFinishReason.TOOL_CALLS) {
                    isFunctionCall.set(false);
                    return true;
                }
                return false;
            }, false)

to replace `return false;` with `return !isFunctionCall.get();`. At least this is what I've used in my code.
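To make the suggested change concrete, here is a hedged, self-contained simulation (plain Java; `WindowUntilSketch` and `windowUntilCut` are stand-ins for Reactor's `Flux.windowUntil(predicate, false)` and the real `isFunctionCall` state, not the project's actual code). With a predicate that always returns `false`, every element accumulates into a single window that closes only when the source completes; with `!isFunctionCall.get()`, each element outside a function call closes its own window, so chunks can flow downstream immediately:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Predicate;

public class WindowUntilSketch {

    // List-based stand-in for Flux.windowUntil(predicate, false): the current
    // window is cut AFTER an element for which the predicate returns true.
    static <T> List<List<T>> windowUntilCut(List<T> source, Predicate<T> cutAfter) {
        List<List<T>> windows = new ArrayList<>();
        List<T> current = new ArrayList<>();
        for (T element : source) {
            current.add(element);
            if (cutAfter.test(element)) {
                windows.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            windows.add(current); // trailing window, flushed only at completion
        }
        return windows;
    }

    public static void main(String[] args) {
        List<String> chunks = List.of("c1", "c2", "c3");
        AtomicBoolean isFunctionCall = new AtomicBoolean(false);

        // Old predicate: always false -> one big window, closed at completion.
        List<List<String>> blocked = windowUntilCut(chunks, c -> false);
        System.out.println("windows with 'return false': " + blocked.size()); // 1

        // Suggested predicate: !isFunctionCall.get() -> each plain chunk
        // closes its own window and can be emitted downstream right away.
        List<List<String>> streaming = windowUntilCut(chunks, c -> !isFunctionCall.get());
        System.out.println("windows with '!isFunctionCall.get()': " + streaming.size()); // 3
    }
}
```

In the real pipeline, the predicate would still return `true` at a `TOOL_CALLS` finish reason so a function-call exchange stays grouped in one window; the change only affects how ordinary content chunks are windowed.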

Comment From: bruno-oliveira

Seems fixed by: https://github.com/spring-projects/spring-ai/commit/178a607cf6fc65e302b7420fe50cb8dff7e2df2d