Resolves #613, #1042

Comment From: timostark

Thanks, that PR is actually very similar to mine :-) Three general remarks:

  • Streaming does not really work with that PR either, because the internals of the Azure AI REST calls seem to make blocking calls. Yes, tokens are streamed, but only once the chat model has finished answering. I think we have to switch to the Async OpenAI builder instead of the "normal" one. With the async client, streaming actually makes sense: tokens are retrieved directly after generation. A good way to test this is a simple prompt: "Please give me an abstract about the US geography - minimum 1000 words". You will notice a long delay, and then all tokens are sent at once. With the async lib they arrive token by token (as expected).

  • I don't really like this "hack" (taken from the official Azure code): "// Note: the first chat completions can be ignored when using Azure OpenAI service which is a known service bug." followed by .skip(1). As far as I've seen, the only real effect of the first message is that getChoices returns null. As soon as Azure fixes their API, the skip(1) becomes dangerous and we will miss content. Simply filtering out messages where getChoices returns null would be safer.

  • As long as the JSON parsing is broken in Azure ( https://github.com/Azure/azure-sdk-for-java/issues/41164 ), I would really tend to stay with 1.0.8. The logs get really polluted in the streaming scenario. Or is there a strong reason for 1.0.10?
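
To illustrate the second remark, here is a minimal sketch of why filtering on null choices is safer than skip(1). It uses plain java.util.stream rather than the reactive stream from the PR, and the Chunk class is a hypothetical stand-in for a streamed chat-completions chunk, not the Azure SDK type.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ChunkFilterSketch {

    // Hypothetical stand-in for a streamed chunk; NOT the Azure SDK class.
    static final class Chunk {
        final String id;
        final List<String> choices; // null mimics the empty first message

        Chunk(String id, List<String> choices) {
            this.id = id;
            this.choices = choices;
        }
    }

    // skip(1): blindly drops the first chunk, whatever it contains.
    static List<Chunk> withSkip(List<Chunk> chunks) {
        return chunks.stream().skip(1).collect(Collectors.toList());
    }

    // filter: drops only chunks whose choices are null.
    static List<Chunk> withFilter(List<Chunk> chunks) {
        return chunks.stream()
                .filter(c -> c.choices != null)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Behaviour described above: the first chunk carries null choices.
        List<Chunk> current = List.of(
                new Chunk("0", null),
                new Chunk("1", List.of("Hello")),
                new Chunk("2", List.of("world")));

        // Assumed future behaviour once the service no longer sends the empty chunk.
        List<Chunk> fixed = List.of(
                new Chunk("1", List.of("Hello")),
                new Chunk("2", List.of("world")));

        System.out.println("current, skip:   " + withSkip(current).size());
        System.out.println("current, filter: " + withFilter(current).size());
        System.out.println("fixed, skip:     " + withSkip(fixed).size());
        System.out.println("fixed, filter:   " + withFilter(fixed).size());
    }
}
```

Both approaches keep the same two chunks on the current input, but on the "fixed" input skip(1) silently loses the first real chunk while the filter keeps both.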

I will raise a PR as soon as this one is merged to main (I probably only have time on Friday, but will do it then), addressing (1) and (3) of my suggestions.

Thanks for your work!

Comment From: tzolov

Thank you for the details @timostark

I've just merged the PR, also replacing the skip with a filter as you suggested. We may have to revert the version to beta8, but let's do that in a separate PR.

Feel free to submit your PR. Very interested to see how the Async client will play out.

Comment From: tzolov

rebased, squashed and merged at feb036d2f6534505ad0bdac12925f93a52737e73