Bug description The OpenAiChatModel `stream` method calls `chatCompletionStream` inside the `execute` callback of the RetryTemplate:

```java
public Flux<ChatResponse> stream(Prompt prompt) {
    OpenAiApi.ChatCompletionRequest request = this.createRequest(prompt, true);
    return (Flux) this.retryTemplate.execute((ctx) -> {
        Flux<OpenAiApi.ChatCompletionChunk> completionChunks = this.openAiApi.chatCompletionStream(request);
        ConcurrentHashMap<String, String> roleMap = new ConcurrentHashMap<>();
        return completionChunks.map((chunk) -> {
            return this.chunkToChatCompletion(chunk);
        }).switchMap((cc) -> {
            // ...
```

However, the RetryTemplate does nothing here: the object returned from `execute` is just the assembled `Flux` (a `FluxMap`), so no error can ever arise at this point. Failures only surface later, when the stream is subscribed to, long after `execute` has already returned.
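To make this concrete, here is a minimal, self-contained sketch (not Spring AI code) of the same pattern. The simulated failure is only seen at subscription time, so the RetryTemplate completes its single "attempt" successfully and never retries:

```java
import org.springframework.retry.support.RetryTemplate;
import reactor.core.publisher.Flux;

public class LazyFluxRetryDemo {

    public static void main(String[] args) {
        RetryTemplate retryTemplate = RetryTemplate.builder()
                .maxAttempts(3)
                .build();

        // execute() only wraps the *assembly* of the Flux. The callback returns
        // immediately and throws nothing, so the RetryTemplate never retries.
        Flux<String> flux = retryTemplate.execute(ctx -> {
            System.out.println("assembly attempt " + (ctx.getRetryCount() + 1));
            return Flux.<String>error(new RuntimeException("simulated 401 from the API"));
        });

        // The error only surfaces here, at subscription time, after execute()
        // has already returned. "assembly attempt 1" is printed exactly once.
        flux.subscribe(
                System.out::println,
                err -> System.out.println("error at subscription: " + err.getMessage()));
    }
}
```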

Environment I tried this out on the https://github.com/rd-1-2022/ai-openai-helloworld project

Steps to reproduce Run this test in the https://github.com/rd-1-2022/ai-openai-helloworld project. To observe the behavior in the presence of an error, you can provide a wrong API key.

```java
@SpringBootTest(classes = OpenAiAutoConfiguration.class)
public class OpenAiCallTest {

    @Autowired
    OpenAiChatModel openAiChatModel;

    @Test
    public void test() {
        System.out.println(openAiChatModel.stream(
                new Prompt(
                        "Generate the names of 5 famous pirates.",
                        OpenAiChatOptions.builder()
                                .withModel("gpt-3.5-turbo")
                                .withTemperature(0.4F)
                                .build()
                )).collectList().block());
    }

}
```

Expected behavior There should be a working retry mechanism for the streaming calls made through the WebClient.
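One possible shape for such a mechanism, offered only as a hedged sketch and not as the project's actual design: apply Reactor's `retryWhen` operator to the stream itself, so that the errors raised at subscription time are the ones being retried. The helper name `withRetry`, the backoff values, and the filter that skips retries on 401 responses are assumptions for illustration:

```java
import java.time.Duration;

import org.springframework.web.reactive.function.client.WebClientResponseException;
import reactor.core.publisher.Flux;
import reactor.util.retry.Retry;

public class ReactiveRetrySketch {

    // Hypothetical helper: decorates a streaming Flux with retry semantics using
    // Reactor's retryWhen/Retry.backoff, which act at subscription time rather
    // than at assembly time.
    static <T> Flux<T> withRetry(Flux<T> stream) {
        return stream.retryWhen(
                Retry.backoff(3, Duration.ofSeconds(2))
                        // Don't retry on a bad API key (401); only transient failures.
                        .filter(ex -> !(ex instanceof WebClientResponseException.Unauthorized)));
    }
}
```

Applied to the snippet above, `this.openAiApi.chatCompletionStream(request)` would be wrapped by such an operator instead of by `retryTemplate.execute`.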

Comment From: csterwa

This may be related to #1193

Comment From: markpollack

This definitely needs more thought. I will remove it here (and elsewhere) until we have a better strategy. Thanks for taking the time to file an issue.

Comment From: markpollack

This retry strategy has already been removed; the change will be available in the M3 release.