**Bug description**

When using the Ollama service, if the browser is closed or the request is manually aborted during a chat request, the Ollama model keeps generating instead of the request being properly interrupted. As a result, Ollama never stops the work, and the resources it is using are not freed.
Could this be caused by the event stream failing to propagate the cancellation signal from the Flux subscriber down to the WebClient when the request is cancelled?
The same issue occurs with embedding requests.
The issue is more noticeable on lower-powered Ollama deployments, where generation takes longer: if the request is not cancelled promptly, it continues to occupy CPU or GPU resources and delays subsequent requests.
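Not a fix, but a minimal plain-Java sketch of the suspected mechanism: a cancellation signal only helps if it is explicitly propagated to the worker doing the generation; otherwise the work runs to completion regardless of the client aborting. Everything below (`CancelPropagationDemo`, the simulated token loop) is hypothetical illustration, not Spring AI or Ollama code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class CancelPropagationDemo {
    public static void main(String[] args) throws Exception {
        AtomicBoolean workerStopped = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);
        ExecutorService pool = Executors.newSingleThreadExecutor();

        // Simulated long-running "model" task: keeps producing output
        // until it is interrupted, like a generation loop on the server.
        Future<?> modelTask = pool.submit(() -> {
            started.countDown();
            try {
                while (true) {
                    Thread.sleep(50); // pretend to generate a token
                }
            } catch (InterruptedException e) {
                // the cancellation actually reached the worker
            } finally {
                workerStopped.set(true);
            }
        });

        started.await(); // make sure the task is running before cancelling

        // Client aborts (browser closed / request cancelled). The cancel
        // must be propagated with interruption (`true`); with `false`,
        // the loop above would keep running and burning resources,
        // which mirrors the behavior reported in this issue.
        modelTask.cancel(true);

        pool.shutdown();
        pool.awaitTermination(2, TimeUnit.SECONDS);
        System.out.println("worker stopped: " + workerStopped.get());
    }
}
```

The analogous wiring in a reactive pipeline would be a cancel hook on the stream that aborts the underlying HTTP exchange to Ollama; if that hook is missing anywhere along the chain, the server-side generation continues exactly as described above.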
**Environment**

Spring AI 1.0.0-M6