Spring Boot 3.2 introduced virtual thread support to improve concurrency and scalability. However, when an application with virtual threads enabled is deployed on ECS Fargate or Kubernetes with Nginx running in the background, backend API calls appear to end in API Gateway timeouts.

Expected Behavior: The backend API calls should respond within a reasonable time frame, avoiding API Gateway timeouts, even in environments where Nginx is utilized alongside Spring Boot 3.2 with Virtual Thread enabled.

Current Behavior: Under the described circumstances, the backend API calls are experiencing API Gateway timeouts, likely due to an interaction between the Virtual Thread feature in Spring Boot 3.2 and the background Nginx process.

Steps to Reproduce in ECS Fargate:
1. Create an ALB.
2. Create a listener rule.
3. Create a target group and map it to the listener rule.
4. Create a Spring Boot 3.2 application (with an API that retrieves data from Redis/MySQL) with virtual threads on.
5. Create a Docker image of the Spring Boot application.
6. Create a task definition and set the path to your Docker image.
7. Create the ECS service and set the ARN of the target group created in step 3.
8. Once deployed, the health check always fails while virtual threads are enabled. Once virtual threads are disabled, everything works perfectly.

Steps to Reproduce in Kubernetes:
1. Set up minikube on a local machine.
2. Add the Nginx ingress to minikube.
3. Create a Spring Boot 3.2 application (with an API that retrieves data from Redis/MySQL).
4. Create a Docker image with the virtual thread feature on.
5. Create a deployment and map the image path (created in step 4).
6. Create an internal service that connects to the Spring Boot app deployed in step 5.
7. Create an ingress and map the path to the internal service created in step 6.

Make backend API calls and observe the occurrence of API Gateway timeouts.

NOTE: It looks like whenever a request performs any I/O operation, the virtual thread kicks in and the internal ingress times out. Once I disabled virtual threads, it worked without any issue.

I am not sure whether this is an Nginx issue or a Spring Boot issue, but there is definitely a problem when deploying on ECS Fargate.

P.S.: I will create a small, ready-to-deploy minikube deployment with virtual threads on and post the GitHub link here.

Comment From: bclozel

> NOTE: It looks like whenever a request performs any I/O operation, the virtual thread kicks in and the internal ingress times out. Once I disabled virtual threads, it worked without any issue.

In both cases, is the same amount of traffic reaching the application (requests/sec and concurrency)? I doubt that this is a Spring Boot issue; it's more likely a performance problem with your application. Once the web application accepts a higher number of concurrent requests, you will find other bottlenecks in your application that were previously hidden. In this case, the database connection pool and timeout configuration are good candidates.
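As a rough illustration of that kind of hidden bottleneck, here is a minimal sketch in plain Java 21 with no Spring involved. It is not taken from the reporter's application: the class name `PoolBottleneckSketch`, the method `countTimeouts` and all the numbers are hypothetical, and a `Semaphore` stands in for a bounded database connection pool. Every simulated request gets its own virtual thread, so nothing limits concurrency except the pool, and requests that cannot get a permit in time give up.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolBottleneckSketch {

    /**
     * Runs `requests` simulated requests, each on its own virtual thread.
     * Each request needs one of `poolSize` permits (a stand-in for pooled
     * connections) and holds it for `queryMillis`. Returns how many requests
     * gave up after waiting `acquireTimeoutMillis` for a permit.
     */
    static int countTimeouts(int requests, int poolSize, long queryMillis, long acquireTimeoutMillis) {
        Semaphore pool = new Semaphore(poolSize);
        AtomicInteger timedOut = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                executor.submit(() -> {
                    try {
                        // Wait for a "connection"; count the request as timed out if none frees up.
                        if (!pool.tryAcquire(acquireTimeoutMillis, TimeUnit.MILLISECONDS)) {
                            timedOut.incrementAndGet();
                            return;
                        }
                        try {
                            Thread.sleep(queryMillis); // simulated blocking query
                        } finally {
                            pool.release();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        } // close() waits for all submitted tasks to finish
        return timedOut.get();
    }

    public static void main(String[] args) {
        System.out.println("timeouts, 200 concurrent requests: " + countTimeouts(200, 5, 100, 300));
        System.out.println("timeouts, 5 concurrent requests:   " + countTimeouts(5, 5, 100, 300));
    }
}
```

With a fixed-size platform thread pool in front of the application, the number of in-flight requests is capped before they ever reach the connection pool, which is one way such a limit can stay hidden until virtual threads are switched on.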

I'll close this issue for now as this type of problem is out of scope for the Spring Boot team. I would suggest profiling your application to check what's going on at runtime.

Comment From: naivefun

@asifbakht This is happening to me too. How did you eventually solve the issue? I added logs to a filter and the response was never returned. It is definitely an issue inside Spring Boot. The server gets no traffic, so it's not a performance issue either. With virtual threads disabled, everything goes back to normal.

Comment From: asifbakht

@naivefun Well, I had no choice but to disable virtual threads, as the project I was working on was for learning purposes. I created the issue here so that people would know the actual reason and get an idea of why it's happening. I thought this issue would be looked at thoroughly, but they came up with a different reason and closed it.

Comment From: bclozel

@naivefun @asifbakht We didn't get much feedback or new reports about this. We can always investigate, but at this stage it's not obvious that this is a Spring Framework issue. Given the amount of infrastructure involved and the level of detail provided, a scaling issue or a thread pinning problem with a third-party driver are still the most likely causes.
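For readers unfamiliar with thread pinning, the following is a minimal sketch of the pattern in plain Java 21. It is not code from any particular driver, and the class and method names are hypothetical. On JDK 21, a virtual thread that blocks while holding a monitor (`synchronized`) stays pinned to its carrier thread, whereas the same blocking call under a `ReentrantLock` lets it unmount. Running the sketch on JDK 21 with `-Djdk.tracePinnedThreads=full` should print a stack trace for the pinned case; later JDK releases changed how `synchronized` interacts with virtual threads, so results may differ there.

```java
import java.util.concurrent.locks.ReentrantLock;

class PinningSketch {
    private static final Object MONITOR = new Object();
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Blocks while holding a monitor: on JDK 21 this pins the virtual thread to its carrier.
    static void blockInsideSynchronized(long millis) throws InterruptedException {
        synchronized (MONITOR) {
            Thread.sleep(millis);
        }
    }

    // Same critical section behind a ReentrantLock: the virtual thread can unmount while it sleeps.
    static void blockInsideLock(long millis) throws InterruptedException {
        LOCK.lock();
        try {
            Thread.sleep(millis);
        } finally {
            LOCK.unlock();
        }
    }

    // Runs one of the two variants on a virtual thread and reports whether it really was virtual.
    static boolean runOnVirtualThread(boolean useSynchronized) throws InterruptedException {
        boolean[] wasVirtual = new boolean[1];
        Thread thread = Thread.ofVirtual().name("pinning-sketch").start(() -> {
            wasVirtual[0] = Thread.currentThread().isVirtual();
            try {
                if (useSynchronized) {
                    blockInsideSynchronized(50);
                } else {
                    blockInsideLock(50);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        thread.join(); // join() makes the write to wasVirtual visible here
        return wasVirtual[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("synchronized variant ran on a virtual thread: " + runOnVirtualThread(true));
        System.out.println("ReentrantLock variant ran on a virtual thread: " + runOnVirtualThread(false));
    }
}
```

If enough requests pin their carriers at the same time, the small carrier pool can be exhausted and unrelated virtual threads stop making progress, which from the outside would look like requests that never return.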

To make progress on this, we would need a way to reproduce this locally, without any load balancer or complex infrastructure involved. Providing a minimal setup can be hard, so Java profiler data, thread dumps or Flight Recorder sessions could be a good starting point.
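One possible way to capture such a thread dump from inside the running application is sketched below, assuming JDK 21 or later on a HotSpot JVM. The class name `ThreadDumpSketch` and the file naming are hypothetical. It uses `HotSpotDiagnosticMXBean.dumpThreads`, which writes the newer dump format that also covers virtual threads; the same dump can be taken from outside the process with `jcmd <pid> Thread.dump_to_file -format=json <file>`.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

class ThreadDumpSketch {

    /**
     * Writes a JSON thread dump of the current JVM into `directory` and returns
     * the path of the file. dumpThreads requires an absolute path to a file that
     * does not exist yet, hence the unique name.
     */
    static Path dumpThreads(Path directory) throws IOException {
        Path target = directory.toAbsolutePath().resolve("threads-" + System.nanoTime() + ".json");
        HotSpotDiagnosticMXBean bean = ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpThreads(target.toString(), HotSpotDiagnosticMXBean.ThreadDumpFormat.JSON);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dump = dumpThreads(Path.of(System.getProperty("java.io.tmpdir")));
        System.out.println("wrote " + Files.size(dump) + " bytes to " + dump);
    }
}
```

Taking one dump while a request is hanging and another a few seconds later makes it easier to tell which threads are stuck rather than just busy.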