"We have a Spring-boot REST application running on 3 production machines. A recent update from Spring-boot 2.1.8 to 2.2.2 has shown an initial increase of CPU by at least double. This load then increases over time whereas the old version stays steady.
I have managed to narrow this down to 2.2.x as building with 2.1.11 is ok, but 2.2.0 shows the problem.
To give an idea of scale, the old version stays at around 6% regardless of load, whereas the new version starts at around 15% and gradually increases to over 100% after about 10 hours.
I can see the initial rise with an identical build, only changing the Spring-boot version. The application uses spring-boot-starter-web and spring-boot-starter-actuator."
Please see https://stackoverflow.com/questions/59879550/spring-boot-2-2-x-increased-cpu. One or two others have now chipped in with similar experiences.
We recently inadvertently deployed another application with 2.2.4 and 48 hours later the application became unresponsive with similar CPU growth. Reverting to 2.1.11 has cured the issue.
We have not yet tried 2.3.0 as it involves risking our production service.
Does anyone have any ideas? We obviously want to keep up with the latest releases.
Comment From: bclozel
Without more information about the CPU usage (like profiler data), we can’t really help you. Still this sounds familiar, see https://github.com/spring-projects/spring-framework/issues/25043#issuecomment-626395690
Any reason why you’re not using the latest 2.2.x and considering 2.3 right away? If 2.2.7 is not fixing this problem, please comment on this issue with more data, especially profiler snapshots. Thanks!
Comment From: smithap
My Stack Overflow post has someone who tried 2.2.7 but is still seeing issues.
We are struggling to reproduce this in our test environments, but I'll have another go this week.
I'll keep an eye out for what version of Spring is used.
Comment From: bclozel
Given the level of detail shared in your SO question, it's likely that the remaining issue the other user is experiencing (for one of their services) is a different one.
You could confirm the issue by attaching a VisualVM client at runtime when the issue happens, checking the memory stats and taking a snapshot of the allocated objects.
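If attaching VisualVM to a production JVM is not practical, a rough first look at which threads are consuming CPU can be taken from inside the JVM. The following is only a minimal sketch, not a replacement for a profiler snapshot: it uses the JDK's standard `ThreadMXBean` rather than VisualVM, and the class name `ThreadCpuSnapshot`, its nested `Entry` type and the `topThreads` method are hypothetical names chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper (not part of Spring Boot): lists the live threads
// that have accumulated the most CPU time since they started.
final class ThreadCpuSnapshot {

    // Simple value holder: thread name plus its total CPU time in nanoseconds.
    static final class Entry {
        final String threadName;
        final long cpuNanos;

        Entry(String threadName, long cpuNanos) {
            this.threadName = threadName;
            this.cpuNanos = cpuNanos;
        }
    }

    // Returns at most `limit` entries, sorted by CPU time, highest first.
    // Returns an empty list if the JVM cannot measure per-thread CPU time.
    static List<Entry> topThreads(int limit) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<Entry> entries = new ArrayList<>();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return entries;
        }
        for (long id : threads.getAllThreadIds()) {
            long cpu = threads.getThreadCpuTime(id);     // -1 if the thread has died
            ThreadInfo info = threads.getThreadInfo(id); // null if the thread has died
            if (cpu >= 0 && info != null) {
                entries.add(new Entry(info.getThreadName(), cpu));
            }
        }
        entries.sort(Comparator.comparingLong((Entry e) -> e.cpuNanos).reversed());
        int end = Math.max(0, Math.min(limit, entries.size()));
        return new ArrayList<>(entries.subList(0, end));
    }

    public static void main(String[] args) {
        for (Entry e : topThreads(10)) {
            System.out.printf("%-40s %d ms%n", e.threadName, e.cpuNanos / 1_000_000);
        }
    }
}
```

Calling `topThreads` twice a few minutes apart and comparing the values gives a crude idea of which threads are burning CPU in between. It does not show memory statistics or allocated objects, so a heap snapshot is still needed for that part.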
Comment From: bclozel
In the meantime, wayno93 responded to their previous comment and it seems Spring Framework 5.2.6 is fixing the issue.
Do you have another reason to believe this is not the case?
Comment From: spring-projects-issues
If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.
Comment From: smithap
2.3.0 looking good after 72 hours in production. Will continue to monitor over the weekend.
Comment From: bclozel
I’m closing this issue as a duplicate of https://github.com/spring-projects/spring-framework/issues/25043 for now. Feel free to reopen with more data if this happens again.
Thanks!
Comment From: ade90036
I'm experiencing a similar issue.
I'm running Spring Boot 2.3.3.RELEASE on an EKS cluster with 5 nodes. It just provides a REST API, so nothing fancy.
A customer reported that when he tried to authenticate with the application, he got an error about the connection being closed abruptly.
Looking at the logs, it seems the load balancer has a maximum timeout of 60 seconds, and after that period of inactivity the connection is closed.
I then opened the Kubernetes Dashboard and all the nodes were running at around 97-100% CPU utilization.
Looking at the network traffic, I couldn't justify such high CPU usage given the number of logged-in users and the number of requests / bytes passing through the load balancer.
I have been running the application for a little more than a year on 2.1.x and never saw this issue before.
Unfortunately, the Kubernetes Dashboard only shows CPU usage in a 15-minute window, so I have redeployed the application with Micrometer and AWS CloudWatch integration so that I can retrospectively analyse metrics collected over a 15-day period.
I will be able to report more information at a later stage.
Not sure if this is of any interest to you, but the only change we have introduced in the application is API versioning based on media type versioning, for example: "application/vnd.company.app-v1+json".
Regards
ade
Comment From: bclozel
@ade90036 if you suspect a performance regression, please create a new issue with enough data to help us track down the problem: a project reproducing the behavior locally and profiler snapshots showing where the CPU time is being spent.
Thanks!