Actuator bug: the request thread hangs indefinitely
Version: spring-boot-actuator-2.6.9
Symptom: When accessing /actuator/health, the request never receives a response and the thread gets stuck.
Error log:
java.lang.AbstractMethodError: Receiver class org.redisson.spring.data.connection.RedissonReactiveRedisClusterConnection does not define or inherit an implementation of the resolved method 'reactor.core.publisher.Mono clusterGetClusterInfo()' of interface org.springframework.data.redis.connection.ReactiveRedisClusterConnection.
at org.springframework.boot.actuate.redis.RedisReactiveHealthIndicator.getHealth(RedisReactiveHealthIndicator.java:67)
Reason: I used an incompatible version of redisson-spring-data-2x, which caused this problem. Choosing the appropriate version resolves it.
Recommendation: Even with an incompatible version, /actuator/health should report a status of DOWN rather than hang the thread. Thanks!
Comment From: philwebb
Exceptions should trigger a down status. See AbstractReactiveHealthIndicator.handleFailure(...).
Can you please provide a sample application that show the problem.
Comment From: HuiWang1995
I hoped it would trigger a down status, but it does not. I can't provide a sample application because I can't take code out of the company repository.
What I can provide is the pom.xml that causes the error:
<dependency>
    <groupId>org.redisson</groupId>
    <artifactId>redisson-spring-boot-starter</artifactId>
    <version>3.16.8</version>
    <exclusions>
        <exclusion>
            <groupId>org.redisson</groupId>
            <artifactId>redisson-spring-data-25</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.redisson</groupId>
    <artifactId>redisson-spring-data-20</artifactId>
    <version>3.16.8</version>
</dependency>
Then I used this version:
<dependency>
    <groupId>org.redisson</groupId>
    <artifactId>redisson-spring-boot-starter</artifactId>
    <version>3.17.1</version>
</dependency>
<!-- Not strictly required, because redisson-spring-boot-starter already depends on it -->
<dependency>
    <groupId>org.redisson</groupId>
    <artifactId>redisson-spring-data-26</artifactId>
    <version>3.17.1</version>
</dependency>
The Spring Boot version is 2.6.9.
Running an application with these dependencies triggers the problem when /actuator/health is visited.
Comment From: HuiWang1995
spring-data-redis: 2.6.5
spring-boot-starter-data-redis: 2.6.9
spring-boot-starter-cache: 2.6.9
may also be needed.
A Redis configuration is needed as well.
Comment From: wilkinsona
Unfortunately, we don't have time to piece things together from snippets of code. We don't need your actual application's code, just a minimal sample that reproduces the problem. That sample should include the necessary Redis configuration, perhaps in the form of a compose.yaml file for Docker Compose. You can share it with us by zipping it up and attaching it to this issue or pushing it to a separate repository on GitHub.
Also, please note that Spring Boot 2.6.x is no longer supported. Please upgrade to Spring Boot 2.7.x.
Comment From: HuiWang1995
Thank you. Creating a minimal sample would also take a lot of time.
Sorry, I can't copy code out of the company's internal network.
I will plan an upgrade to Spring Boot 2.7.x.
Comment From: wilkinsona
I think I've reproduced this by mocking ReactiveRedisClusterConnection to throw AbstractMethodError from clusterGetClusterInfo():
given(redisConnection.clusterGetClusterInfo()).willThrow(AbstractMethodError.class);
The problem occurs when an operator throws an exception that Reactor considers to be fatal:
Unless wrapped explicitly, such exceptions would always be thrown by operators instead of propagation through onError, potentially interrupting progress of Flux/Mono sequences. When they occur, the assumption is that Reactor is in an unrecoverable state (notably because the JVM itself might be in an unrecoverable state).
java.lang.AbstractMethodError is such an exception, so the Mono<Health> sequence is interrupted and never completes.
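To illustrate why AbstractMethodError ends up in this category: it is a LinkageError (via IncompatibleClassChangeError), and LinkageError is one of the JVM error families that Reactor's documentation lists as fatal. The following is a simplified, plain-JDK stand-in for that kind of check, not Reactor's actual code. The names FatalCheck and isJvmFatal are hypothetical, and only a subset of the fatal families is covered.

```java
public class FatalCheck {

    // Simplified stand-in (assumption): treats VirtualMachineError and
    // LinkageError as "fatal". Reactor's real check covers more cases.
    static boolean isJvmFatal(Throwable t) {
        return t instanceof VirtualMachineError || t instanceof LinkageError;
    }

    public static void main(String[] args) {
        // AbstractMethodError -> IncompatibleClassChangeError -> LinkageError
        System.out.println(isJvmFatal(new AbstractMethodError("Mocked error")));      // prints "true"
        // An ordinary exception would be routed through onError instead.
        System.out.println(isJvmFatal(new IllegalStateException("ordinary failure"))); // prints "false"
    }
}
```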
We could change each operator in the sequence to catch and wrap such exceptions, but I don't think we should. Reactor already logs the failure, twice in fact:
10:23:48.932 [boundedElastic-1] WARN reactor.core.Exceptions -- throwIfFatal detected a jvm fatal exception, which is thrown and logged below:
java.lang.AbstractMethodError: Mocked error
at org.springframework.boot.actuate.data.redis.RedisReactiveHealthIndicator.getHealth(RedisReactiveHealthIndicator.java:68)
at org.springframework.boot.actuate.data.redis.RedisReactiveHealthIndicator.doHealthCheck(RedisReactiveHealthIndicator.java:62)
at org.springframework.boot.actuate.data.redis.RedisReactiveHealthIndicator.lambda$0(RedisReactiveHealthIndicator.java:52)
at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:132)
at reactor.core.publisher.FluxSubscribeOnCallable$CallableSubscribeOnSubscription.run(FluxSubscribeOnCallable.java:251)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
10:23:48.937 [boundedElastic-1] ERROR reactor.core.scheduler.Schedulers -- Scheduler worker in group main failed with an uncaught exception
java.lang.AbstractMethodError: Mocked error
at org.springframework.boot.actuate.data.redis.RedisReactiveHealthIndicator.getHealth(RedisReactiveHealthIndicator.java:68)
at org.springframework.boot.actuate.data.redis.RedisReactiveHealthIndicator.doHealthCheck(RedisReactiveHealthIndicator.java:62)
at org.springframework.boot.actuate.data.redis.RedisReactiveHealthIndicator.lambda$0(RedisReactiveHealthIndicator.java:52)
at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:132)
at reactor.core.publisher.FluxSubscribeOnCallable$CallableSubscribeOnSubscription.run(FluxSubscribeOnCallable.java:251)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
As such, I feel that the benefits of limping along are small and outweighed by the increase in complexity.
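For reference, the "catch and wrap" approach that was considered and rejected could look roughly like the sketch below. It is written in plain Java rather than against Reactor or Spring Boot, and WrapFatal, callWrapping, and status are hypothetical names used only for illustration. It shows the extra layer each step would need so that a LinkageError surfaces as an ordinary exception that a health check can map to DOWN.

```java
import java.util.concurrent.Callable;

public class WrapFatal {

    // Hypothetical helper: converts a LinkageError (such as AbstractMethodError)
    // into an ordinary runtime exception so callers can handle it normally.
    static <T> T callWrapping(Callable<T> check) throws Exception {
        try {
            return check.call();
        } catch (LinkageError ex) {
            throw new IllegalStateException("Incompatible library version", ex);
        }
    }

    // Maps any ordinary exception to "DOWN", similar in spirit to a health check.
    static String status(Callable<String> check) {
        try {
            return callWrapping(check);
        } catch (Exception ex) {
            return "DOWN";
        }
    }

    public static void main(String[] args) {
        System.out.println(status(() -> "UP")); // prints "UP"
        System.out.println(status(() -> {
            throw new AbstractMethodError("Mocked error");
        })); // prints "DOWN"
    }
}
```

Every operator in the sequence would need a wrapper of this kind, which is the added complexity weighed against the benefit here.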
Comment From: HuiWang1995
Thank you for your answer. I'm glad to have learned about Reactor from you.
It seems that the issue I encountered was caused by using an inappropriate version of Redisson that I copied from another project.
Our pipeline visits /actuator/health to decide whether a microservice is running successfully when it is deployed.
Fortunately, this issue only occurred during the development stage and not in production, so it shouldn't be a major concern.
Once again, thank you for your feedback.
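Since a health response that never completes can also stall a pipeline that waits on it, one possible mitigation on the caller's side is to bound the wait. The sketch below is plain Java and only an assumption about how such a probe might be written. BoundedHealthProbe and probe are hypothetical names, and the CompletableFuture stands in for whatever call actually fetches /actuator/health.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class BoundedHealthProbe {

    // Hypothetical probe: waits for a status and reports "DOWN" if none
    // arrives within the timeout or if the check fails with an exception.
    static String probe(CompletableFuture<String> health, long timeoutMillis) {
        return health
                .completeOnTimeout("DOWN", timeoutMillis, TimeUnit.MILLISECONDS)
                .exceptionally(ex -> "DOWN")
                .join();
    }

    public static void main(String[] args) {
        // Simulates a health check that never completes.
        CompletableFuture<String> neverCompletes = new CompletableFuture<>();
        System.out.println(probe(neverCompletes, 200)); // prints "DOWN"
        System.out.println(probe(CompletableFuture.completedFuture("UP"), 200)); // prints "UP"
    }
}
```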