Unlike with Jetty, Tomcat, and Undertow, shutdown of an application that uses Reactor Netty is delayed by an active request; it appears to wait until the request has completed. A thread dump taken during the delay suggests the cause is the destruction of ReactorResourceFactory:
"SpringContextShutdownHook" #16 prio=5 os_prio=31 tid=0x00007fda119d7800 nid=0x5803 waiting on condition [0x0000700008599000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007175081c0> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:87)
at reactor.core.publisher.Mono.block(Mono.java:1678)
at org.springframework.http.client.reactive.ReactorResourceFactory.destroy(ReactorResourceFactory.java:225)
at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:258)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:579)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:551)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(DefaultListableBeanFactory.java:1091)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:512)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingletons(DefaultListableBeanFactory.java:1084)
at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1060)
at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1029)
at org.springframework.context.support.AbstractApplicationContext$1.run(AbstractApplicationContext.java:948)
- locked <0x00000005c00d33a0> (a java.lang.Object)
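For context, the top of the trace shows the shutdown hook parked inside CountDownLatch.await() with no timeout (Mono.block() delegates to BlockingSingleSubscriber.blockingGet, which awaits a latch). A minimal stdlib sketch of that parking behaviour, with an illustrative class name and shortened timing; the timed variant is shown only so the demo terminates:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchParkDemo {
    public static void main(String[] args) throws InterruptedException {
        // A latch whose count never reaches zero, like the one the
        // shutdown hook is parked on in the thread dump above.
        CountDownLatch latch = new CountDownLatch(1);
        // The timed await parks the thread (Unsafe.park under the hood)
        // and gives up after the timeout; the plain await() used by
        // blockingGet would park forever.
        boolean completed = latch.await(200, TimeUnit.MILLISECONDS);
        System.out.println("completed=" + completed);
    }
}
```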
Comment From: wilkinsona
The fact that the delay appears to be indefinite (I have seen request handling with a 60 second sleep be allowed to complete) may be a Reactor Netty bug. The default quiet period is 2 seconds and the default timeout is 15 seconds, so I would expect at most a 17 second delay before the connection is dropped.
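A quiet period followed by a hard timeout has the same shape as the classic two-phase ExecutorService shutdown from the JDK: stop accepting work, wait a bounded time for in-flight work, then force termination. An illustrative stdlib sketch (timings here are shortened and are not the Netty defaults):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class TwoPhaseShutdownDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // A task that outlives the shutdown timeout, standing in for a
        // request still in flight when disposal begins.
        pool.submit(() -> {
            try {
                Thread.sleep(10_000);
            }
            catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        pool.shutdown(); // stop accepting new work
        // Bounded wait, analogous to the disposal timeout.
        if (!pool.awaitTermination(200, TimeUnit.MILLISECONDS)) {
            // Force termination, analogous to dropping the connection.
            pool.shutdownNow();
            System.out.println("forced shutdown after timeout");
        }
    }
}
```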
Comment From: jhoglin
I mentioned this issue to @violetagg on reactor-netty's Gitter and was pointed to this test case, which validates that Netty does shut down as expected. Maybe it can help someone resolve the issue.
https://github.com/reactor/reactor-netty/blob/ca10139086abdeaf9fd9db7076b5511532e8326e/reactor-netty-http/src/test/java/reactor/netty/http/server/HttpServerTests.java#L1725
Comment From: philwebb
The ReactorResourceFactory code has moved on a bit; it looks like this version matches the stack trace.
Comment From: wilkinsona
Here's a pure Reactor Netty test that reproduces the behavior we're seeing in Boot:
@Test
public void testGracefulShutdown() throws Exception {
    CountDownLatch latch1 = new CountDownLatch(2);
    CountDownLatch latch2 = new CountDownLatch(2);
    LoopResources loop = LoopResources.create("testGracefulShutdown");
    this.disposableServer = HttpServer.create()
            .port(0)
            .runOn(loop)
            .doOnConnection(c -> {
                c.onDispose().subscribe(null, null, latch2::countDown);
                latch1.countDown();
            })
            // Register a channel group; when invoking disposeNow()
            // the implementation will wait for the active requests to finish
            .channelGroup(new DefaultChannelGroup(new DefaultEventExecutor()))
            .route(r -> r
                    .get("/delay500", (req, res) -> res.sendString(
                            Mono.just("delay500").delayElement(Duration.ofMillis(500))))
                    .get("/cpuIntensive", (req, res) -> {
                        // Simulate some long-running CPU-intensive work
                        boolean stop = false;
                        while (!stop) {
                        }
                        return res.sendString(Mono.just("cpuIntensive"));
                    }))
            .wiretap(true)
            .bindNow(Duration.ofSeconds(30));
    HttpClient client = HttpClient.create().remoteAddress(this.disposableServer::address).wiretap(true);
    MonoProcessor<String> result = MonoProcessor.create();
    Flux.just("/delay500", "/cpuIntensive")
            .flatMap(s -> client.get().uri(s).responseContent().aggregate().asString())
            .collect(Collectors.joining())
            .subscribe(result);
    assertThat(latch1.await(30, TimeUnit.SECONDS)).isTrue();
    // Stop accepting incoming requests; wait at most 3s for the
    // active requests to finish
    try {
        this.disposableServer.disposeNow();
    }
    catch (IllegalStateException ex) {
        System.out.println(ex.getMessage());
        // The socket couldn't be stopped, continue with shutdown
    }
    System.out.println("Disposing of the loop resources");
    loop.disposeLater().block();
    System.out.println("Disposal complete");
    assertThat(latch2.await(30, TimeUnit.SECONDS)).isTrue();
}
Compared to the Reactor Netty test linked above, a key difference is that this test uses a while loop to simulate CPU-intensive work that takes longer to complete than the 15 second disposal timeout. This results in the block() on the Mono returned from loop.disposeLater() never returning.
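One way to see why the test hangs: block() with no argument waits without bound for a disposal that can never complete while an event-loop thread is stuck in the busy loop (Reactor's Mono.block(Duration) overload would at least bound the wait). A stdlib analogy, where a never-completing CompletableFuture stands in for the disposal Mono and the names and timings are illustrative:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedBlockDemo {
    public static void main(String[] args) throws Exception {
        // A future that never completes, standing in for the Mono from
        // loop.disposeLater() when an event-loop thread never yields.
        CompletableFuture<Void> disposal = new CompletableFuture<>();
        try {
            // An unbounded disposal.get() would hang forever, like block();
            // a bounded wait, like block(Duration), gives up instead.
            disposal.get(200, TimeUnit.MILLISECONDS);
            System.out.println("disposed");
        }
        catch (TimeoutException ex) {
            System.out.println("gave up waiting for disposal");
        }
    }
}
```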
@violetagg, is this a Reactor Netty bug or should we be doing something different in Boot to shut things down in this situation?
Comment From: violetagg
let me take a look
Comment From: violetagg
@wilkinsona Please create an issue as this seems to be a Reactor Netty bug.
Comment From: philwebb
I've opened https://github.com/reactor/reactor-netty/issues/3509