Problem

I'm trying to configure graceful shutdown for my application when it receives a SIGTERM signal, but I'm getting InterruptedExceptions in scheduled jobs whose threads are in a waiting state (due to an HTTP call or just a simple Thread.sleep() invocation).

Scenario

I created the example below by generating a project on Spring Initializr with the following parameters:

* Project: Maven
* Language: Java
* Spring Boot: 3.1.2
* Metadata: default values
* Packaging: Jar
* Java: 17
* No dependencies

From there I changed the DemoApplication class as follows:

package com.example.demo;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableScheduling;

@EnableScheduling
@SpringBootApplication
public class DemoApplication {

    public static void main(String[] args) {
        System.out.println("Application started at PID: " + ProcessHandle.current().pid());
        SpringApplication.run(DemoApplication.class, args);
    }

}

Besides that, I created a scheduled job as follows:

package com.example.demo;

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class SampleJob {

    private int counter = 0;

    @Scheduled(fixedDelay = 1000L)
    public void simpleScheduledJob() throws Exception {
        counter++;
        System.out.println("Starting job " + counter);
        waitThatFails();
        //waitThatWorks();
        System.out.println("Finished job " + counter);
    }

    private void waitThatFails() throws Exception {
        Thread.sleep(5000L);
    }

    private void waitThatWorks() {
        for (int i = 0; i < 20_000; i++) {
            for (int h = 0; h < 1_000_000; h++) {
                // Make some wait time
            }
            if (i % 1_000 == 0) {
                System.out.print("|");
            }
        }
        System.out.println();
    }

}

Finally, I configured my application.properties as follows:

server.shutdown=graceful
spring.lifecycle.timeout-per-shutdown-phase=60s
spring.task.execution.shutdown.await-termination=true
spring.task.execution.shutdown.await-termination-period=60s
spring.task.scheduling.shutdown.await-termination=true
spring.task.scheduling.shutdown.await-termination-period=60s

Expectations

When running the code above with the waitThatFails method, I expect NOT to get the following exception:

java.lang.InterruptedException: sleep interrupted
    at java.base/java.lang.Thread.sleep(Native Method) ~[na:na]
    at com.example.demo.SampleJob.waitThatFails(SampleJob.java:22) ~[classes/:na]
    at com.example.demo.SampleJob.simpleScheduledJob(SampleJob.java:15) ~[classes/:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[na:na]
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
    at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[na:na]
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84) ~[spring-context-6.0.11.jar:6.0.11]
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) ~[spring-context-6.0.11.jar:6.0.11]
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[na:na]
    at java.base/java.util.concurrent.FutureTask.runAndReset$$$capture(FutureTask.java:305) ~[na:na]
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java) ~[na:na]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:833) ~[na:na]

Instead, I expect it to behave the same way as when I call the waitThatWorks method: on a kill -15 <PID>, the currently running job finishes and no further jobs are scheduled, making it safe to tear down the application.

Interestingly, this behavior can be achieved in a standard Java application; see the following example: https://github.com/schmittjoaopedro/schmittjoaopedro.github.io/blob/gh-pages/assets/other/SOQuestionGracefulShutdown.java
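
To make the expected behavior concrete, here is a minimal plain-JDK sketch (it is not the linked example, and the class and method names are hypothetical). It assumes a single-threaded ScheduledExecutorService running a job that sleeps, then calls shutdown() followed by awaitTermination(...) while the job is still sleeping. With the JDK's default policies, shutdown() should stop further scheduling without interrupting the job that is already running:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical class name, for illustration only.
class PlainGracefulShutdown {

    /** Returns true if the sleeping job finished without being interrupted by the shutdown. */
    static boolean sleepingJobSurvivesShutdown() {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean finished = new AtomicBoolean(false);

        executor.scheduleWithFixedDelay(() -> {
            started.countDown();
            try {
                Thread.sleep(500L);      // stands in for the blocking work
                finished.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 0L, 100L, TimeUnit.MILLISECONDS);

        try {
            started.await();             // make sure the job is running
            executor.shutdown();         // no new runs; the running job is left alone
            boolean terminated = executor.awaitTermination(5, TimeUnit.SECONDS);
            return terminated && finished.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Job finished cleanly: " + sleepingJobSurvivesShutdown());
    }
}
```

In a real application the shutdown() and awaitTermination(...) calls would typically live in a JVM shutdown hook so that they run on SIGTERM.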

References

https://stackoverflow.com/questions/76868362/spring-graceful-shutdown-not-waiting-for-scheduled-tasks-with-thread-sleep

Comment From: jhoeller

Along the lines of your programmatic example, have you tried explicitly specifying the scheduler as a bean:

    @Bean
    public TaskScheduler taskScheduler() {
        ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
        scheduler.setWaitForTasksToCompleteOnShutdown(true);
        scheduler.setAwaitTerminationSeconds(30);
        return scheduler;
    }

If that works, your problem might rather be in Spring Boot and its properties-configured default TaskScheduler.

Comment From: pf-joao-schmitt

Hi @jhoeller, yes I have; the same problem persists.

Comment From: jhoeller

So you are calling taskExecutor.getThreadPoolExecutor().awaitTermination(30, TimeUnit.SECONDS) in your programmatic example, which should be equivalent to the effect of setAwaitTerminationSeconds(30). Any idea where the difference in runtime behavior comes from? You could put a breakpoint in ExecutorConfigurationSupport.shutdown() and step through it. Provided that your configuration is correctly picked up and not accidentally ignored, I am curious to find out about the actual difference.

You could also try to add a scheduler.setThreadNamePrefix(...) call there and then see which thread actually invokes your @Scheduled method, ruling out the accidental ignoring of your configured TaskScheduler.
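
A plain-JDK analogue of that check, as a rough sketch only: the prefix "my-scheduler-" and the class name are hypothetical, and a ThreadFactory stands in for setThreadNamePrefix(...). The job reports the name of the thread that runs it, which shows whether the intended pool was actually used:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class name, for illustration only.
class ThreadNameCheck {

    // Hypothetical prefix; any distinctive value works.
    static final String PREFIX = "my-scheduler-";

    /** Runs one job on a pool with named threads and returns the name of the executing thread. */
    static String nameOfThreadRunningJob() {
        AtomicInteger sequence = new AtomicInteger();
        ThreadFactory factory = r -> new Thread(r, PREFIX + sequence.incrementAndGet());
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(factory);
        Callable<String> job = () -> Thread.currentThread().getName();
        try {
            return executor.schedule(job, 0L, TimeUnit.MILLISECONDS).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Job ran on thread: " + nameOfThreadRunningJob());
    }
}
```

In the Spring case, the equivalent would be logging Thread.currentThread().getName() inside the @Scheduled method and comparing it with the configured prefix.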

Comment From: jhoeller

FYI there is a significant revision of the ThreadPoolTaskExecutor/Scheduler lifecycle capabilities in the upcoming 6.1: see #30831, #27090, #24497. This includes an early soft shutdown signal on ContextClosedEvent for the parallel graceful shutdown of multiple executors. The interruption behavior is still controlled by the same settings in 6.1, there is just additional orchestration around it.

Nevertheless, the configuration options above should work fine in 6.0.x as well.

Comment From: pf-joao-schmitt

Hi @jhoeller, I took a look at the code to try to spot the differences.

As far as I can tell, ExecutorConfigurationSupport#awaitTerminationIfNecessary should do the work on shutdown. However, by adding a breakpoint in ExecutorConfigurationSupport#shutdown, I noticed that the InterruptedException happens even before the first call to this.executor.shutdown() (something earlier is affecting the threads).

Digging a bit further, I could see that ScheduledAnnotationBeanPostProcessor#destroy is called before the thread pool is shut down. This method iterates over all scheduledTasks and cancels each one by calling task.cancel(), which consequently causes the InterruptedException.
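
The effect of the cancel flag can be reproduced with the JDK alone. The following is a minimal sketch, not Spring's code: the class and method names are hypothetical, and it assumes a ScheduledFuture obtained from a plain ScheduledExecutorService. It cancels a job while it is sleeping and reports whether the sleep was interrupted:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical class name, for illustration only.
class CancelFlagDemo {

    /** Cancels a sleeping job with the given flag; returns true if the sleep was interrupted. */
    static boolean sleepInterruptedBy(boolean mayInterruptIfRunning) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        ScheduledFuture<?> future = executor.scheduleWithFixedDelay(() -> {
            started.countDown();
            try {
                Thread.sleep(500L);          // stands in for the blocking work
            } catch (InterruptedException e) {
                interrupted.set(true);       // the cancel call interrupted the running job
            }
        }, 0L, 100L, TimeUnit.MILLISECONDS);

        try {
            started.await();                 // make sure the job is running
            future.cancel(mayInterruptIfRunning);
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return interrupted.get();
    }

    public static void main(String[] args) {
        System.out.println("cancel(true) interrupted the sleep: " + sleepInterruptedBy(true));
        System.out.println("cancel(false) interrupted the sleep: " + sleepInterruptedBy(false));
    }
}
```

With cancel(true) the running job should see an InterruptedException, whereas cancel(false) should only prevent further runs and let the current one finish.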

Comment From: jhoeller

Thanks, that's very timely insight! Indeed, those cancel calls should rather use cancel(false) when participating in graceful shutdown scenarios, just preventing further scheduling but letting the existing tasks complete (unless the scheduler itself is configured to interrupt them). I'll narrow the purpose of this ticket for that cancel part, addressing it for the 6.0.12 release.

Comment From: pf-joao-schmitt

Thanks, looking forward to the fix for this issue!