Spring Graceful shutdown does not cancel @Scheduled tasks

Steps to reproduce:

1. Create a minimal spring-boot 3.2.2 project
2. Add @EnableScheduling to the application
3. Define a taskScheduler bean of type ThreadPoolTaskScheduler
4. Create a method annotated with @Scheduled(fixedRate=1000)
5. Have a long running process in that method
6. In application.properties set server.shutdown=graceful and spring.lifecycle.timeout-per-shutdown-phase=5s

If I understand the docs correctly, the long running process should be canceled immediately and the task scheduler should be destroyed.

However, when signaling the application to shut down, the long running process is not aborted immediately. Instead, I get an error message after 5 seconds: Failed to shut down 1 bean with phase value 2147483647 within timeout of 5000ms: [taskScheduler]

If I configure the taskScheduler with

taskScheduler.setWaitForTasksToCompleteOnShutdown(true);
taskScheduler.setAwaitTerminationMillis(0);

the task is canceled immediately.

Minimal example

    @Bean
    public TaskScheduler taskScheduler() {
        var taskScheduler = new ThreadPoolTaskScheduler() {
            @Override
            public void destroy() {
                log.info("taskScheduler Destroy");
                super.destroy();
            }
        };
        taskScheduler.setPoolSize(10);
        taskScheduler.setWaitForTasksToCompleteOnShutdown(false); // this doesn't result in task cancelation.

//        taskScheduler.setWaitForTasksToCompleteOnShutdown(true);
//        taskScheduler.setAwaitTerminationMillis(0);  // this results in immediate task cancelation

        return taskScheduler;
    }

    @Scheduled(fixedRate = 1000)
    public void scheduled() {
        while (true) {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
    }

    @PreDestroy
    void predestroy() {
        log.info("predestroy");
    }

The TaskScheduler's destroy method and the predestroy() method are not called until after the 5-second timeout. If I configure the taskScheduler with setWaitForTasksToCompleteOnShutdown(true) and setAwaitTerminationMillis(0), these methods are called immediately.

Is there a misunderstanding on my part, an error in the docs, or a bug?

Comment From: jhoeller

This is a surprisingly nuanced topic given all the input and feedback we had on this over the years.

A key idea behind a graceful shutdown is to let existing tasks complete as far as possible, concurrently in case of multiple executors/schedulers. This was explicitly requested for scheduled tasks (#31019) even before the lifecycle revision in 6.1, and after the lifecycle revision there is dedicated support for such a mode of shutdown now.

From that perspective, the behavior that you are experiencing is by design: for a graceful shutdown, we cancel recurring tasks so that further triggers no longer fire, but let running tasks complete concurrently within the managed stop phase. If your task takes longer than that to complete, could you redesign it for shorter but more frequent triggering? Otherwise, just set a short enough lifecycle timeout and let it run into that info-level log message (which is not meant to be an error; maybe we should avoid the "failed" term there), followed by an interrupt on remaining tasks for a hard shutdown. No need to set any extra flags for this: you can rely on the default arrangement and set a custom lifecycle timeout.
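The sequence described here (stop further triggers, give running tasks a bounded time to finish, then interrupt whatever is left) can be illustrated with a plain JDK ScheduledExecutorService. This is only a sketch of the same idea, not Spring's implementation; the class name TwoPhaseStopSketch, the method stopWithGracePeriod and the grace period passed to it are assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class: plain JDK only, no Spring involved.
class TwoPhaseStopSketch {

    /** Returns true if the long-running task had to be interrupted after the grace period. */
    static boolean stopWithGracePeriod(long graceMillis) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        AtomicBoolean interrupted = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);

        // A recurring task that never finishes on its own, like the one in the report.
        scheduler.scheduleAtFixedRate(() -> {
            started.countDown();
            try {
                while (true) {
                    Thread.sleep(50);
                }
            } catch (InterruptedException e) {
                interrupted.set(true);              // record that an interrupt arrived
                Thread.currentThread().interrupt(); // restore the flag and return
            }
        }, 0, 1, TimeUnit.SECONDS);

        try {
            started.await();

            // Phase 1: no further triggers fire; the running task is left alone.
            scheduler.shutdown();
            boolean finishedInTime = scheduler.awaitTermination(graceMillis, TimeUnit.MILLISECONDS);

            // Phase 2: the grace period is over, so interrupt what is still running.
            if (!finishedInTime) {
                scheduler.shutdownNow();
                scheduler.awaitTermination(1, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return interrupted.get();
    }
}
```

Calling TwoPhaseStopSketch.stopWithGracePeriod(200) returns true: the endless task survives shutdown() and only ends once shutdownNow() interrupts it after the grace period.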

The (old) waitForTasksToCompleteOnShutdown flag changes that behavior, effectively bypassing the concurrent managed stop phase in favor of awaiting a serial shutdown in each executor's destroy method (the common pre-6.1 behavior), potentially taking significant amounts of serial time in case of multiple executors/schedulers (depending on the await-termination setting). Note that this does not actually interrupt running tasks: With a zero-second wait period, it simply lets the JVM end, hard-stopping any remaining threads.

The (new) acceptTasksAfterContextClose flag lets you opt out of the concurrent managed stop phase as well but with a default hard interrupt for remaining tasks on shutdown. So for your desired immediate interrupt-on-shutdown behavior, you should actually set that flag instead of waitForTasksToCompleteOnShutdown. That way you'll get an interrupt on the blocked threads before the JVM shuts down, letting them end in an orderly fashion.

All things considered, I actually recommend the default shutdown behavior with a custom lifecycle timeout, possibly even shorter than 5s. We can revise the wording of that log message if that's the main irritation, e.g. "Shutdown phase 2147483647 ends with 1 bean still running after timeout of 5000ms: [taskScheduler]".

Comment From: MelvinFrohike

Thanks for the detailed response. I see now how to handle my use-case.

Your suggested change to the log output already helps clear up the confusion. However, I don't think it is enough, as to me both the API and the documentation are confusing. IMO, it is not intuitive to have these calls result in a hard shutdown:

 taskScheduler.setAcceptTasksAfterContextClose(true);
 taskScheduler.setAwaitTerminationMillis(0);

Neither the names nor the documentation show that these two methods are in any way related. The first call in particular does not seem to have any effect on already running tasks.

In contrast, the setWaitForTasksToCompleteOnShutdown(false) method seems to result in immediate cancelation (implying to me an awaitTerminationMillis setting of 0).

While changing the API to be more intuitive might be tricky due to backwards compatibility, I would suggest clearing up the documentation of these methods.

Thanks again.

Comment From: MelvinFrohike

I've spoken too soon about knowing how to handle my use-case.

For a bit more context, I have a project with a graceful shutdown so that active requests are still completed (within the time limit). I also have a series of tasks running, some of them via @Scheduled. I need to kill only one of these scheduled tasks immediately without waiting for the rest of the application to shut down gracefully.

I've tried to create two taskSchedulers: one "normal" one and one with these lines:

 taskScheduler.setAcceptTasksAfterContextClose(true);
 taskScheduler.setAwaitTerminationMillis(0);

I use the latter one in the scheduled task that should be canceled immediately.

When shutting down the application while no other task is running and no request is being completed, the task is canceled immediately, just as I need it to be.

However, when either another scheduled task is running (with the "normal" taskScheduler) or a long running request is being completed, my special task is only canceled once the other task or request has completed or timed out.

Thus it seems to me that setAcceptTasksAfterContextClose does not affect cancelation of its tasks when there are other tasks. I've also tried to use taskScheduler.setWaitForTasksToCompleteOnShutdown(true) with the same effect.

How can I get one taskScheduler to cancel its tasks immediately while still retaining the graceful shutdown for other taskSchedulers and endpoints?

I feel this question may no longer be appropriate in an issue and should move to a discussion, but I am not sure the described behavior is intended.

Comment From: jhoeller

Thanks for sharing your scenario there, this is useful insight. All of this input helps us revise our documentation.

Some of those configuration options have legacy behind them. We try to keep them intact for backwards-compatible behavior in existing applications and also for enforcing pre-6.1 behavior in new setups if necessary. The name of the setting often reflects the original purpose but the overall semantics are not very obvious indeed. Also, please note that those setter methods only affect the local TaskScheduler instance; other TaskScheduler instances operate independently according to their own configuration. If certain tasks go through graceful stopping on one scheduler, that lifecycle step happens before any beans - including other schedulers - reach their destroy step; that's a consequence of the unified lifecycle model.

As for your special task, you could try to specifically react to a ContextClosedEvent in your endpoint implementation. Or we could provide an arrangement for immediately interrupting tasks at ContextClosedEvent time in ThreadPoolTaskScheduler, calling ExecutorService.shutdownNow (the only way to interrupt tasks within an ExecutorService) at that time already. This would happen immediately even next to other TaskSchedulers with graceful shutdown setups then.
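To illustrate the point about ExecutorService.shutdownNow, here is a minimal plain JDK sketch in which one executor is interrupted right away while a second one is left to drain gracefully. It does not use ThreadPoolTaskScheduler or ContextClosedEvent; the class ImmediateInterruptSketch, the method closeBoth and the sleep durations are assumptions made for the example, and the point where closeBoth stops the executors merely stands in for the moment the context closes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class: plain JDK only, no Spring involved.
class ImmediateInterruptSketch {

    /** Returns { specialTaskInterrupted, normalTaskCompleted }. */
    static boolean[] closeBoth() {
        ExecutorService special = Executors.newSingleThreadExecutor();
        ExecutorService normal = Executors.newSingleThreadExecutor();
        AtomicBoolean specialInterrupted = new AtomicBoolean(false);
        AtomicBoolean normalCompleted = new AtomicBoolean(false);
        CountDownLatch bothStarted = new CountDownLatch(2);

        // The task that must die immediately on shutdown.
        special.execute(() -> {
            bothStarted.countDown();
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                specialInterrupted.set(true);
            }
        });

        // A task that should be allowed to finish its remaining work.
        normal.execute(() -> {
            bothStarted.countDown();
            try {
                Thread.sleep(300);
                normalCompleted.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            bothStarted.await();
            special.shutdownNow(); // interrupts the running task right away
            normal.shutdown();     // accepts no new work, lets the running task finish
            special.awaitTermination(1, TimeUnit.SECONDS);
            normal.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new boolean[] { specialInterrupted.get(), normalCompleted.get() };
    }
}
```

In this sketch the two executors are stopped independently, so the interrupt on the special one does not wait for the other to drain.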