Spring Boot Actuator documentation is misleading about the k8s startup probe

The Actuator reference says:

If an application takes longer to start than the configured liveness period, Kubernetes mentions the "startupProbe" as a possible solution. The "startupProbe" is not necessarily needed here as the "readinessProbe" fails until all startup tasks are done.

But in fact, "Liveness probes do not wait for readiness probes to succeed". That means that if your application takes a long time to start, k8s may kill it before its readinessProbe succeeds, so it will never be able to start successfully.

So startupProbe is really necessary.
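
To make the timing concrete, here is a rough sketch of the startup budget an application gets when only a liveness probe is configured. It is a simplified model, not kubelet code: it assumes the container is restarted after roughly initialDelaySeconds + periodSeconds × failureThreshold seconds of failed checks, and it ignores probe timeouts and scheduling jitter. The class and method names are hypothetical; the parameter names follow the Kubernetes probe fields, and the values used in main (0, 10, 3) are the Kubernetes defaults.

```java
// Simplified model of a liveness-only startup budget (an assumption, not kubelet code).
class LivenessBudget {

    /** Approximate seconds of failed liveness checks before the container is restarted. */
    static int restartAfterSeconds(int initialDelaySeconds, int periodSeconds, int failureThreshold) {
        return initialDelaySeconds + periodSeconds * failureThreshold;
    }

    /** True if an application that needs startupSeconds to report live is restarted first. */
    static boolean killedBeforeLive(int startupSeconds, int initialDelaySeconds,
                                    int periodSeconds, int failureThreshold) {
        return startupSeconds > restartAfterSeconds(initialDelaySeconds, periodSeconds, failureThreshold);
    }

    public static void main(String[] args) {
        // Kubernetes defaults: initialDelaySeconds=0, periodSeconds=10, failureThreshold=3.
        System.out.println(restartAfterSeconds(0, 10, 3));   // 30
        System.out.println(killedBeforeLive(60, 0, 10, 3));  // true
        System.out.println(killedBeforeLive(15, 0, 10, 3));  // false
    }
}
```

Under this model, whether the readiness probe has succeeded plays no part in the calculation, which is the point of the Kubernetes warning.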

Comment From: bclozel

If I understand correctly, the startupProbe is a way to have a "special case livenessProbe only at startup".

What we're trying to say here is that Spring Boot generally handles what's strictly necessary to get the application live and delays many initialization tasks (like ApplicationRunner instances) after that. The readinessProbe is marked as successful only when all those startup tasks are done. The application is technically live, just handling startup tasks and not receiving traffic until it's fully ready.

In many cases, long-running startup tasks are executed after the application is marked as live, so a startupProbe is not strictly necessary.

"Liveness probes do not wait for readiness probes to succeed".

I think this bit means that if your application has a successful readinessProbe and a failed livenessProbe, your application is considered as broken and will be wiped. Spring Boot is perfectly in line with that and I don't think that this section of the documentation states otherwise.

We don't completely rule out startupProbes; we're merely saying that you might not need one. Of course, some applications handle heavy startup tasks as part of the bean lifecycle (not a best practice from my point of view). Doing so ties those tasks to the context refresh phase and thus to the time it takes for the livenessProbe to report UP. In this case, a startupProbe is probably required if you don't want to extend the liveness check period too much.
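
When heavy work does happen during the context refresh, the startup probe needs a budget that covers the worst-case startup time. Below is a minimal sketch of that sizing arithmetic. It assumes the rule of thumb from the Kubernetes documentation that a startup probe tolerates about failureThreshold × periodSeconds seconds before the container is restarted. The class and method names are hypothetical.

```java
// Sizing sketch for a startup probe (rule-of-thumb arithmetic, not kubelet code).
class StartupProbeSizing {

    /**
     * Smallest failureThreshold such that failureThreshold * periodSeconds
     * covers the worst-case startup time. Always at least 1.
     */
    static int minFailureThreshold(int worstCaseStartupSeconds, int periodSeconds) {
        if (periodSeconds <= 0 || worstCaseStartupSeconds < 0) {
            throw new IllegalArgumentException("periodSeconds must be > 0 and startup time >= 0");
        }
        // Integer ceiling division: round up so the budget is never too short.
        int threshold = (worstCaseStartupSeconds + periodSeconds - 1) / periodSeconds;
        return Math.max(1, threshold);
    }

    public static void main(String[] args) {
        System.out.println(minFailureThreshold(300, 10)); // 30
        System.out.println(minFailureThreshold(95, 10));  // 10
        System.out.println(minFailureThreshold(0, 10));   // 1
    }
}
```

With a startup probe sized this way, the liveness probe can keep a short period and a low failure threshold, because it is not evaluated until the startup probe has succeeded.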

I'd be happy to improve the documentation - I'd rather not explain k8s internals in our reference documentation, but give general guidance to developers.

Comment From: ichenhe

What we're trying to say here is that Spring Boot generally handles what's strictly necessary to get the application live and delays many initialization tasks (like ApplicationRunner instances) after that.

Understood. What you're trying to say is that Spring usually starts very quickly and that slow tasks are delayed. But there's a use case:

I have several microservices (maybe 5 or more) and I deploy them all at once, so the server may be temporarily overloaded. In this case, startup will be slower than expected, and k8s will then kill them.

"Liveness probes do not wait for readiness probes to succeed. If you want to wait before executing a liveness probe you should use initialDelaySeconds or a startupProbe."

I think you got it wrong. I have now copied this warning in full. It clarifies one thing:

Many people may have misunderstood readiness probes because of the word "readiness". They (including me) believe that liveness will not be checked until the application is ready. For programs that start slowly, this misunderstanding is fatal.

That is why k8s wrote this warning.

But the Actuator documentation deepened my misunderstanding. Now I understand what is really meant, but we'd better improve the description. @bclozel

Comment From: bclozel

Looking at this table describing the application startup sequence and the probe states during the different phases, or at the ApplicationAvailability section, I'm not sure how we could improve our documentation. Any idea?
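
For reference, the probe states in that startup sequence can be sketched as a small Java model. This is a hedged illustration, assuming the three phases listed in the reference table (starting, started, ready). The ProbeStates class and its Phase enum are hypothetical and only mirror the state names of Spring Boot's LivenessState and ReadinessState; they are not Spring Boot code.

```java
// Hypothetical model of the startup phases and the probe state in each phase.
class ProbeStates {

    enum Phase {
        STARTING("BROKEN", "REFUSING_TRAFFIC"),  // application context not yet refreshed
        STARTED("CORRECT", "REFUSING_TRAFFIC"),  // context refreshed, startup tasks running
        READY("CORRECT", "ACCEPTING_TRAFFIC");   // startup tasks finished

        final String liveness;
        final String readiness;

        Phase(String liveness, String readiness) {
            this.liveness = liveness;
            this.readiness = readiness;
        }

        /** The liveness probe passes only once the liveness state is CORRECT. */
        boolean livenessProbePasses() {
            return liveness.equals("CORRECT");
        }

        /** The readiness probe passes only once the application accepts traffic. */
        boolean readinessProbePasses() {
            return readiness.equals("ACCEPTING_TRAFFIC");
        }
    }

    public static void main(String[] args) {
        for (Phase phase : Phase.values()) {
            System.out.println(phase + " liveness=" + phase.liveness + " readiness=" + phase.readiness);
        }
    }
}
```

The STARTING row is the one under discussion: while the liveness state is still BROKEN, Kubernetes can restart the application if that phase outlasts the liveness budget.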

Comment From: ichenhe

Personally, I think the explanation you just gave is very good. For example, we could write something like this:

Generally speaking, the "startupProbe" is not necessarily needed here as the "readinessProbe" fails until all startup tasks are done, which means Spring will not receive requests until it is ready. But if your application needs a long time to start (not a best practice), please add a "startupProbe" to make sure k8s won't kill it while it is starting.

In this way, we make two points:

There is no need to worry about receiving requests before startup has completed. (If I understand it correctly, that's what you want to say.)
If startup is slow, a startupProbe is required.

This also prevents misunderstandings like mine.

Comment From: stefanocke

I would like to mention that a common reason for slow startup (besides heavy load) might be a database migration (with Flyway, for example), since it happens before the actuator endpoints are available at all. Please correct me if I am wrong.

Comment From: rohanKanojia

@ichenhe @bclozel : Hi all, sorry for the naive question.

I also see a new /actuator/startup endpoint introduced since Spring Boot 2.5.0. Can this be used as a Kubernetes startup probe? Or is this a misunderstanding on my part?

Comment From: philwebb

@rohanKanojia Please see this section of the docs. If you have any further questions please ask on stackoverflow.com or join us at gitter.im.