Hi,

I started hacking a solution that would close Cassandra connections on checkpoint, following [1] and registering a Lifecycle that would let the Cassandra session 'suspend'. Checkpoint and restore work in my prototype when I issue the checkpoint via jcmd <pid> JDK.checkpoint, but if I try an automatic checkpoint through -Dspring.context.checkpoint=onRefresh, Lifecycle.stop() is not called: at this point DefaultLifecycleProcessor.running is false and stopForRestart() is not invoked.

I would like to ask what's the reason and if there's any better pattern; [2] explains that the connections are established too early (before first refresh). Is that a show-stopper? We can close those connections and keep going.

An alternative solution would be to not use the Lifecycle and register as an org.crac.Resource; however, I understand that integrating the handling into Spring through Lifecycle is preferred over using org.crac directly.

CC @sdeleuze

[1] https://github.com/spring-projects/spring-boot/blob/main/spring-boot-project/spring-boot-autoconfigure/src/main/java/org/springframework/boot/autoconfigure/jdbc/DataSourceCheckpointRestoreConfiguration.java

[2] https://github.com/spring-projects/spring-data-cassandra/issues/1486

Comment From: jhoeller

Generally, Lifecycle.stop() is only invoked for components that received a Lifecycle.start() callback before, i.e. only what got started actually needs to be stopped. As long as the activity that you intend to stop in stop() only gets started in start(), you should be fine.

This is also the pattern followed for an onRefresh checkpoint: We aim to not even start the components there yet, in order to avoid the need for explicit stopping to begin with. Only for later checkpoints at runtime, there is an actual need to stop components.

For core framework components, we explicitly have all such activity in start() methods rather than regular init methods for that reason. In that sense, init methods are mirrored by destroy methods (matching bean instance creation/destruction), while start methods are mirrored by stop methods (for dynamic activities that can get started/stopped/restarted in the same bean instances).
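As a rough illustration of that symmetry, here is a minimal sketch, not the framework's actual code. The nested Lifecycle interface is a local stand-in with the same three methods as org.springframework.context.Lifecycle, declared only so the example compiles without Spring on the classpath. PollingComponent and stopAll are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSymmetryDemo {

    // Local stand-in (assumption): same method names as Spring's Lifecycle,
    // declared here so the sketch is self-contained.
    interface Lifecycle {
        void start();
        void stop();
        boolean isRunning();
    }

    // Hypothetical component: the constructor (init phase) starts no activity;
    // the dynamic activity exists only between start() and stop().
    static class PollingComponent implements Lifecycle {
        final List<String> events = new ArrayList<>();
        private boolean running;

        PollingComponent() {
            events.add("init");
        }

        @Override
        public void start() {
            if (!running) {
                running = true;
                events.add("start");
            }
        }

        @Override
        public void stop() {
            if (running) {
                running = false;
                events.add("stop");
            }
        }

        @Override
        public boolean isRunning() {
            return running;
        }
    }

    // Mimics the rule described above: stop() is delivered only to
    // components that are currently running.
    static void stopAll(List<? extends Lifecycle> components) {
        for (Lifecycle component : components) {
            if (component.isRunning()) {
                component.stop();
            }
        }
    }

    public static void main(String[] args) {
        PollingComponent neverStarted = new PollingComponent();
        PollingComponent started = new PollingComponent();
        started.start();

        stopAll(List.of(neverStarted, started));

        System.out.println(neverStarted.events); // [init]
        System.out.println(started.events);      // [init, start, stop]
    }
}
```

A component that opens its connections in the constructor has no counterpart to the second case: by the time anything could call stop(), the activity is already running without a start() ever having been recorded.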

Comment From: rvansa

@jhoeller Thanks for those answers! The symmetry makes sense - could you make any suggestion for integration of components that don't have a proper lifecycle and do startup as part of bean instantiation? I've checked Cassandra code and the CqlSession (DefaultSession) starts the connections from constructor eagerly, while it serves as a bean, too, so it's instantiated in order to be injected. Not sure about proxy options here. I might be able to override a handful of components to enforce a lazy start, but that feels rather intrusive.

I've placed a few breakpoints into the starting application, and it seems that DefaultLifecycleProcessor.start() is not invoked at all (.refresh() is invoked from ApplicationContext <- SpringApplication.run). The processor becomes running through the refresh, and only the SmartLifecycle beans get started as part of it. So effectively this breaks the concept that a component has to be started to become running.

I've implemented the lifecycle with running = true from the start, which is probably a hack. However since component stopping already checks if Lifecycle.isRunning(), it sounds safe to ignore the status of DLP itself (just a little excessive to go through the beans to see that, in the general case, none is started).

Comment From: sdeleuze

@rvansa Is the CqlSession exposed as a Spring bean by Spring Data/Boot ?

Comment From: rvansa

@sdeleuze Yes, it is produced by CassandraAutoConfiguration:

    @Bean
    @ConditionalOnMissingBean
    @Lazy
    public CqlSession cassandraSession(CqlSessionBuilder cqlSessionBuilder)

Comment From: sdeleuze

I am not sure we have something actionable on the Spring Framework side. I mean, the choice to directly expose beans of a third-party class which opens a connection very early, at constructor level, is not a good fit for training run use cases. I see 2 potential ways to solve this.

Either at Spring Boot level (see related issue above) or Spring Data level, the CqlSession is wrapped into a proper lifecycle bean (which would have side effects).

Or alternatively, on the Cassandra side, you can either ask them for a way to not create the connection eagerly in the constructor, or ask them to introduce an org.crac.Resource internally.
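For the first option, a wrapper could look roughly like the sketch below. This is only an assumption about the shape of such a bean, not existing Spring Boot or Spring Data code. The nested Lifecycle interface is a local stand-in for org.springframework.context.Lifecycle, DeferredSession and FakeSession are hypothetical names, the type parameter T stands in for CqlSession, and the Supplier stands in for something like a builder's build call.

```java
import java.util.function.Supplier;

public class DeferredSessionDemo {

    // Local stand-in (assumption): same method names as Spring's Lifecycle.
    interface Lifecycle {
        void start();
        void stop();
        boolean isRunning();
    }

    // Hypothetical holder: the eager constructor of the wrapped session is
    // only reached from start(), and stop() closes it again.
    static class DeferredSession<T extends AutoCloseable> implements Lifecycle {
        private final Supplier<T> factory;
        private T session;

        DeferredSession(Supplier<T> factory) {
            this.factory = factory;
        }

        @Override
        public synchronized void start() {
            if (session == null) {
                session = factory.get(); // connections are opened here, not at bean creation
            }
        }

        @Override
        public synchronized void stop() {
            if (session != null) {
                try {
                    session.close();
                } catch (Exception e) {
                    throw new IllegalStateException("Failed to close session", e);
                } finally {
                    session = null; // a later start() builds a fresh session
                }
            }
        }

        @Override
        public synchronized boolean isRunning() {
            return session != null;
        }

        // Callers must go through the holder instead of injecting the session itself.
        public synchronized T get() {
            if (session == null) {
                throw new IllegalStateException("Session has not been started");
            }
            return session;
        }
    }

    // Hypothetical stand-in for a session that connects in its constructor.
    static class FakeSession implements AutoCloseable {
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        DeferredSession<FakeSession> holder = new DeferredSession<>(FakeSession::new);
        System.out.println(holder.isRunning()); // false: nothing opened yet

        holder.start();
        FakeSession first = holder.get();
        holder.stop();                          // e.g. before a checkpoint
        System.out.println(first.closed);       // true

        holder.start();                         // e.g. after restore
        System.out.println(holder.get() != first); // true: a new session was created
    }
}
```

The side effect mentioned above shows up in get(): code that currently injects the session directly would have to obtain it through the holder, and any reference kept across a stop() becomes stale.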

@rvansa Are you fine with us closing this issue?