This was originally reported by a Spring Cloud user (someone trying to upgrade the Petclinic microservice edition to Spring Boot 2.0 - so it works with 1.5): https://github.com/spring-cloud/spring-cloud-commons/issues/355.
I'm still a bit mystified as to why it happens, but the evidence is pointing to something in the DataSourceInitializer and its configuration (there's a BeanPostProcessor that forcibly instantiates the DataSourceInitializerInvoker when it sees a DataSource).
Here's a really small app that blows up in the same way (with no Spring Cloud dependencies). You need a schema.sql and a data.sql to make it fail, and you need to set spring.datasource.initialization-mode=embedded (or always) to trigger the initialization. Dependencies are just h2 and the JPA starter.
```java
@SpringBootApplication
public class BeanApplication {

    @Bean
    public static BeanFactoryPostProcessor fooScope() {
        return beanFactory -> {
            beanFactory.registerScope("foo", new SimpleThreadScope());
        };
    }

    @Bean
    @Scope("foo")
    @ConfigurationProperties(prefix = "spring.datasource.hikari")
    public HikariDataSource dataSource(DataSourceProperties properties) {
        HikariDataSource dataSource = properties.initializeDataSourceBuilder()
                .type(HikariDataSource.class).build();
        return dataSource;
    }

    public static void main(String[] args) {
        SpringApplication.run(BeanApplication.class, args);
    }

}
```
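For completeness, a minimal application.properties to go with the class above might look like this (an illustrative sketch, not taken from the linked project):

```properties
# schema.sql and data.sql sit on the classpath root (e.g. src/main/resources);
# initialization is triggered by this property ("always" works too).
spring.datasource.initialization-mode=embedded
```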
If you exclude the JPA auto-configurations (exclude = { HibernateJpaAutoConfiguration.class, JpaRepositoriesAutoConfiguration.class }), the problem goes away. Here's some code on GitHub: https://github.com/scratches/datasource-cycle.
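Concretely, the exclusion can be applied like this (a sketch assuming the same application as above; the class name is illustrative):

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.data.jpa.JpaRepositoriesAutoConfiguration;
import org.springframework.boot.autoconfigure.orm.jpa.HibernateJpaAutoConfiguration;

// Excluding the two JPA auto-configurations makes the failure described above disappear.
@SpringBootApplication(exclude = { HibernateJpaAutoConfiguration.class,
        JpaRepositoriesAutoConfiguration.class })
public class WorkingBeanApplication {

    public static void main(String[] args) {
        SpringApplication.run(WorkingBeanApplication.class, args);
    }

}
```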
While I was investigating the Cloud issue I also managed to find some combinations of auto configurations that led to other cycles (not involving JPA, but still with a scoped data source), so there is more than one way to tickle this.
Comment From: snicoll
there's a BeanPostProcessor that forcibly instantiates the DataSourceInitializerInvoker when it sees a DataSource
There is one in 1.5.x as well, so I am not sure how that's relevant to the description.
I guess we need to figure out what has changed with regards to the scope thing.
Comment From: dsyer
This fails with 1.5 (in a different way):
```java
@SpringBootApplication
public class BeanApplication {

    @Bean
    public static BeanFactoryPostProcessor fooScope() {
        return beanFactory -> {
            beanFactory.registerScope("foo", new SimpleThreadScope());
        };
    }

    @Bean
    @Scope("foo")
    @ConfigurationProperties(prefix = "spring.datasource")
    public DataSource dataSource(DataSourceProperties properties) {
        return DataSourceBuilder.create().url(properties.determineUrl())
                .type(org.apache.tomcat.jdbc.pool.DataSource.class).build();
    }

    public static void main(String[] args) {
        SpringApplication.run(BeanApplication.class, args);
    }

}
```
It's on the 1.5.x branch of the same repo.
Comment From: wilkinsona
The 1.5.x version fails because the DataSourceInitializer ends up with this.dataSource being null. This happens because this.applicationContext.getBeanNamesForType(DataSource.class, false, false) returns an empty array. The array is empty because the dataSource bean definition doesn't match. The bean definition doesn't match because it's scoped, which causes org.springframework.beans.factory.support.AbstractBeanFactory.isSingleton(String) to return false. This matters because of this logic in Framework.
As far as I can tell, there's no circular reference problem in this case. The problem is that calling this.applicationContext.getBeanNamesForType(DataSource.class, false, false), as DataSourceInitializer does, will never be able to find a DataSource that's in a scope. You can see this by removing the data.sql and schema.sql scripts from the 1.5.x branch of the sample and changing its main method to the following:
```java
public static void main(String[] args) {
    String[] dataSources = SpringApplication.run(BeanApplication.class, args)
            .getBeanNamesForType(DataSource.class, false, false);
    System.out.println(Arrays.toString(dataSources));
}
```
It will output an empty array.
Comment From: wilkinsona
The 2.0.x version fails because of a difference in how singleton beans and prototype beans are handled when they are requested while still being created. In the singleton case, there's some fallback logic that allows the singleton that's in creation to be retrieved. In the prototype case, there's no such fallback and a BeanCurrentlyInCreationException is thrown.
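A minimal, hypothetical sketch of that difference (illustrative names, not Spring Boot code): a post-processor that looks up the very bean it is post-processing succeeds for a singleton thanks to the early-reference fallback, but fails for a prototype.

```java
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.BeanFactory;
import org.springframework.beans.factory.BeanFactoryAware;
import org.springframework.beans.factory.config.BeanPostProcessor;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Scope;

@Configuration
public class InCreationLookupDemo {

    static class Target {
    }

    @Bean
    @Scope("prototype") // remove this line and the lookup in the post-processor succeeds
    public Target target() {
        return new Target();
    }

    @Bean
    public static LookupPostProcessor lookupPostProcessor() {
        return new LookupPostProcessor();
    }

    static class LookupPostProcessor implements BeanPostProcessor, BeanFactoryAware {

        private BeanFactory beanFactory;

        @Override
        public void setBeanFactory(BeanFactory beanFactory) {
            this.beanFactory = beanFactory;
        }

        @Override
        public Object postProcessAfterInitialization(Object bean, String beanName) throws BeansException {
            if (bean instanceof Target) {
                // looks up "target" while it is still in creation, which is the pattern
                // the DataSource post-processing ends up triggering
                this.beanFactory.getBean("target");
            }
            return bean;
        }

    }

    public static void main(String[] args) {
        try (AnnotationConfigApplicationContext context = new AnnotationConfigApplicationContext(
                InCreationLookupDemo.class)) {
            // prototype: fails with BeanCurrentlyInCreationException; singleton: the early reference is returned
            context.getBean("target");
        }
    }

}
```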
Comment From: wilkinsona
In SPR-16783, @jhoeller said that "the recommendation is to never access an object within its own post-processing phase to begin with". This is exactly what DataSourceInitializerPostProcessor does when it gets the DataSourceInitializer (1.5) or DataSourceInitializerInvoker (2.0) beans. Putting aside Framework's inconsistent handling of retrieving a singleton that's in creation vs retrieving a prototype that's in creation, we probably need to find another approach anyway.
Comment From: dsyer
The "singleton" in question isn't the same as the default scope for beans (I think). It's "raw" singletons (the ones that are registered directly as instances, not bean definitions). The DataSource
isn't in that category. I'm not sure how that helps, though.
Comment From: wilkinsona
The "singleton" in question isn't the same as the default scope for beans (I think). It's "raw" singletons (the ones that are registered directly as instances, not bean definitions).
I'm pretty sure it's the same. When I remove the @Scope
and step through this code, the dataSource
bean is marked as being in creation but it's still accessible.
Comment From: dsyer
OK, maybe the raw singletons get cached in the same place as the bean definition ones.
I added some non-hibernate samples to my repro project, here: https://github.com/scratches/datasource-cycle/tree/master/src/main/java/org/springframework/boot/autoconfigure/jdbc. They are easier to analyse (fewer beans), but the exception is slightly different.
Comment From: wilkinsona
I added some non-hibernate samples to my repro project
Thanks. BrokenApplication exhibits the same problem as I described above, failing because the dataSource prototype is in creation when an attempt is made to retrieve it.
I'm intrigued by the other two as they look like they should fail in the same way. Time to do some digging.
Comment From: wilkinsona
NotBrokenApplication is not broken because of an ordering change in the configuration (DataSourceInitializationConfiguration going before DataSourceJmxConfiguration). This ordering change means that afterPropertiesSet on DataSourceInitializerInvoker is called outside of the post-processing of the DataSource. This means that the attempt to retrieve the DataSource while it's being created is avoided.
Comment From: wilkinsona
AnotherNotBrokenApplication appears to not be broken for the same reason as NotBrokenApplication. Updating the ordering of the imported configuration classes so that DataSourceJmxConfiguration goes before DataSourceInitializationConfiguration causes it to fail due to a cycle.
Comment From: wilkinsona
When the 2.0.x version of BeanApplication fails, it does so because the early retrieval of LoadTimeWeaverAware beans has triggered the creation and post-processing of the DataSource bean before DataSourceInitializerInvoker has been created, because the LocalContainerEntityManagerFactoryBean is load time weaver-aware and depends on the DataSource. This leads to the DataSourceInitializerInvoker bean being created during the post-processing of the DataSource bean.
The problem can be avoided by making DataSourceInitializerInvoker implement LoadTimeWeaverAware. Due to auto-configuration ordering, it's defined before the definition of the LocalContainerEntityManagerFactoryBean so, if it's LoadTimeWeaverAware, it's created and initialised first, which allows it to retrieve the DataSource when it isn't already under creation.
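Shape-wise, the workaround looks roughly like this (a sketch with an illustrative class name, not the actual Spring Boot source):

```java
import org.springframework.context.weaving.LoadTimeWeaverAware;
import org.springframework.instrument.classloading.LoadTimeWeaver;

// Implementing LoadTimeWeaverAware makes the context create this bean during the early
// retrieval of LoadTimeWeaverAware beans, i.e. before the DataSource is under creation.
class EagerDataSourceInitializerInvoker implements LoadTimeWeaverAware {

    @Override
    public void setLoadTimeWeaver(LoadTimeWeaver loadTimeWeaver) {
        // the weaver itself is irrelevant; the callback exists only to force early initialisation
    }

}
```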
Comment From: spencergibb
Thanks for all the research @wilkinsona and @dsyer
Comment From: dsyer
if it's LoadTimeWeaverAware, it's created and initialised first
That would fix the issue in Spring Cloud because the other BrokenApplication cycle can be broken (for reasons I do not understand) just by declaring the Scope as a @Bean directly instead of using a BeanDefinitionRegistry.
Comment From: wilkinsona
I've yet to convince myself that implementing LoadTimeWeaverAware is the right thing to do. It causes the DataSource to be initialized very early in all applications, irrespective of whether or not they use JPA. On the other hand, if it works in a JPA app, it should probably work elsewhere too. It just feels wrong to be deliberately causing early initialisation of the DataSource… It also feels like a bit of a hack as we don't actually care about the LoadTimeWeaver. However, as far as I can tell, there's no less hacky way for us to trigger things early.
Comment From: dsyer
Tough decision. I suppose we could have 2 identical invokers, one that is LoadTimeWeaverAware and one that is not, and only create a bean for the first one when JPA is active. With a base class doing all the work it wouldn't be completely horrible.
Comment From: kentoj
Could this be due to a change introduced by the Hikari connection pooler?
If I use Apache DBCP2 with spring.datasource.type: org.apache.commons.dbcp2.BasicDataSource and have spring.datasource.initialization-mode: always, then everything starts up fine and both scripts are run properly. However, if I use the default of Hikari then things break.
Comment From: wilkinsona
That's very interesting. Thank you, @kentoj. Moving to BasicDataSource appears to help in some scenarios but not in others. Using 2.0.2.RELEASE, this fails:
```java
@SpringBootApplication
public class BeanApplication {

    @Bean
    public static BeanFactoryPostProcessor fooScope() {
        return beanFactory -> {
            beanFactory.registerScope("foo", new SimpleThreadScope());
        };
    }

    @Bean
    @Scope("foo")
    @ConfigurationProperties(prefix = "spring.datasource.dbcp2")
    public BasicDataSource dataSource(DataSourceProperties properties) {
        BasicDataSource dataSource = properties.initializeDataSourceBuilder()
                .type(BasicDataSource.class).build();
        return dataSource;
    }

    public static void main(String[] args) {
        SpringApplication.run(BeanApplication.class, args);
    }

}
```
But this does not:
```java
@SpringBootConfiguration
@EnableConfigurationProperties(DataSourceProperties.class)
@Import({ DataSourceConfiguration.Dbcp2.class, DataSourceJmxConfiguration.class,
        DataSourceInitializationConfiguration.class })
public class BrokenApplication {

    @Component
    protected static class RefreshScopeConfiguration
            implements BeanDefinitionRegistryPostProcessor {

        @Override
        public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory)
                throws BeansException {
        }

        @Override
        public void postProcessBeanDefinitionRegistry(BeanDefinitionRegistry registry)
                throws BeansException {
            registry.registerBeanDefinition("fooScope",
                    BeanDefinitionBuilder.genericBeanDefinition(FooScope.class)
                            .setRole(BeanDefinition.ROLE_INFRASTRUCTURE)
                            .getBeanDefinition());
        }

    }

    @Bean
    @Scope("foo")
    @ConfigurationProperties(prefix = "spring.datasource.dbcp2")
    public BasicDataSource dataSource(DataSourceProperties properties) {
        BasicDataSource dataSource = properties.initializeDataSourceBuilder()
                .type(BasicDataSource.class).build();
        return dataSource;
    }

    public static void main(String[] args) {
        SpringApplication.run(BrokenApplication.class, args);
    }

}

class FooScope implements org.springframework.beans.factory.config.Scope, BeanFactoryPostProcessor {

    @Override
    public Object get(String name, ObjectFactory<?> objectFactory) {
        return objectFactory.getObject();
    }

    @Override
    public Object remove(String name) {
        return null;
    }

    @Override
    public void registerDestructionCallback(String name, Runnable callback) {
    }

    @Override
    public Object resolveContextualObject(String key) {
        return null;
    }

    @Override
    public String getConversationId() {
        return null;
    }

    @Override
    public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory)
            throws BeansException {
        beanFactory.registerScope("foo", this);
    }

}
```
Comment From: wilkinsona
BrokenApplication is no longer broken with DBCP2 as the switch has been enough to prevent the DataSource from being retrieved while its prototype is in creation. I don't yet know why.
BeanApplication continues to be broken due to the special handling of LoadTimeWeaverAware beans described above.
Comment From: snicoll
For the record, a workaround has been implemented in Spring Cloud for this, see https://github.com/spring-cloud/spring-cloud-commons/commit/0e259eab355be0bff2791715321bdab828cf0422
Comment From: wilkinsona
I've opened a Framework issue so that we can address this without resorting to the LoadTimeWeaverAware hack. This may mean that fixing this ends up being a duplicate of https://github.com/spring-projects/spring-boot/issues/9528, but let's keep this issue open for now so it doesn't fall through the cracks.
Comment From: wilkinsona
The Framework issue is unlikely to make it into 5.1, so we'll need to see if we can do this with a BeanPostProcessor after all.
My current thinking is that we can introduce a bean that acts as a registry for DataSource beans that need to be initialized. Where such a bean is defined, it is registered with the registry. Via a bean post-processor, the registry then initialises each bean after it has been created. This removes the need for the bean post-processor to look up other beans and scopes DataSource initialisation down to those data sources that have opted into it, while allowing initialisation to take place after binding of the DataSource's properties etc.
That's the theory. It remains to be seen if it'll work in practice.
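A very rough sketch of that idea (hypothetical names, not the actual proposal's code) might look like this:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import javax.sql.DataSource;

import org.springframework.beans.BeansException;
import org.springframework.beans.factory.config.BeanPostProcessor;

// DataSource bean definitions opt in to initialisation by registering themselves here.
class DataSourceInitializationRegistry {

    private final Set<String> beanNames = ConcurrentHashMap.newKeySet();

    void register(String beanName) {
        this.beanNames.add(beanName);
    }

    boolean isRegistered(String beanName) {
        return this.beanNames.contains(beanName);
    }

    void initialize(DataSource dataSource) {
        // run the configured schema.sql / data.sql scripts against the given DataSource
    }

}

// The post-processor no longer needs to look up any other beans, avoiding the in-creation problem.
class DataSourceInitializationPostProcessor implements BeanPostProcessor {

    private final DataSourceInitializationRegistry registry;

    DataSourceInitializationPostProcessor(DataSourceInitializationRegistry registry) {
        this.registry = registry;
    }

    @Override
    public Object postProcessAfterInitialization(Object bean, String beanName) throws BeansException {
        // initialisation happens after creation and property binding of the opted-in DataSource
        if (bean instanceof DataSource && this.registry.isRegistered(beanName)) {
            this.registry.initialize((DataSource) bean);
        }
        return bean;
    }

}
```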
Comment From: wilkinsona
With thanks to @dsyer and @snicoll for their feedback thus far, here's an implementation of the approach I described above. It seems to work in that spring-boot-autoconfigure builds cleanly.
Comment From: wilkinsona
Unfortunately there are more worms in the can. The proposal works up until we get to @JdbcTest and @AutoConfigureTestDatabase. At that point it gets very hard to figure out if the replacement database should be initialised and, if it should, what properties should be used for initialisation. @philwebb had the idea of moving to a declarative @InitializeDataSource-based approach that would help quite a bit, but won't get us the whole way. There are still a number of problems with data source initialisation that we'd really like to address: the blurring of properties used to create the DataSource bean and properties that are used purely for initialisation, differences in when initialisation will actually create a whole new DataSource and when it will reuse one, etc.
Comment From: zhengzizhi
Spring Boot 2.0.4.RELEASE
When Spring Boot 2 initializes a MySQL database with data.sql and schema.sql scripts, three sets of username and password must be provided. This is not mentioned on the official website or anywhere else on the Internet. The correct configuration format is as follows:
```yaml
spring:
  profiles: development
  datasource:
    url: jdbc:mysql://192.168.99.100:3306/dev?useSSL=false
    username: root
    password: dbpass
    schema-username: root
    schema-password: dbpass
    data-username: root
    data-password: dbpass
    platform: mysql
    schema: classpath:schema.sql
    data: classpath:data.sql
    continue-on-error: true
    initialization-mode: always
```
Comment From: xcbeyond
@dsyer @wilkinsona Have you solved this problem? If not, can I only use Spring Cloud 1.5?
Comment From: wilkinsona
@xcbeyond As noted above, a workaround has been implemented in Spring Cloud.
Comment From: wilkinsona
the blurring of properties used to create the DataSource bean and properties that are used purely for initialisation
We'd like to fix this by introducing an init sub-group to collect together the properties that are specifically to do with initialisation.
differences in when initialisation will actually create a whole new DataSource and when it will reuse one, etc.
This occurs when a specific username and/or password is provided for the initialisation of the DataSource. When it occurs, a new DataSource is created with the default settings. This can lead to a pool with many connections being created purely for initialisation. We'd like to address this by tuning any pool that's created specifically for initialisation to have a more sensible number of connections (probably 1).
moving to a declarative @InitializeDataSource-based approach
This approach would, even when replacing a bean, allow us to know if the user wanted the datasource to be initialised. The annotation could have an attribute that specifies the prefix for the properties that configure datasource initialisation. These same properties could be used to apply the initialisation to an auto-configured test database.
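To make the idea concrete, a purely hypothetical sketch of such an annotation (neither the annotation nor its attribute exists in Spring Boot) could look like this:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical: marks a DataSource @Bean method as one that should be initialised, and names
// the property prefix (scripts, credentials, mode) that drives that initialisation.
@Target({ ElementType.METHOD, ElementType.TYPE })
@Retention(RetentionPolicy.RUNTIME)
@Documented
@interface InitializeDataSource {

    String prefix() default "spring.datasource";

}
```

Usage would then be something along the lines of annotating the dataSource @Bean method with @InitializeDataSource(prefix = "app.datasource"), with the same properties reused when an auto-configured test database replaces the bean.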
Comment From: samo-esure
I'm able to work around this issue by specifying a management.server.port different from the server.port.
Comment From: klopfdreh
Hey all,
we are also facing this issue and found out that, besides the information already mentioned, Spring Boot 2.0 uses org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator, which tries to autowire the javax.sql.DataSource map containing the datasource that is not yet marked as initialized. As a workaround, instead of changing the management port, you could also consider deactivating the DB health indicator:
management.health.db.enabled=false
Afterwards you can define your own health checks with the org.springframework.boot.actuate.health.HealthIndicator interface and test your datasources yourself.
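A minimal sketch of such a custom indicator (illustrative names; see the edit below for a caveat about injecting the datasources):

```java
import javax.sql.DataSource;

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Component;

// Runs a trivial validation query against the injected DataSource and reports up/down.
@Component
class CustomDataSourceHealthIndicator implements HealthIndicator {

    private final JdbcTemplate jdbcTemplate;

    CustomDataSourceHealthIndicator(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    @Override
    public Health health() {
        try {
            this.jdbcTemplate.execute("SELECT 1");
            return Health.up().build();
        }
        catch (Exception ex) {
            return Health.down(ex).build();
        }
    }

}
```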
Edit: The custom HealthIndicator approach is also not working because of the injection of the datasources.
Hope to see the issue fixed soon.
Comment From: wilkinsona
@klopfdreh Thanks for the information. This is the first time that Actuator has apparently been involved in this issue. Could you share a sample that reproduces the problem you are describing? I'd like to see if it's a variant of one of the problems described above, or another case that we haven't seen before.
Comment From: klopfdreh
Hi @wilkinsona - as mentioned in the original issue description and the linked issue spring-cloud/spring-cloud-commons#355, you can see that the stack trace contains actuator endpoint entries, e.g.:
org.springframework.boot.actuate.autoconfigure.jdbc.DataSourceHealthIndicatorAutoConfiguration.getValidationQuery(DataSourceHealthIndicatorAutoConfiguration.java:115)
and the following lines.
Sadly I can't provide the source code, but I just wanted to give the hint about the actuator.
Comment From: wilkinsona
@klopfdreh What version of Spring Cloud are you using? That part of the problem should have been fixed already by https://github.com/spring-cloud/spring-cloud-commons/commit/0e259eab355be0bff2791715321bdab828cf0422.
Comment From: klopfdreh
@wilkinsona - no Spring Cloud, just Spring Boot 2.1.x, but the actuator endpoint is also affected there. It could be a project-related issue. If I have more information that I can share, I will do so.
Comment From: wilkinsona
Thanks. Are you doing anything with your own custom scopes or prototype beans? IIRC, the various examples listed above all require one or the other for the problem to occur. You may be facing the same problem as described in this issue, or it may be something else. We've yet to see Actuator be the cause of the problem and I suspect it's not in your case and that what you're seeing is just a symptom.
Unfortunately, we won't be able to do any more than speculate without having some code to look at so fingers-crossed that you're able to share something. It doesn't have to be your exact code, just enough to reproduce the problem that you're seeing.
Comment From: klopfdreh
I just tried to update to Spring Boot 2.1.5 but I am facing the same error.
I just had a look at the initialization of the custom DataSource bean. The bean itself is created within a class using @Bean and checked against the same bean name (method name) with @ConditionalOnMissingBean. The class itself is a @Configuration with @ConditionalOnProperty.
In this case I guess the default scope is used (singleton) and it is not a prototype bean.
All in all there is no specific scope handling introduced.
I know that you can't do anything more without a stack trace, but I just wanted to provide this information about the actuator endpoint and the fact that this is not only occurring in Spring Cloud.
This is only an assumption, but is there any specific handling for datasources during bean creation? This error seems to occur only with an additional DataSource provided as a bean, and in this particular scenario with the actuator endpoint, because @samo-esure also mentioned that changing the management port works (this also worked for me) - the Spring Cloud issue also listed this in the stack. Disabling the health endpoint or the db indicator also worked.
So I assume that this is an issue around the DataSource injection within the actuator implementation or an issue related to the startup of this Spring module.
Comment From: klopfdreh
@wilkinsona - we found the issue and it is really hard to explain in detail. Basically, the health endpoint was causing the DataSource bean to be created much earlier than it would be otherwise. Because of that shift in the order of bean creation, the configuration class in which the datasource was created was invoked a bit earlier and a setter (also autowired with required = false) was resolved. Finally, this setter was causing the circular dependency.
So the error was misleading and might also mislead in other scenarios to the assumption that the actuator health endpoint is the issue.
So my advice to devs facing this issue - for those who have not defined custom scopes - would be to have a look at the context in which a custom datasource is created and search for circular dependencies in that area (especially with optional beans).
Comment From: jebeaudet
We just ran into this while trying to use org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource with some custom logic to decide which datasource each query will hit. I've put up a small sample here.
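For context, the routing-DataSource pattern in question looks roughly like this (an illustrative sketch, not the linked sample):

```java
import java.util.HashMap;
import java.util.Map;

import javax.sql.DataSource;

import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

// Routes each query to a target DataSource based on some per-request key, here a simple ThreadLocal.
class TenantRoutingDataSource extends AbstractRoutingDataSource {

    static final ThreadLocal<String> CURRENT_TENANT = ThreadLocal.withInitial(() -> "default");

    TenantRoutingDataSource(DataSource defaultDataSource, Map<Object, Object> targetsByTenant) {
        setDefaultTargetDataSource(defaultDataSource);
        setTargetDataSources(new HashMap<>(targetsByTenant));
    }

    @Override
    protected Object determineCurrentLookupKey() {
        // the returned key selects the target DataSource for the current call
        return CURRENT_TENANT.get();
    }

}
```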
There are 2 configuration classes, ActualConditionsWeRanInto and DeadSimpleReproducingConditions; just enable either one of them to reproduce the problem.
The key here, I believe, is the @Primary annotation: the DataSourceInitializerInvoker gets invoked on the first DataSource created (not the primary one) and tries to create the primary instance in the ObjectProvider<DataSource>#getIfUnique method invocation, creating the cycle.
Our fix for now is to disable the DataSourceAutoConfiguration class, but I'd rather avoid that; is there anything else we could do? We don't use initialization, and the spring.datasource.initializationMode property gets checked after the problem occurs, so setting it to NEVER has no impact.
Comment From: wilkinsona
This is noteworthy as the release notes should mention the new spring.datasource.initialization-order property that controls whether initialization is performed before or after JPA.