Background
I'm developing an event-sourcing library/framework called Occurrent that uses Spring/Boot in some of its modules. I'm now trying to add annotation support for subscriptions by leveraging the Spring eco-system. For example:
@Subscription(id="mySubscription")
void handleMySubscription(MyDomainEvent e) {
..
}
You can also specify a past time when streaming of events should occur, for example:
@Subscription(id="mySubscription", startAt = BEGINNING_OF_TIME, waitUntilStarted=TRUE)
void handleMySubscription(MyDomainEvent e) {
..
}
This will stream all events from the beginning of time and then continue subscribing to new ones as they come, and in this example, we've also specified waitUntilStarted=TRUE
which means that all historic events should be streamed before the Spring Boot application starts serving requests.
And here lies the problem. I've implemented support for the annotation by using a BeanPostProcessor
, you can find the implementation in org.occurrent.springboot.mongo.blocking.OccurrentAnnotationBeanPostProcessor. When the code detects that waitUntilStarted
is true
(see line 179), it calls the code that starts streaming old events. The code of interest starts in SpringMongoEventStore
at line 373 which performs this query:
mongoTemplate.stream(query.with(sort), Document.class, eventStoreCollectionName)
Now, deep down in Spring Boot in org.springframework.beans.factory.support.DefaultSingletonBeanRegistry
, at line 214 (spring-beans-6.1.6-sources.jar!/org/springframework/beans/factory/support/DefaultSingletonBeanRegistry.java:214), in the public Object getSingleton(String beanName, ObjectFactory<?> singletonFactory)
method there's a synchronized block:
public Object getSingleton(String beanName, ObjectFactory<?> singletonFactory) {
Assert.notNull(beanName, "Bean name must not be null");
synchronized (this.singletonObjects) {
Object singletonObject = this.singletonObjects.get(beanName);
...
}
And here my application is stuck forever. This happens when getSingleton
is called for a bean named "springApplicationAdminRegistrar".
I get the same problem with Spring Boot 3.2.5, 3.2.6, and 3.3.0.
Reproduce
You can reproduce this by cloning the Occurrent repository (git clone git@github.com:johanhaleby/occurrent.git
), checking out the issue/spring-boot-deadlock
branch, and then running the main method in org.occurrent.example.domain.numberguessinggame.mongodb.spring.blocking.Bootstrap
from e.g. Intellij. Note you need to start MongoDB locally before you run the class. What will then happen is that the following log will be printed:
2024-05-25T14:23:50.330+02:00 INFO 4289 --- [number-guessing-game] [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port 8080 (http)
2024-05-25T14:23:50.338+02:00 INFO 4289 --- [number-guessing-game] [ main] o.apache.catalina.core.StandardService : Starting service [Tomcat]
2024-05-25T14:23:50.338+02:00 INFO 4289 --- [number-guessing-game] [ main] o.apache.catalina.core.StandardEngine : Starting Servlet engine: [Apache Tomcat/10.1.24]
2024-05-25T14:23:50.369+02:00 INFO 4289 --- [number-guessing-game] [ main] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
2024-05-25T14:23:50.370+02:00 INFO 4289 --- [number-guessing-game] [ main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 3572 ms
2024-05-25T14:23:50.527+02:00 INFO 4289 --- [number-guessing-game] [ main] org.mongodb.driver.client : MongoClient with metadata {"driver": {"name": "mongo-java-driver|sync|spring-boot", "version": "5.1.0"}, "os": {"type": "Darwin", "name": "Mac OS X", "architecture": "aarch64", "version": "14.5"}, "platform": "Java/Eclipse Adoptium/17.0.6+10"} created with settings MongoClientSettings{readPreference=primary, writeConcern=WriteConcern{w=null, wTimeout=null ms, journal=null}, retryWrites=true, retryReads=true, readConcern=ReadConcern{level=null}, credential=null, transportSettings=null, commandListeners=[], codecRegistry=ProvidersCodecRegistry{codecProviders=[ValueCodecProvider{}, BsonValueCodecProvider{}, DBRefCodecProvider{}, DBObjectCodecProvider{}, DocumentCodecProvider{}, CollectionCodecProvider{}, IterableCodecProvider{}, MapCodecProvider{}, GeoJsonCodecProvider{}, GridFSFileCodecProvider{}, Jsr310CodecProvider{}, JsonObjectCodecProvider{}, BsonCodecProvider{}, EnumCodecProvider{}, com.mongodb.client.model.mql.ExpressionCodecProvider@44592c39, com.mongodb.Jep395RecordCodecProvider@2a87ba34, com.mongodb.KotlinCodecProvider@34d480b9]}, loggerSettings=LoggerSettings{maxDocumentLength=1000}, clusterSettings={hosts=[localhost:27017], srvServiceName=mongodb, mode=SINGLE, requiredClusterType=UNKNOWN, requiredReplicaSetName='null', serverSelector='null', clusterListeners='[]', serverSelectionTimeout='30000 ms', localThreshold='15 ms'}, socketSettings=SocketSettings{connectTimeoutMS=10000, readTimeoutMS=0, receiveBufferSize=0, proxySettings=ProxySettings{host=null, port=null, username=null, password=null}}, heartbeatSocketSettings=SocketSettings{connectTimeoutMS=10000, readTimeoutMS=10000, receiveBufferSize=0, proxySettings=ProxySettings{host=null, port=null, username=null, password=null}}, connectionPoolSettings=ConnectionPoolSettings{maxSize=100, minSize=0, maxWaitTimeMS=120000, maxConnectionLifeTimeMS=0, maxConnectionIdleTimeMS=0, maintenanceInitialDelayMS=0, maintenanceFrequencyMS=60000, connectionPoolListeners=[], maxConnecting=2}, serverSettings=ServerSettings{heartbeatFrequencyMS=10000, minHeartbeatFrequencyMS=500, serverMonitoringMode=AUTO, serverListeners='[]', serverMonitorListeners='[]'}, sslSettings=SslSettings{enabled=false, invalidHostNameAllowed=false, context=null}, applicationName='null', compressorList=[], uuidRepresentation=JAVA_LEGACY, serverApi=null, autoEncryptionSettings=null, dnsClient=null, inetAddressResolver=null, contextProvider=null}
2024-05-25T14:23:50.536+02:00 INFO 4289 --- [number-guessing-game] [localhost:27017] org.mongodb.driver.cluster : Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=REPLICA_SET_PRIMARY, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=17, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=13759917, setName='rs0', canonicalAddress=localhost:27017, hosts=[localhost:27017], passives=[], arbiters=[], primary='localhost:27017', tagSet=TagSet{[]}, electionId=7fffffff000000000000006c, setVersion=1, topologyVersion=TopologyVersion{processId=66509f38f027211792d092a2, counter=6}, lastWriteDate=Sat May 25 14:23:48 CEST 2024, lastUpdateTimeNanos=82113709085375}
And then the application hangs in the synchronized block mentioned above. If everything was working, you should have been able to navigate to http://localhost:8080/games
.
While creating this issue I wanted to make it easier to reproduce, so I created a TestBoostrap
( in package org.occurrent.example.domain.numberguessinggame.mongodb.spring.blocking
) that uses test containers to boot the application so that you don't need to start MongoDB externally. But the strange thing is that if I run the main method from TestBootstrap
, the deadlock doesn't occur and you can navigate to http://localhost:8080/games
. Could it be a timing issue? If so, it's very deterministic.
Am I doing something wrong or could this be a potential bug?
Comment From: jhoeller
Support for concurrent initialization is limited in Spring up until 6.1. Due to the strong singleton lock, any attempt to do concurrent initialization while waiting for the result in a singleton-locked method has a risk for a deadlock: e.g. starting a thread in a singleton init method and then joining it in the very same method, with the concurrent thread also trying to trigger the creation of a singleton in the container.
As of 6.2, the container comes with a lenient locking which allows for a broader set of scenarios for concurrent initialization: https://github.com/spring-projects/spring-framework/issues/23501
Before 6.2, I'm afraid this is a known limitation of the locking approach used there. You can try to pre-initialize any dependencies before your concurrent thread tries to obtain beans that currently have yet to be created, turning that asynchronous getBean
call into a lookup for an existing bean. This can happen via @DependsOn
or simply through calling getBean
for those dependencies in the main bootstrap thread before starting any asynchronous steps from there.