Spring PathPatternRouteMatcher - Possible Minor Memory Leak

Affects: v5.3.0

The PathPatternRouteMatcher appears to utilize the below cache to optimize the parsing of Routes:

https://github.com/spring-projects/spring-framework/blob/6825287360e7d881066b7fbd16ab7e970985c225/spring-web/src/main/java/org/springframework/web/util/pattern/PathPatternRouteMatcher.java#L40

Given the cache is implemented as a ConcurrentHashMap it appears that there are no bounds on this. Furthermore, there is no way to define an eviction policy (e.g. LRU) so, as far as I can see, the Map will continue to grow for the life of the JVM, particularly in use cases where there are large numbers of distinct patterns being matched.

I acknowledge that the items in this Map will be small in size so this may represent a minor issue but perhaps defining a (customisable?) eviction strategy would be a useful improvement?

Comment From: bclozel

@phil-applegate did you encounter a case where the memory grows a lot? Are you using the PathPatternRouteMatcher independently of the web infrastructure - could you tell us more about that use case? If you're using it with regular web endpoints, could you tell us more about the number of endpoints so we can think about this?

We've got a LRU-implementation variant in Spring Framework already, and we're using it in places where external inputs are considered (we want to avoid OOM/DoS issues). In this case, we're only considering application inputs. I'm wondering if paying the cost (the LRU variant uses locks) is worth it: all applications would pay that cost for no added benefit in most cases. We can certainly measure the performance difference between the two implementations; note that this is part of a very hot code path and any performance degradation will probably be noticed by the community.

Comment From: phil-applegate

@bclozel thanks for your reply. I haven't encountered a case where the memory grows yet as I'm currently investigating design options for our use case. As part of this, I was having a look at the implementation of PathPatternRouteMatcher to see if it's suitable and consequently came across this issue.

I'm investigating using the PathPatternRouteMatcher independently of the web infrastructure. The use case is to modify outgoing STOMP Messages based on the the destination and the user associated with the subscription. Therefore the PathPatternRouteMatcher could be used in a ChannelInterceptor on the clientOutboundChannel to match a destination against a topic pattern to determine which modifications should be performed.

Given the destinations could contain date parameters (PathVariable), the number of entries in this cache could feasibly grow unbounded (albeit slowly) over time hence raising this issue. Your comment makes sense where the added cost of locks may not give significant gains in the majority of use cases.

Is the design intent with cache in PathPatternRouteMatcher such that the growth, whilst unbounded, will be small enough not to cause any significant performance degradation?

Comment From: bclozel

The PathPatternRouteMatcher is meant to be given routes as arguments, nothing else. This means it's supposed to be fed the routes defined in your source code (as annotations on controller handlers or functional routes), not the actual request paths.

In the case you're mentioning, a route containing a dynamic date should be like "/meetings/{date}". The cache is merely here to avoid re-parsing the same route over and over. The matching process, i.e. matching "/meetings/2020-11-10" against "/meetings/{date}" is done separately on the compiled PathPattern instance directly.

So I don't understand how an application could end up growing that cache, as it should be fed with a bounded number of routes contained in the source code. It should not be fed with external input.

If, for some reason, you'd like to parse user input as patterns, I'd suggest to use the PathPatternParser directly and manage a custom cache in your implementation, if needed.

Comment From: phil-applegate

Thanks for your detailed responses and for clarifying. My understanding of how the PathpatternRouteMatcher should be used was incorrect. Having looked at PathPattern and PathPatternParser, they will be a better fit.

Closing this issue based on the above.