Expected Behavior
When using ClientRegistrations.fromIssuerLocation("http://192.0.2.0") or any non-responding address[1], the http connect timeout should be "reasonable", in seconds so that users get quick feedback when a issuerUri just doesn't respond. Reasonable may be somewhere between 5 and 10s.
Or the ClientRegistrations.rest RestTemplate could be configurable.
Current Behavior
The current timeout is whatever the default timeout is for RestTemplate / underlying HTTP client implementation. In a "default" Boot application, that timeout is 75s. This means the application will hang for 75 seconds, and Tomcat won't even start up.
That is too long in certain contexts, e.g. a Boot app deployed to Kubernetes. The app would take 75 seconds to respond, which could be longer than the user's readiness probe - the pod would get thrashed without giving useful information to the user.
Context
We have seen this issue in k8s deployments, where our pods would crash because a DNS record was (incorrectly) changed and an OIDC issuer did not respond anymore.
Sample
A Boot app with OAuth2 login enabled, and the following config:
spring:
security:
oauth2:
client:
registration:
default:
client-id: "foo"
client-secret: "bar"
provider: my-provider
provider:
my-provider:
issuer-uri: https://192.0.2.0 # or any other non-responding IP
This can be reproduced programmatically by calling ClientRegistrations.fromIssuerLocation("http://192.0.2.0");
Happy to contribute a PR.
Footnotes
[1] Used 192.0.2.0 as RFC5737 - IPv4 Address Blocks Reserved for Documentation, works as an example for MacOS but I haven't tried on other systems.
Comment From: sjohnr
Hi @Kehrlann! I think this is closely related to gh-8882, which aims to solve this for a variety of components in the OAuth 2.0 landscape.
See ClientRegistrations section, which states:
NOTE: The underlying HTTP Client used in ClientRegistrations was purposely encapsulated and there is no plan to expose it.
So it does seem that while there are a number of components that may require customizing the RestTemplate, in this particular case it has been decided it would not be addressed. There are a lot of issues (many closed as duplicate or for other reasons) that could be researched to put together the consensus on this, but I think a fair summary would be:
ClientRegistrations.fromIssuerLocation is a convenience around configuring a handful of dynamic properties explicitly. If timeouts and other network/infrastructure concerns make it impractical to rely on for configuration, manual configuration is still fairly practical.
Based on a quick check the code, it seems there are only something like 5 dynamic properties: clientAuthenticationMethod, authorizationUri, tokenUri, jwkSetUri and sometimes userInfoUri. (The issuerUri is obviously set by the value you provide already.) Is it infeasible in your environment to set these properties in a more reliable way that doesn't rely on this utility method? For example, do the values tend to change from host to host often enough that a static utility method based on the issuer you already provide falls apart quickly? Any other ideas for a workaround?
Comment From: Kehrlann
Not exposing the underlying http client makes sense 👌
In the general case, I think the k8s-based case I mentioned above can be a real problem for users, as something that works on your local dev machine may have different network config in your target k8s cluster, and so you may end up with a difficult to debug situation.
Spring Boot uses that utility class for autoconfig, and so I think the general UX would be better (for edge cases) with smaller timeouts. spring-security can decide what a reasonable timeout should be (I don't have an informed opinion other than "75s is too long").
For example, do the values tend to change from host to host often enough that a static utility method based on the issuer you already provide falls apart quickly? Any other ideas for a workaround?
We use spring-security + spring-authorization-server to build a federated authorization server, that users can configure on their own upstream identity-providers. It is packaged as a k8s deployment, that users supply config to. So we can't know those properties in advance.
We do want to provide it as a configuration mechanism, rather than asking our users to spell out all of the data:
- our users, who may not be very oauth2-savvy, only have to find the URL that ends with /.well-known/openid-configuration
- it makes the integration guides we may write smaller (e.g., to integrate with Google do this and put issuer-uri: accounts.google.com/.well-known/openid-configuration, for okta do that and put issuer-uri: https://YOUR-TENANT.okta.com/oauth2/default/.well-known/openid-configuration)
Our solutions with the current setup would be to get inspiration from the OIDC auto-config in spring-security and reimplement ClientRegistrations#oidc.
Comment From: sjohnr
Thanks @Kehrlann, I see your point. I only worry about how to manage a decision like this among many other parts of the code base that utilize RestTemplate or WebClient. Would we need to apply this on a wider scale?
If you want to take a crack at a PR to adjust the timeouts I don’t think it can hurt to see what it looks like. Do you want to try your hand at it?
Comment From: Kehrlann
@sjohnr here's the PR - #11232 . I have not added tests - I am unsure where they would go, and I'm unsure how to tweak the values in the test so that they fail faster than 10s.
I've looked into the 75s timeout - the actual timeout declared in Java code is 0, "infinite" - I guess the 75s timeout I experienced was an OS-level socket timeout, see:
$ curl 192.0.2.0
curl: (28) Failed to connect to 192.0.2.0 port 80 after 75005 ms: Operation timed out
$ sysctl net.inet.tcp.keepinit # ha-ha, here it is
net.inet.tcp.keepinit: 75000
Note: since the default rest template uses URLConnection under the hood, timeouts can be controlled through the sun.net.client.defaultConnectTimeout system property but I am unsure how wide-ranging the implications would be on a simple app. I've validated it works with -Dsun.net.client.defaultConnectTimeout=5000.
Comment From: jgrandja
@Kehrlann Please see this comment as this may help your scenario. Any feedback would be greatly appreciated.
Comment From: Kehrlann
This works perfectly for my use-case, thanks! Closing this in favor of possible changes outlined in #8882 .