Hi, i'm the maintainer of HtmlUnit and i like to finally fix the issues #25768 and https://github.com/HtmlUnit/htmlunit/issues/223.
Did some analysis during the last day's - sorry for the long text here but i like to make my view on this clear and hopefully we can discuss and find a good solution. Did my analysis based on the sample here - https://github.com/thuri/htmlunit-test-project/tree/JQueryAjax
Lets start with the WebRequest class from HtmlUnit: * this is more or less an internal class * the api of that class is really bad designed ** parameter and body are exclusive - you can only have one ** the usage of the class requires some internal knowledge of how the request got filled ** many more
Now the problem - from my point of view: org.springframework.test.web.servlet.htmlunit.HtmlUnitRequestBuilder.params(MockHttpServletRequest, UriComponents) only takes parts of the (internal) logic of WebRequest into account; at the moment only the url and the parameters are checked when reading the parameters. There are more cases where the parameters are url-encoded in the body - these cases are the reason for the misbehavior.
Possible solutions Option 1: To fix the problem we can improve the HtmlUnitRequestBuilder.params implementation to reflect the whole functionality used by HtmlUnit But i think this is not really maintainable - every time HtmlUnit changes parts of the internal magic here you have to fix something.
Option 2 (my preferred one): We agree on a more stable interface for WebRequest that hides the internal parameter handling and always returns all the parameter (key value pairs) like e.g. the servlet api does. HtmlUnitRequestBuilder can use this method and gets always all params.
I have already prototyped this solution (breaking HtmlUnits backwardcomatibility) by changing the com.gargoylesoftware.htmlunit.WebRequest.getRequestParameters() method in the way described above (the impl is available as 2.61.0-SNAPSHOT). This fixes the problems but introduces a new one because the url paramters are doubled now (because HtmlUnitRequestBuilder.params(MockHttpServletRequest, UriComponents) handles this params already).
Because of this i like to get some feedback from you how we should proceed here.
Or to summarize * i like to change HtmlUnit WebRequest to provide an API that returns always all request parameters (at best without breaking backward compatibility) * you change the impl of HtmlUnitRequestBuilder.params() to use only this method
Many thanks Ronald (rbri at rbri.de)
Comment From: rstoyanchev
@rbri thanks for the analysis and proposals. Yes, it would be great to encapsulate this better within HtmlUnit and we'll work with you to use the new API. Ideally, we'd be able to detect the new API and use it conditionally for now, through reflection, in order to avoid a hard requirement for the HtmUnit version and allow a more lenient upgrade.
I'll re-purpose this issue to explore this change. Please, let us know when you have something we can experiment with.
Comment From: rbri
@rstoyanchev sounds great, will update the current snapshot build and introduce a new method. Then you can check for the existence of this public method. OK?
Comment From: rbri
Have made a new snapshot build - 2.61.0-SNAPSHOT. There is a new method
com.gargoylesoftware.htmlunit.WebRequest.getParameters()
public List<NameValuePair> getParameters()
This method returns always all parameters used bei HtmlUnit when executing the web request (at least if i have done everything right).
From my point of view you have to skip the processing of uriComponents.getQueryParams() if you use the new method. The second loop on this.webRequest.getRequestParameters() has to be replaced by this.webRequest.getParameters().
@rstoyanchev Please try and report any findings / incompatibilities. Will try to write some more unit tests to prove my impl.
Comment From: rbri
@rstoyanchev any news here? i plan a new release for the weekend and like to know if this is working
Comment From: rbri
HtmlUnit Version 2.61.0 is now available.
Comment From: rbri
Hi @bclozel, @rstoyanchev, anything i can do to move this forward (other than some friendly reminders like this)?
Comment From: bclozel
Sorry for the radio silence @rbri, we were pretty busy with the upcoming major versions and recent CVEs.
I've tested the new API and its usage in Spring Framework in a reflective fashion. So far two pieces of feedback:
- this forces us to avoid using the
setRequestParameters
API in our tests, which might be a good thing anyway - we're seeing a subtle change of behavior when we're using the new API. With the old API, both
"https://example.com/example/?name="
and"https://example.com/example/?name"
would give a keyvaluepair"name"
,""
. With the new API,"https://example.com/example/?name"
gives a keyvaluepair"name"
,null
. Not a huge behavior change, but something we noticed in our tests. Any comment on that?
Other than that, I think we could ship this change in the next 5.3.x release. Judging from the changes, we wouldn't really need to enforce an HtmlUnit 2.61+ requirement in Spring Framework 6.
Comment From: rbri
this forces us to avoid using the setRequestParameters API in our tests, which might be a good thing anyway
From my point of view this is an error in my current implementation. Sending a get request having requestparameters set results in an updated (overwritten query part) url for the get request. Have fixed this.
Comment From: rbri
we're seeing a subtle change of behavior when we're using the new API
So far my understanding from your impl is: * spring shortcuts the server roundtrip; instead of sending the request out, spring intercepts the request and uses the request as input for the server side spring api. So this way misses the conversation from the HtmlUnit WebRequest into the HttpClient WebRequest, the conversation from the HttpClientWebRequest into the real HttpRequest on the wire and finally the conversation back from the bytes on the wire into the ServletWebRequest.
I'm correct here?
What we try now is to provide a method (getParameters()) that produces the same parameters as we get when going th long way. Have to think a bit about a test setup here for this.
Comment From: rbri
Have written some tests and it looks like normalizing null to empty strings is a good way. Will make a new snapshot build and inform you.
Comment From: rbri
@bclozel made a new snapshot build - please try
Comment From: rbri
Again more than a month gone without feedback. The HtmlUnit changes are already released.
Comment From: rbri
@bclozel any news - anything i can do to move this forward?
Comment From: bclozel
Given the 5.3.x timeline, I've moved this enhancement to the next 6.0 milestone. We'll raise the HtmlUnit baseline to 2.64.0. Thanks @rbri !