Affects: 5.2.5

If a query value contains square brackets, the URI is not completely encoded, yet the URI implementation (java.net.URI) accepts it.

The problem is with the build(boolean) method, which behaves inconsistently.

If the parameter is true, it validates that the URI is completely encoded and fails when there are unencoded [ ]. If the parameter is false, it selectively encodes the URI but does not encode [ ], again producing an invalid URI.

Look at this test:

  @Test
  public void encoding() throws URISyntaxException {

    URI uri = new URI("http://example.com/some/path?query=[from%20to]");
    try {
      UriComponentsBuilder.fromUri(uri).build(true);
      fail("It wasn't completely encoded URI");
    } catch (IllegalArgumentException e) {
      //good
    }

    //ok, then encode it
    uri = UriComponentsBuilder.fromUri(uri).build(false).toUri();
    //now it is double encoded http://example.com/some/path?query=[from%2520to]

    //so is it encoded now?
    UriComponentsBuilder.fromUri(uri).build(true); //fail, no it is not, square brackets are there
  }

Whatever you do, it is not possible to use UriComponentsBuilder on the above URI and get a valid URI, or at least the same URI, as a result.

I am not the only one confused: https://github.com/spring-cloud/spring-cloud-gateway/blob/master/spring-cloud-gateway-core/src/main/java/org/springframework/cloud/gateway/filter/RouteToRequestUrlFilter.java#L63


Comment From: eitan613

@molsza I don't think this is a problem. When you pass true to the method, it is supposed to fail: the [ ] are unencoded unsafe characters, and you are telling the method that the URI is already encoded, so it fails the validation check called on line 144 in the HierarchicalUriComponents class (which is called on line 459 of the UriComponents buildInternal method). When you pass false, there is no reason for it to blow up, and it gives you back the URI as is. It isn't supposed to encode it unless you call the encode method.

The double encoding you note (%2520) is explained in issue #18828: "The reason the first encodes is because the constructor of URI encodes "%"". The true flag is supposed to prevent that, but in that case everything should already be encoded, not partially encoded as in your test.

When I ran the following modified version of your test, it passed. I removed the %20 that represented the space character to avoid the double encoding, and I didn't put a literal space in its place because the constructor then throws an exception. If you want the space, you need to encode it, but then you have to encode everything and tell the build method that you did.

    @Test
    public void encoding() throws URISyntaxException {
        URI uri = new URI("http://example.com/some/path?query=[fromto]");
        try {
            UriComponentsBuilder.fromUri(uri).build(true);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); //Invalid character '[' for QUERY_PARAM in "[fromto]"
        }
        //ok, then encode it
        uri = UriComponentsBuilder.fromUri(uri).build(false).encode().toUri();
        assertThat(uri.toString()).isEqualTo("http://example.com/some/path?query=%5Bfromto%5D");

        //so is it encoded now?
        UriComponentsBuilder.fromUri(uri).build(true); //yes it is
    }

Comment From: rstoyanchev

In addition to what @eitan613 explained, when you use UriComponentsBuilder#fromUri you are passing in a fully encoded URI and after that you can only call .build(true) because it is already encoded.

Now why the java.net.URI constructor didn't reject "[" is unclear to me. Furthermore I confirmed that if you call the java.net.URI constructor with individual components (which does encode) it doesn't seem to encode "[" and "]". So that looks like a bug in java.net.URI.
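
The java.net.URI behaviour described above can be checked with the JDK alone. The following is a minimal sketch, not part of the original test: the class name UriBracketDemo and its two helper methods are made up for illustration, and the host and path are the ones used earlier in this thread.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriBracketDemo {

    // Builds the URI from individual components. This constructor quotes
    // illegal characters such as the space, but leaves '[' and ']' alone.
    static String rawQueryFromComponents() {
        try {
            URI uri = new URI("http", "example.com", "/some/path", "query=[from to]", null);
            return uri.getRawQuery();
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses an already assembled string. The brackets are accepted as-is
    // and the existing %20 escape is kept unchanged.
    static String rawQueryFromString() {
        return URI.create("http://example.com/some/path?query=[from%20to]").getRawQuery();
    }

    public static void main(String[] args) {
        System.out.println(rawQueryFromComponents());
        System.out.println(rawQueryFromString());
    }
}
```

In both cases the raw query still contains the literal brackets, which is why a later build(true) rejects the result.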

If starting with a String, you can pass that directly into UriComponentsBuilder and then call .encode(). So this works:

String s = "http://example.com/some/path?query=[from to]";
URI uri = UriComponentsBuilder.fromUriString(s).build().encode().toUri();
System.out.println(uri);
// http://example.com/some/path?query=%5Bfrom%20to%5D

Comment From: x0zh

Based on the above, I think org.springframework.web.util.HierarchicalUriComponents#toUri should not return java.net.URI directly, because Spring implements RFC 3986 but java.net.URI implements RFC 2396.

// java version 1.8.0_292-b10
// spring version: 5.3.7
// org.springframework.web.util.HierarchicalUriComponents.Type#QUERY_PARAM
QUERY_PARAM {
    @Override
    public boolean isAllowed(int c) {
        if ('=' == c || '&' == c) {
            return false;
        }
        else {
            return isPchar(c) || '/' == c || '?' == c;
        }
    }
}

/**
 * Indicates whether the given character is in the {@code pchar} set.
 * @see <a href="https://www.ietf.org/rfc/rfc3986.txt">RFC 3986, appendix A</a>
 */
protected boolean isPchar(int c) {
    return (isUnreserved(c) || isSubDelimiter(c) || ':' == c || '@' == c);
}