hi,

was wondering, is there a reason that path segments are trimmed in HierarchicalUriComponents.FullPathComponent?

https://github.com/spring-projects/spring-framework/blob/aaf33100d9fdfa11917f76048a71ca21f3eace3b/spring-web/src/main/java/org/springframework/web/util/HierarchicalUriComponents.java#L900

for example, the following test fails:

    @Test
    void pathWithLeadingSpace() {
        String theUrl = "https://path.com/seg1/seg2/ seg-with-leading-space/file.png";
        UriComponents uriComponents = UriComponentsBuilder.fromUriString(theUrl).build();
        assertThat(uriComponents.getPathSegments()).hasSameElementsAs(
                List.of("seg1", "seg2", " seg-with-leading-space", "file.png")
        );
    }

mind you, .toUri() still returns the correct thing (https://path.com/seg1/seg2/%20seg-with-leading-space/file.png), it's just that it seems weird to store the "wrong" path components.

it looks like a simple change to the line linked above, such as

    String[] segments = StringUtils.tokenizeToStringArray(getPath(), PATH_DELIMITER_STRING, false, true);

should do the trick.
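
to illustrate what the trim flag changes, here is a rough plain-Java sketch. It is only a stand-in built on java.util.StringTokenizer, not Spring's actual code; the class name PathSegmentSketch and the helper tokenize are made up for this example, and it assumes empty tokens are always dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Stand-in for the tokenizing step; NOT Spring's implementation.
class PathSegmentSketch {

    // Hypothetical helper: splits a path on "/" and optionally trims each token.
    // Tokens that are empty after the optional trim are dropped.
    static List<String> tokenize(String path, boolean trimTokens) {
        List<String> segments = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(path, "/");
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            if (trimTokens) {
                token = token.trim(); // this is where the leading space is lost
            }
            if (!token.isEmpty()) {
                segments.add(token);
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        String path = "/seg1/seg2/ seg-with-leading-space/file.png";
        // Trimming on: the third segment loses its leading space.
        System.out.println(tokenize(path, true));
        // Trimming off: the third segment keeps its leading space.
        System.out.println(tokenize(path, false));
    }
}
```

with trimming off, the third segment comes back as " seg-with-leading-space", which is what the test above expects.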

insights?

thanks,

Comment From: rstoyanchev

The trimming is meant more for building URIs and normalizing the input, and it is applied when you provide a path string. If you provide path segments individually, they are used as is. If you use DefaultUriBuilderFactory (a thin wrapper around UriComponentsBuilder with more control over parsing and encoding modes), you get better defaults for parsing, for example splitting the URL into path segments. The following works:

    @Test
    void pathWithLeadingSpace2() {
        String theUrl = "https://path.com/seg1/seg2/ seg-with-leading-space/file.png";

        DefaultUriBuilderFactory uriBuilderFactory = new DefaultUriBuilderFactory();
        URI uri = uriBuilderFactory.uriString(theUrl).build();

        assertThat(uri.getPath()).isEqualTo("/seg1/seg2/ seg-with-leading-space/file.png");
    }

You can read here for more detail about these types and how they relate. I would also suggest parsing URLs in encoded form, if possible.
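
As a rough illustration of why the encoded form is easier to work with, here is a plain-JDK sketch that splits the raw (still encoded) path first and decodes each segment afterwards, so the leading space arrives as %20 and cannot be mistaken for padding. The class EncodedPathSketch and the helper decodedSegments are hypothetical names for this example only; it uses java.net.URI and URLDecoder rather than any Spring API, and it assumes a hierarchical URL with a non-null path.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustration only; not how Spring parses URLs internally.
class EncodedPathSketch {

    // Hypothetical helper: split the raw path on "/", then decode each segment.
    static List<String> decodedSegments(String encodedUrl) {
        String rawPath = URI.create(encodedUrl).getRawPath();
        List<String> segments = new ArrayList<>();
        for (String raw : rawPath.split("/")) {
            if (raw.isEmpty()) {
                continue; // skip the empty token before the leading "/"
            }
            // URLDecoder follows form-encoding rules and would turn '+' into a space,
            // so protect literal '+' characters before decoding.
            String safe = raw.replace("+", "%2B");
            segments.add(URLDecoder.decode(safe, StandardCharsets.UTF_8));
        }
        return segments;
    }

    public static void main(String[] args) {
        String url = "https://path.com/seg1/seg2/%20seg-with-leading-space/file.png";
        // The third segment is decoded with its leading space intact.
        System.out.println(decodedSegments(url));
    }
}
```

Note that URI.create rejects the unencoded URL (with a literal space) with an IllegalArgumentException, which is one more reason to pass the encoded form around.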

I'm closing this for now, as this is expected behavior, but feel free to comment.