Skip to content

Regression in 1.4.1: path_and_query rejects authority-form and empty inputs #839

@zrbecker

Description

@zrbecker

Summary

PR #826, shipped in 1.4.1, tightened the validator inside PathAndQuery's constructors so that inputs not starting with /, ?, or # are rejected, and empty strings are rejected separately. This is a behavior change for inputs that have been accepted since 1.0.

Reproduction

http::Uri::builder().path_and_query("127.0.0.1:8080").build()
// 1.4.0: Ok(127.0.0.1:8080)
// 1.4.1: Err(http::Error(InvalidUri(PathDoesNotStartWithSlash)))

http::uri::PathAndQuery::try_from("")
// 1.4.0: Ok(/)
// 1.4.1: Err(InvalidUri(Empty))

Impact

The most visible breakage is in hyperium/hyper v1.9.0, whose CONNECT tests (client_connect_method and client_connect_method_with_absolute_uri) now fail. That, in turn, is what's surfacing the regression in tokio's CI (example run), since tokio runs hyper's test suite as a downstream integration check. Any other crate that constructs CONNECT request-targets through Uri::builder().path_and_query(...) is likely to be affected as well.

Discussion

Hyper's use of path_and_query to hold a CONNECT authority-form (host:port) is technically a stretch: the documentation only shows /foo?bar-style examples. There are cleaner ways to produce the same Uri shape today (Uri::builder().authority("host:port") or "host:port".parse::<Uri>(), both of which put the bytes in the authority slot), but hyper has been using path_and_query since 1.0, and downstream code depends on that behavior whether the docs sanction it or not. Other crates may have followed the same pattern; hyper is just the most visible example.

The empty-string case is harder to defend on principle. RFC 3986's path-abempty production explicitly permits an empty path component after an authority, so PathAndQuery::try_from("") returning Ok is consistent with the spec.

There's also a deeper design tension worth flagging. Uri::builder().path_and_query("localhost").build() and "localhost".parse::<Uri>() produce different Uri values today: the builder stores "localhost" in the path, the parser stores it in the authority. That ambiguity is fair grounds for not allowing authority-shaped inputs in path_and_query at all, which is what #826 leans toward. But the implementation has accepted them since 1.0 and hyper has built on that, so tightening PathAndQuery breaks the ecosystem without resolving the ambiguity itself.

It's also worth flagging that this is a runtime behavior change for previously-valid inputs, shipped in a patch release. Strictly under SemVer, rejecting inputs that previously succeeded is a breaking change and warrants a major version bump. That is also what would actually have protected downstreams: anyone pinning http = "1" (the default caret requirement) picks up any 1.x release automatically, so even a minor bump would have propagated the same breakage. Only a 2.0 keeps existing pins safe.

Suggested fix

Short-term: yank 1.4.1, or ship 1.4.2 that reverts #826.

Longer term, hyper should be using Uri::builder().authority("host:port") for CONNECT instead of path_and_query. That API already exists and does the right thing. Once hyper migrates, path_and_query can be tightened in a 2.0, if semvar semantics is desired. Silently prepending / is tempting since Display already does it, but doing it at storage time changes .as_str() and .path(), which is its own backward-compat risk.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions