diff --git a/architecture/client.md b/architecture/client.md index be79a1a..73c446e 100644 --- a/architecture/client.md +++ b/architecture/client.md @@ -15,3 +15,18 @@ The async middleware surface uses the `Async*`/`async_*` prefix, aligning with h ## Streaming `AsyncClient.stream()` provides a context-manager API for chunked response bodies. It bypasses the middleware chain by design. + +## Proxy environment (`trust_env`) + +`httpware` wraps `httpx2.Client` / `httpx2.AsyncClient`, which default to `trust_env=True`. The `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY` environment variables and `.netrc` credentials are therefore honored by default — no httpware behavior to configure. To opt out, supply an explicit httpx2 client: + +```python +Client(httpx2_client=httpx2.Client(trust_env=False)) +AsyncClient(httpx2_client=httpx2.AsyncClient(trust_env=False)) +``` + +## Bounded error bodies (`max_error_body_bytes`) + +Both `Client` and `AsyncClient` accept `max_error_body_bytes: int | None = None`. The default (`None`) is backward-compatible: error bodies are read without a size limit. + +When set, `stream()` raises `ResponseTooLargeError` on a 4xx/5xx response whose declared `Content-Length` header exceeds the cap — before the body is read. Responses without a declared `Content-Length` (chunked transfer) are still read unbounded: a hard mid-read cap would require httpx2 private API, which this project forbids. diff --git a/architecture/errors.md b/architecture/errors.md index 7bff69b..c516bb5 100644 --- a/architecture/errors.md +++ b/architecture/errors.md @@ -8,7 +8,7 @@ exc.response.status_code # 404 exc.response.request.url # URL of the failed request ``` -`__repr__` and the `str()` summary strip `user:pass@` userinfo from `response.request.url` to avoid leaking credentials in tracebacks. Query-string secrets are not stripped here. 
+`__repr__` and the `str()` summary redact URL userinfo (`user:pass@`) and mask the values of known-sensitive query and fragment parameters (e.g. `token`, `api_key`, `secret`) to avoid leaking credentials in tracebacks. The error-mapping table (what `httpx2` exception maps to which `httpware` exception) lives at the terminal in `src/httpware/client.py`. Status-keyed exceptions are looked up via the `STATUS_TO_EXCEPTION` table in `src/httpware/errors.py`. Unknown 4xx falls back to `ClientStatusError`; unknown 5xx falls back to `ServerStatusError`. @@ -16,4 +16,10 @@ The error-mapping table (what `httpx2` exception maps to which `httpware` except `DecodeError` covers the case where `response_model=` is set, the HTTP call itself succeeded, but the active `ResponseDecoder` raised. The wrap happens at the seam in `Client.send` / `AsyncClient.send` — `except Exception` translates any decoder-side failure into `DecodeError(response=..., model=..., original=...)` with `raise ... from exc` chaining. The `original` attribute exposes the underlying library exception (e.g., `pydantic.ValidationError`, `msgspec.ValidationError`); `__cause__` carries the same reference. -The "no `__init__` override" rule scopes only to `StatusError` subclasses. Non-status `ClientError` subclasses — `DecodeError`, `MissingDecoderError`, `BulkheadFullError`, `RetryBudgetExhaustedError`, `CircuitOpenError` — deliberately define `__init__` with keyword-only fields. +The "no `__init__` override" rule scopes only to `StatusError` subclasses. Non-status `ClientError` subclasses — `DecodeError`, `MissingDecoderError`, `BulkheadFullError`, `RetryBudgetExhaustedError`, `CircuitOpenError`, `ResponseTooLargeError` — deliberately define `__init__` with keyword-only fields. + +`ResponseTooLargeError` is raised from `stream()` when `max_error_body_bytes` is set and a 4xx/5xx response's declared `Content-Length` exceeds the cap. 
It is a non-status `ClientError`; it does not carry a `StatusError`-style positional `response` and is not in `STATUS_TO_EXCEPTION`. + +## Security: request headers are reachable via `exc.response.request` + +`StatusError` holds the raw `httpx2.Response`. Request headers — including `Authorization`, `Cookie`, and `Proxy-Authorization` — remain reachable at `exc.response.request.headers`. httpware masks URL userinfo and known-sensitive query/fragment values in messages and `repr`, but does **not** strip headers. Handler authors must redact before logging or serializing a caught error. diff --git a/planning/changes/active/2026-06-14.03-security-hardening/design.md b/planning/changes/active/2026-06-14.03-security-hardening/design.md new file mode 100644 index 0000000..409131a --- /dev/null +++ b/planning/changes/active/2026-06-14.03-security-hardening/design.md @@ -0,0 +1,191 @@ +--- +status: draft +date: 2026-06-14 +slug: security-hardening +supersedes: null +superseded_by: null +pr: null +outcome: null +--- + +# Design: Security hardening — URL secret redaction + bounded error-body reads + +## Summary + +Close the 2026-06-14 deep-audit **security cluster**: stop leaking +query-string secrets into logs, telemetry, and exception messages; give +callers an opt-in bound on the error-body that `stream()` pre-reads on +4xx/5xx; and document the `trust_env` proxy default and the request-header +reachability of `StatusError`. Report-only audit findings become a single +Full-lane bundle with three independent sections. One finding (a true hard cap +on non-streaming response bodies) is deferred because it needs a Seam-A +dispatch rework; it is recorded in `planning/deferred.md`. 
+ +## Motivation + +From [`audits/2026-06-14-deep-audit.md`](../../../audits/2026-06-14-deep-audit.md): + +- **Secret leakage (3 findings, Low):** every resilience middleware emits + `"url": str(request.url)` into log records and OTel span events, and + `StatusError.__str__`/`__repr__` compose `str(request.url)` after stripping + only `user:pass@` userinfo. `str(request.url)` includes the query string, so + `?api_key=…` / `?access_token=…` tokens land in logs, telemetry, exception + messages, `add_note(...)` text, and Sentry reports. A third, structural item: + `StatusError` holds the raw `httpx2.Response`, so `exc.response.request.headers` + exposes `Authorization`/`Cookie` to any handler. +- **Unbounded error-body buffering (Medium/security):** `AsyncClient.stream()` + and `Client.stream()` call `response.aread()`/`response.read()` on any + 4xx/5xx so `exc.response.content` is populated — with no size limit. A 500 + with a 1 GB body buffers 1 GB even though the caller asked to stream. +- **`trust_env=True` (Nit):** httpware inherits httpx2's default, silently + honoring `HTTP_PROXY`/`HTTPS_PROXY`/`NO_PROXY`/`.netrc` — undocumented. + +## Non-goals + +- No hard cap on the **non-streaming** decode path. For a non-streaming + `send()`, httpx2 buffers the whole body before httpware reaches the decode + seam, so a true bound needs a streaming-with-capped-accumulator rework of the + Seam-A terminal — out of scope here, recorded in `planning/deferred.md`. +- No configurable / caller-extensible sensitive-key set. The built-in set is + fixed (decided in brainstorming) to keep the public surface small. +- No change to `trust_env` behavior — documentation only. +- No stripping of request headers from the stored `Response` — that would break + the `StatusError` single-`response` contract; handled by a doc callout. + +## Design + +### 1. URL secret redaction + +New internal module **`src/httpware/_internal/redaction.py`** owning all URL +sanitation. 
`_strip_userinfo` moves here from `errors.py` (its only current +home) and gains a query-masking sibling, exposed as one helper: + +```python +def redact_url(url: str) -> str: + """Return url safe for logs/telemetry/errors: strip user:pass@ userinfo + and mask the values of known-sensitive query parameters.""" +``` + +- **Sensitive-key set** (fixed, case-insensitive), as a module-level + `frozenset` constant `SENSITIVE_QUERY_KEYS`: + `api_key`, `apikey`, `access_token`, `refresh_token`, `token`, `secret`, + `client_secret`, `password`, `passwd`, `pwd`, `auth`, `authorization`, + `sig`, `signature`, `key`, `private_key`, `session`, `sessionid`, `x-api-key`. +- **Masking:** parse the query with `urllib.parse.parse_qsl(..., + keep_blank_values=True)`, replace the value of any sensitive key with the + literal `REDACTED` (key preserved), re-encode with `urlencode`, and + reassemble via `urlunsplit`. Non-sensitive params (`page`, `limit`) survive + verbatim for debugging. **Common-path guard:** if the query contains no + sensitive key, return the userinfo-stripped URL unchanged without re-encoding, + so the byte output is identical to today for the overwhelming majority of + URLs — only secret-bearing queries are rewritten. Userinfo stripping runs + first, reusing the existing IPv6-rewrap logic. +- **Edge cases:** no `@` / no `?` → only the relevant step runs; empty query → + unchanged; repeated keys → each masked; values containing `=` survive via + `parse_qsl`. A URL with no scheme is returned untouched (mirrors + `_strip_userinfo`'s current guard). + +**Consumers:** + +- `errors.py`: `_summary` and `__repr__` call `redaction.redact_url(...)` + instead of `_strip_userinfo(...)`. The module docstring (currently "Query- + string secrets are NOT stripped here.") is corrected to: userinfo and + known-sensitive query values are masked; full request headers remain + reachable via `response.request`. 
+- Resilience middleware: add a thin shared helper + `_observed_url(request) -> str` (returns `redact_url(str(request.url))`) in + the resilience package, and route **every** emit site through it — + `retry.py` (6 sites), `bulkhead.py` (2), `circuit_breaker.py` (1), + `timeout.py` (1). Routing through one helper (rather than inlining + `redact_url(...)` at each site) means a new emit site can't silently + reintroduce the leak. + +### 2. Bounded error-body read on `stream()` + +New public exception and opt-in knob: + +- **`ResponseTooLargeError(ClientError)`** in `errors.py`, keyword-only + `__init__` (`status_code: int`, `limit: int`, `content_length: int | None`), + following the non-`StatusError` `ClientError` convention (defines `__init__`, + carries a `__reduce__`). Exported from `httpware.__init__` and added to + `__all__`. +- New client param **`max_error_body_bytes: int | None = None`** on both + `AsyncClient.__init__` and `Client.__init__`, stored on the instance. + `None` preserves today's unbounded behavior — fully backward-compatible. + +In both `stream()` implementations, replace the unconditional pre-read: + +```python +if HTTPStatus.BAD_REQUEST <= response.status_code < 600: + if self._max_error_body_bytes is not None: + content_length = _parse_content_length(response.headers.get("content-length")) + if content_length is not None and content_length > self._max_error_body_bytes: + raise ResponseTooLargeError( + status_code=response.status_code, + limit=self._max_error_body_bytes, + content_length=content_length, + ) + await response.aread() # within cap, or no declared length, or no cap set + _raise_on_status_error(response) +``` + +**Honest scope of the bound:** this is a **Content-Length-gated refusal**, not +a universal byte cap. It hard-bounds the common case (a server that declares an +over-cap `Content-Length`) by refusing *before* reading. 
A hostile server that +sends a large **chunked** body with no `Content-Length` is **not** bounded +here, because truly stopping mid-read and still populating +`exc.response.content` would require writing httpx2's private `_content`, which +the repo's `no httpx2._` invariant forbids. This residual is documented in +`architecture/client.md` and `architecture/errors.md`. `_parse_content_length` +is a small local helper that returns `int` for a clean non-negative header or +`None` for missing/garbage (never raises). + +### 3. `trust_env` documentation + +Add a short subsection to `architecture/client.md`: httpware wraps an +`httpx2.Client`/`httpx2.AsyncClient`, which default to `trust_env=True`, so +`HTTP_PROXY`/`HTTPS_PROXY`/`NO_PROXY`/`.netrc` are honored by default. To opt +out, construct the client with an explicit httpx2 client, +`Client(httpx2_client=httpx2.Client(trust_env=False))`. No behavior change. + +## Out of scope + +- Non-streaming hard body cap (`planning/deferred.md`, revisit trigger: when + the Seam-A terminal is next reworked or a concrete large-response abuse is + reported). +- Header redaction inside `StatusError` (contract-breaking; doc callout only). + +## Testing + +- **`redact_url` unit tests** (`tests/test_redaction.py`, new): userinfo strip + (incl. IPv6 literal, host-only, port), each sensitive key masked + case-insensitively, non-sensitive params preserved, repeated keys, blank + values, no-query / no-scheme passthrough. +- **Leakage integration:** one resilience event (via `caplog`) and one + `StatusError` message assert that `?api_key=secret` becomes + `?api_key=REDACTED`; sync and async parity for the middleware assertion. +- **Bounded read:** with `max_error_body_bytes` set, a 4xx/5xx whose + `Content-Length` exceeds the cap raises `ResponseTooLargeError` and does + **not** read the body; within-cap (or no cap) still populates + `exc.response.content`; chunked/no-length still reads (documented residual). 
+ Sync `Client.stream()` and async `AsyncClient.stream()` both covered. Pickle + round-trip for `ResponseTooLargeError` (matches the other `ClientError` + `__reduce__` tests). Public-API test updated for the new export. +- `just test` at 100% coverage; `just lint` clean (ruff + ty). + +## Risk + +- **Redaction key-set gaps** (medium likelihood × low impact): a sensitive + param with an unlisted name still leaks. Accepted trade-off of the + known-key policy chosen in brainstorming; the set covers the common token + names and is one constant to extend later. +- **Over-masking breaks a debugging workflow** (low × low): a non-secret param + that happens to be named `key`/`token` gets masked. Acceptable — these names + are sensitive often enough to default to masking. +- **`redact_url` parsing divergence** (low × medium): `parse_qsl` + + `urlencode` preserve parameter order but can normalize encoding (e.g. `+` + vs `%20`), which could alter exotic queries. Mitigation: only touch the query when a + sensitive key is present; otherwise return the userinfo-stripped URL + unchanged, so the common path is byte-identical to today. +- **`ResponseTooLargeError` surprises callers** (low × low): only fires when a + caller opts in via `max_error_body_bytes`; default `None` changes nothing. diff --git a/planning/changes/active/2026-06-14.03-security-hardening/plan.md b/planning/changes/active/2026-06-14.03-security-hardening/plan.md new file mode 100644 index 0000000..e8704b4 --- /dev/null +++ b/planning/changes/active/2026-06-14.03-security-hardening/plan.md @@ -0,0 +1,824 @@ +--- +status: draft +date: 2026-06-14 +slug: security-hardening +spec: security-hardening +pr: null +--- + +# security-hardening — implementation plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use +> superpowers:subagent-driven-development (recommended) or +> superpowers:executing-plans to implement this plan task-by-task. Steps +> use checkbox (`- [ ]`) syntax for tracking. 
+ +**Goal:** Redact URL secrets across logs/telemetry/errors and add an opt-in +Content-Length-gated bound on the `stream()` error-body pre-read, closing the +2026-06-14 deep-audit security cluster. + +**Spec:** [`design.md`](./design.md) + +**Branch:** `feat/security-hardening` (already created) + +**Commit strategy:** Per-task commits, TDD (failing test → implement → green → +commit). + +**Conventions reminder:** Python 3.11+, no `from __future__ import +annotations`, no `print()`, no `httpx2._` private API, type suppressions are +`# ty: ignore[...]`. Run tests with coverage disabled during TDD via +`uv run pytest -o addopts="" -q`; the full gate is `just test` (100% +line coverage) + `just lint`. + +--- + +### Task 1: `redact_url` sanitizer module + +**Files:** +- Create: `src/httpware/_internal/redaction.py` +- Create: `tests/test_redaction.py` + +Owns all URL sanitation: strip `user:pass@` userinfo and mask the values of +known-sensitive query parameters. + +- [ ] **Step 1: Write the failing tests** in `tests/test_redaction.py`: + +```python +"""Unit tests for the URL redaction helper.""" + +import pytest + +from httpware._internal.redaction import redact_url + + +@pytest.mark.parametrize( + ("url", "expected"), + [ + # no-op cases (common-path guard: bytes unchanged) + ("https://example.test/path", "https://example.test/path"), + ("https://example.test/path?page=2&limit=10", "https://example.test/path?page=2&limit=10"), + ("not-a-url", "not-a-url"), + ("https://example.test", "https://example.test"), + # userinfo stripped + ("https://user:pass@example.test/p", "https://example.test/p"), + ("https://user:pass@example.test:8443/p", "https://example.test:8443/p"), + ("https://user:pass@[2001:db8::1]:8443/p", "https://[2001:db8::1]:8443/p"), + # sensitive query value masked, key + other params preserved + ("https://example.test/p?api_key=abc123", "https://example.test/p?api_key=REDACTED"), + ("https://example.test/p?page=2&access_token=xyz", 
"https://example.test/p?page=2&access_token=REDACTED"), + # case-insensitive key match + ("https://example.test/p?API_KEY=abc", "https://example.test/p?API_KEY=REDACTED"), + # userinfo AND query both handled + ("https://u:p@example.test/p?token=t", "https://example.test/p?token=REDACTED"), + ], +) +def test_redact_url(url: str, expected: str) -> None: + assert redact_url(url) == expected + + +def test_redact_url_masks_repeated_sensitive_keys() -> None: + result = redact_url("https://example.test/p?token=a&token=b&page=1") + assert "token=a" not in result + assert "token=b" not in result + assert result.count("token=REDACTED") == 2 # noqa: PLR2004 — two token params above + assert "page=1" in result + + +def test_redact_url_masks_blank_sensitive_value() -> None: + assert redact_url("https://example.test/p?secret=") == "https://example.test/p?secret=REDACTED" +``` + +- [ ] **Step 2: Run to verify failure** + + Run: `uv run pytest tests/test_redaction.py -o addopts="" -q` + Expected: FAIL — `ModuleNotFoundError: No module named 'httpware._internal.redaction'`. + +- [ ] **Step 3: Implement** `src/httpware/_internal/redaction.py`: + +```python +"""URL sanitation for logs, telemetry, and error messages. + +Strips ``user:pass@`` userinfo and masks the values of known-sensitive query +parameters so secrets embedded in URLs do not leak into observability output. +Shared by ``errors.py`` (StatusError messages) and the resilience middleware +(event attributes). 
+""" + +from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit + + +SENSITIVE_QUERY_KEYS = frozenset( + { + "api_key", + "apikey", + "access_token", + "refresh_token", + "token", + "secret", + "client_secret", + "password", + "passwd", + "pwd", + "auth", + "authorization", + "sig", + "signature", + "key", + "private_key", + "session", + "sessionid", + "x-api-key", + } +) + +_REDACTED = "REDACTED" + + +def _strip_userinfo(url: str) -> str: + if "@" not in url or "://" not in url: + return url + parts = urlsplit(url) + if parts.username is None and parts.password is None: + return url + hostname = parts.hostname or "" + if ":" in hostname: # IPv6 literal — re-wrap in brackets + hostname = f"[{hostname}]" + netloc = hostname + if parts.port is not None: + netloc = f"{netloc}:{parts.port}" + return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment)) + + +def _mask_query(url: str) -> str: + parts = urlsplit(url) + if not parts.query: + return url + pairs = parse_qsl(parts.query, keep_blank_values=True) + if not any(key.lower() in SENSITIVE_QUERY_KEYS for key, _ in pairs): + return url # common-path guard: nothing sensitive, leave bytes untouched + masked = [(key, _REDACTED if key.lower() in SENSITIVE_QUERY_KEYS else value) for key, value in pairs] + return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(masked), parts.fragment)) + + +def redact_url(url: str) -> str: + """Return ``url`` safe for logs/telemetry/errors. + + Userinfo is stripped and the values of known-sensitive query parameters are + replaced with ``REDACTED`` (keys preserved). URLs with no sensitive query + key are returned byte-identical to the userinfo-stripped input. + """ + return _mask_query(_strip_userinfo(url)) +``` + +- [ ] **Step 4: Run to verify pass** + + Run: `uv run pytest tests/test_redaction.py -o addopts="" -q` + Expected: PASS (all parametrized cases + the two extra tests). 
+ +- [ ] **Step 5: Commit** + + ```bash + git add src/httpware/_internal/redaction.py tests/test_redaction.py + git commit -m "feat(redaction): URL sanitizer (strip userinfo + mask sensitive query keys) + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 2: Route `StatusError` messages through `redact_url` + +**Files:** +- Modify: `src/httpware/errors.py` +- Modify: `tests/test_errors.py` + +Replace the userinfo-only `_strip_userinfo` in `errors.py` with `redact_url`, +and delete the now-duplicated local helper. + +- [ ] **Step 1: Write the failing test** — append to `tests/test_errors.py`: + +```python +def test_status_error_message_masks_query_secret() -> None: + request = httpx2.Request("GET", "https://example.test/p?api_key=topsecret&page=2") + response = httpx2.Response(404, request=request) + exc = NotFoundError(response) + assert "topsecret" not in str(exc) + assert "api_key=REDACTED" in str(exc) + assert "page=2" in str(exc) + assert "topsecret" not in repr(exc) +``` + + (Confirm `httpx2` and `NotFoundError` are already imported at the top of + `tests/test_errors.py`; add `from httpware.errors import NotFoundError` / + `import httpx2` only if missing.) + +- [ ] **Step 2: Run to verify failure** + + Run: `uv run pytest tests/test_errors.py -k masks_query_secret -o addopts="" -q` + Expected: FAIL — `assert 'api_key=REDACTED' in '404 GET https://example.test/p?api_key=topsecret&page=2'`. + +- [ ] **Step 3: Implement** in `src/httpware/errors.py`: + + 3a. Replace the module docstring lines about stripping (currently): + +```python +"""Status-keyed exception hierarchy. + +Auto-raise rule lives at AsyncClient's internal terminal (see client.py). +Unknown 4xx falls back to ClientStatusError; unknown 5xx to ServerStatusError. +The fallback assumes 400 <= status < 600. + +__repr__ and the summary message strip user:pass@ userinfo from +response.request.url to avoid leaking credentials in tracebacks. 
+Query-string secrets are NOT stripped here. +""" +``` + + with: + +```python +"""Status-keyed exception hierarchy. + +Auto-raise rule lives at AsyncClient's internal terminal (see client.py). +Unknown 4xx falls back to ClientStatusError; unknown 5xx to ServerStatusError. +The fallback assumes 400 <= status < 600. + +__repr__ and the summary message run response.request.url through +_internal.redaction.redact_url, which strips user:pass@ userinfo and masks the +values of known-sensitive query parameters. NOTE: the full request headers +(Authorization, Cookie, ...) remain reachable via exc.response.request — handler +authors must redact those before logging. +""" +``` + + 3b. Replace the `urllib.parse` import line: + +```python +from urllib.parse import urlsplit, urlunsplit +``` + + with: + +```python +from httpware._internal.redaction import redact_url +``` + + (Place it with the other `httpware`/third-party imports per isort ordering; + `ruff --fix` will reorder if needed.) + + 3c. Delete the entire `_strip_userinfo` function (the `def _strip_userinfo(url: str) -> str:` block). + + 3d. In `_summary` and `__repr__`, change both occurrences of: + +```python + url = _strip_userinfo(str(self.response.request.url)) +``` + + to: + +```python + url = redact_url(str(self.response.request.url)) +``` + +- [ ] **Step 4: Run to verify pass** + + Run: `uv run pytest tests/test_errors.py -o addopts="" -q` + Expected: PASS (new test plus all existing `StatusError` tests — existing + userinfo-stripping assertions still hold because `redact_url` strips userinfo). 
+ +- [ ] **Step 5: Commit** + + ```bash + git add src/httpware/errors.py tests/test_errors.py + git commit -m "fix(errors): mask query secrets in StatusError messages via redact_url + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 3: Route all middleware event URLs through a shared helper + +**Files:** +- Modify: `src/httpware/_internal/observability.py` +- Modify: `src/httpware/middleware/resilience/retry.py` (6 sites) +- Modify: `src/httpware/middleware/resilience/bulkhead.py` (2 sites) +- Modify: `src/httpware/middleware/resilience/circuit_breaker.py` (1 site) +- Modify: `src/httpware/middleware/resilience/timeout.py` (1 site) +- Modify: `tests/test_retry.py` (async leakage test) +- Modify: `tests/test_retry_sync.py` (sync parity) + +Add one `_observed_url(request)` helper and route every emit site through it, +so a future emit site can't silently reintroduce the leak. + +- [ ] **Step 1: Write the failing test** — append to `tests/test_retry.py` + (mirror the existing `caplog` event tests; the event attribute is reachable + as `record.url`): + +```python +async def test_retry_event_url_attribute_masks_query_secret(caplog: pytest.LogCaptureFixture) -> None: + """Resilience event `url` attributes must not leak query-string secrets.""" + sleeper = _SleepRecorder() + handler = _ResponseSequence([HTTPStatus.SERVICE_UNAVAILABLE] * 3) + client = _client(handler, retry=AsyncRetry(_sleep=sleeper, max_attempts=3, base_delay=0.001, max_delay=0.002)) + + with caplog.at_level(logging.WARNING, logger="httpware.retry"), pytest.raises(ServiceUnavailableError): + await client.get("https://example.test/x?api_key=topsecret") + + giving_up = [r for r in caplog.records if r.name == "httpware.retry" and r.message.startswith("retry gave up")] + assert len(giving_up) == 1 + assert "topsecret" not in giving_up[0].url # ty: ignore[unresolved-attribute] + assert "api_key=REDACTED" in giving_up[0].url # ty: ignore[unresolved-attribute] +``` + + (Use the same 
`_client`, `_SleepRecorder`, `_ResponseSequence` helpers the + surrounding tests use; do not invent new fixtures.) + +- [ ] **Step 2: Run to verify failure** + + Run: `uv run pytest tests/test_retry.py -k event_url_attribute_masks -o addopts="" -q` + Expected: FAIL — `assert 'topsecret' not in 'https://example.test/x?api_key=topsecret'`. + +- [ ] **Step 3a: Add the helper** to `src/httpware/_internal/observability.py`. + Add the import near the top (with the existing imports): + +```python +import httpx2 + +from httpware._internal.redaction import redact_url +``` + + and add this function (below `_emit_event`): + +```python +def _observed_url(request: httpx2.Request) -> str: + """Return the request URL safe for emission (userinfo + sensitive query masked).""" + return redact_url(str(request.url)) +``` + +- [ ] **Step 3b: Route every emit site.** In each of `retry.py`, `bulkhead.py`, + `circuit_breaker.py`, `timeout.py`: + + - Extend the existing import `from httpware._internal.observability import _emit_event` to: + +```python +from httpware._internal.observability import _emit_event, _observed_url +``` + + - Replace every attributes-dict line `"url": str(request.url),` with + `"url": _observed_url(request),`. Sites: `retry.py` lines ~136, 155, 171, + 274, 293, 309; `bulkhead.py` ~117, 173; `circuit_breaker.py` ~193; + `timeout.py` ~72. Grep to confirm none remain: + `grep -rn 'str(request.url)' src/httpware/middleware/` must return zero + lines after this step. 
+ +- [ ] **Step 4a: Add the sync parity test** — append to `tests/test_retry_sync.py` + (sync analogue, using that file's sync helpers and a `with pytest.raises(...)` + around a sync call; no `await`): + +```python +def test_retry_event_url_attribute_masks_query_secret_sync(caplog: pytest.LogCaptureFixture) -> None: + """Sync resilience event `url` attributes must not leak query-string secrets.""" + sleeper = _SleepRecorder() + handler = _ResponseSequence([HTTPStatus.SERVICE_UNAVAILABLE] * 3) + client = _client(handler, retry=Retry(_sleep=sleeper, max_attempts=3, base_delay=0.001, max_delay=0.002)) + + with caplog.at_level(logging.WARNING, logger="httpware.retry"), pytest.raises(ServiceUnavailableError): + client.get("https://example.test/x?api_key=topsecret") + + giving_up = [r for r in caplog.records if r.name == "httpware.retry" and r.message.startswith("retry gave up")] + assert len(giving_up) == 1 + assert "topsecret" not in giving_up[0].url # ty: ignore[unresolved-attribute] + assert "api_key=REDACTED" in giving_up[0].url # ty: ignore[unresolved-attribute] +``` + + (Match the exact constructor/helper names used elsewhere in + `tests/test_retry_sync.py` — e.g. `Retry`, the sync `_client`, + `_SleepRecorder`, `_ResponseSequence`. Adjust names if that file's helpers + differ.) + +- [ ] **Step 4b: Run to verify pass** + + Run: `uv run pytest tests/test_retry.py tests/test_retry_sync.py -k masks_query_secret -o addopts="" -q` + Expected: PASS (async + sync). 
+ +- [ ] **Step 5: Commit** + + ```bash + git add src/httpware/_internal/observability.py src/httpware/middleware/resilience/ tests/test_retry.py tests/test_retry_sync.py + git commit -m "fix(observability): mask query secrets in resilience event URLs + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 4: `ResponseTooLargeError` exception + +**Files:** +- Modify: `src/httpware/errors.py` +- Modify: `src/httpware/__init__.py` +- Modify: `tests/test_errors.py` +- Modify: `tests/test_public_api.py` + +New `ClientError` subclass (non-`StatusError`, so it defines `__init__` + +`__reduce__`, per the convention in CLAUDE.md). + +- [ ] **Step 1: Write the failing tests** — append to `tests/test_errors.py`: + +```python +def test_response_too_large_error_fields_and_message() -> None: + exc = ResponseTooLargeError(status_code=500, limit=1024, content_length=2048) + assert exc.status_code == 500 # noqa: PLR2004 — literal mirrors construction above + assert exc.limit == 1024 # noqa: PLR2004 — literal mirrors construction above + assert exc.content_length == 2048 # noqa: PLR2004 — literal mirrors construction above + assert "1024" in str(exc) + assert "2048" in str(exc) + + +def test_response_too_large_error_pickle_round_trip() -> None: + exc = ResponseTooLargeError(status_code=503, limit=10, content_length=None) + restored = pickle.loads(pickle.dumps(exc)) # noqa: S301 — round-tripping our own exception + assert isinstance(restored, ResponseTooLargeError) + assert restored.status_code == 503 # noqa: PLR2004 — literal mirrors construction above + assert restored.limit == 10 # noqa: PLR2004 — literal mirrors construction above + assert restored.content_length is None +``` + + Add `ResponseTooLargeError` to the `from httpware.errors import (...)` / + `from httpware import ...` block at the top of `tests/test_errors.py`, and + ensure `import pickle` is present at module top (the existing `__reduce__` + round-trip tests almost certainly import it already; 
add it if not — do + **not** import inside the function, ruff `PLC0415` forbids it). + +- [ ] **Step 2: Run to verify failure** + + Run: `uv run pytest tests/test_errors.py -k response_too_large -o addopts="" -q` + Expected: FAIL — `ImportError: cannot import name 'ResponseTooLargeError'`. + +- [ ] **Step 3a: Implement** in `src/httpware/errors.py` — add after the + `MissingDecoderError` class (end of file): + +```python +def _reconstruct_response_too_large( + cls: "type[ResponseTooLargeError]", + status_code: int, + limit: int, + content_length: int | None, +) -> "ResponseTooLargeError": + return cls(status_code=status_code, limit=limit, content_length=content_length) + + +class ResponseTooLargeError(ClientError): + """Raised when an error response body exceeds the client's max_error_body_bytes cap. + + Fires from stream() on a 4xx/5xx whose declared Content-Length exceeds the + configured cap, BEFORE the body is read — so the oversized body is never + buffered. Only raised when max_error_body_bytes is set (opt-in). + """ + + status_code: int + limit: int + content_length: int | None + + def __init__(self, *, status_code: int, limit: int, content_length: int | None) -> None: + self.status_code = status_code + self.limit = limit + self.content_length = content_length + super().__init__( + f"error response body too large: status={status_code} " + f"content_length={content_length} exceeds max_error_body_bytes={limit}" + ) + + def __reduce__(self) -> tuple[Any, ...]: + return ( + _reconstruct_response_too_large, + (type(self), self.status_code, self.limit, self.content_length), + ) +``` + +- [ ] **Step 3b: Export** in `src/httpware/__init__.py`: add + `ResponseTooLargeError` to the `from httpware.errors import (...)` block + (alphabetically, after `RateLimitedError`) and add `"ResponseTooLargeError",` + to `__all__` (alphabetically, after `"ResponseDecoder",`). 
+ +- [ ] **Step 3c: Update the public-API test.** In `tests/test_public_api.py`, + add `"ResponseTooLargeError"` to whatever expected-symbol collection the test + asserts against (read the file; match its existing structure exactly). + +- [ ] **Step 4: Run to verify pass** + + Run: `uv run pytest tests/test_errors.py tests/test_public_api.py -o addopts="" -q` + Expected: PASS. + +- [ ] **Step 5: Commit** + + ```bash + git add src/httpware/errors.py src/httpware/__init__.py tests/test_errors.py tests/test_public_api.py + git commit -m "feat(errors): add ResponseTooLargeError (opt-in error-body cap) + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 5: Opt-in `max_error_body_bytes` + bounded `stream()` pre-read + +**Files:** +- Modify: `src/httpware/client.py` (both constructors, both `stream()`, new helper) +- Modify: `tests/test_client_stream.py` (async) +- Modify: `tests/test_client_stream_sync.py` (sync) + +- [ ] **Step 1: Write the failing tests.** + + 1a. Async — append to `tests/test_client_stream.py` (use that file's existing + `httpx2.MockTransport` construction style; the helper names below mirror the + conftest/fixture pattern — adjust to match the file): + +```python +async def test_stream_raises_response_too_large_when_over_cap() -> None: + body = b"x" * 200 + + def handler(request: httpx2.Request) -> httpx2.Response: + return httpx2.Response(500, content=body) + + client = AsyncClient(httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_error_body_bytes=10) + with pytest.raises(ResponseTooLargeError) as caught: + async with client.stream("GET", "https://example.test/x"): + pass + assert caught.value.limit == 10 # noqa: PLR2004 — mirrors max_error_body_bytes above + assert caught.value.content_length == 200 # noqa: PLR2004 — len(body) above + await client.aclose() + + +async def test_stream_reads_error_body_when_under_cap() -> None: + body = b"nope" + + def handler(request: httpx2.Request) -> 
httpx2.Response: + return httpx2.Response(404, content=body) + + client = AsyncClient(httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_error_body_bytes=1000) + with pytest.raises(NotFoundError) as caught: + async with client.stream("GET", "https://example.test/x"): + pass + assert caught.value.response.content == body + await client.aclose() + + +async def test_stream_unbounded_by_default_reads_large_error_body() -> None: + body = b"x" * 200 + + def handler(request: httpx2.Request) -> httpx2.Response: + return httpx2.Response(500, content=body) + + client = AsyncClient(httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler))) + with pytest.raises(InternalServerError) as caught: + async with client.stream("GET", "https://example.test/x"): + pass + assert caught.value.response.content == body + await client.aclose() +``` + + Ensure `ResponseTooLargeError`, `NotFoundError`, `InternalServerError`, and + `AsyncClient` are imported at the top of the test file (add any missing). + + 1b. Sync — append the sync analogues to `tests/test_client_stream_sync.py` + (use `Client`, `httpx2.Client`, `with client.stream(...)`, `client.close()`, + no `await`): `test_stream_raises_response_too_large_when_over_cap_sync`, + `test_stream_reads_error_body_when_under_cap_sync`, + `test_stream_unbounded_by_default_reads_large_error_body_sync`. + + 1c. `_parse_content_length` unit tests — append to + `tests/test_client_stream.py`: + +```python +@pytest.mark.parametrize( + ("raw", "expected"), + [(None, None), ("123", 123), ("abc", None), ("-5", None), ("0", 0)], +) +def test_parse_content_length(raw: str | None, expected: int | None) -> None: + assert _parse_content_length(raw) == expected +``` + + Add `from httpware.client import _parse_content_length` to the **top** imports + of `tests/test_client_stream.py` (module level — ruff `PLC0415` forbids a + function-body import). 
+ +- [ ] **Step 2: Run to verify failure** + + Run: `uv run pytest tests/test_client_stream.py tests/test_client_stream_sync.py -k "too_large or under_cap or unbounded or parse_content_length" -o addopts="" -q` + Expected: FAIL — `TypeError: __init__() got an unexpected keyword argument 'max_error_body_bytes'` (and `ImportError` for `_parse_content_length`). + +- [ ] **Step 3a: Add the helper** to `src/httpware/client.py` (module level, + near `_build_default_decoders`): + +```python +def _parse_content_length(raw: str | None) -> int | None: + """Return a non-negative int Content-Length, or None for missing/garbage. Never raises.""" + if raw is None: + return None + try: + value = int(raw) + except ValueError: + return None + return value if value >= 0 else None +``` + +- [ ] **Step 3b: Import the exception.** Change client.py's errors import: + +```python +from httpware.errors import DecodeError, MissingDecoderError, TransportError +``` + + to: + +```python +from httpware.errors import DecodeError, MissingDecoderError, ResponseTooLargeError, TransportError +``` + +- [ ] **Step 3c: AsyncClient constructor.** Add the class annotation under the + other `_`-prefixed annotations in `class AsyncClient` (after `_dispatch: AsyncNext`): + +```python + _max_error_body_bytes: int | None +``` + + Add the parameter to `AsyncClient.__init__` (keyword-only, after + `middleware: Sequence[AsyncMiddleware] = (),`): + +```python + max_error_body_bytes: int | None = None, +``` + + And store it (after `self._dispatch = compose_async(...)`): + +```python + self._max_error_body_bytes = max_error_body_bytes +``` + +- [ ] **Step 3d: Client constructor.** Mirror 3c in `class Client`: add + `_max_error_body_bytes: int | None` annotation (after `_dispatch: Next`), add + the `max_error_body_bytes: int | None = None,` param after + `middleware: Sequence[Middleware] = (),`, and add + `self._max_error_body_bytes = max_error_body_bytes` after + `self._dispatch = compose(...)`. 
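The helper from Step 3a feeds a single yes/no decision in the `stream()` guard. A standalone sketch of that decision follows. `parse_content_length` mirrors the plan's helper, while `exceeds_cap` is a hypothetical name introduced here only to isolate the condition; it does not exist in httpware.

```python
from __future__ import annotations


def parse_content_length(raw: str | None) -> int | None:
    """Return a non-negative int Content-Length, or None for missing/garbage."""
    if raw is None:
        return None
    try:
        value = int(raw)
    except ValueError:
        return None
    return value if value >= 0 else None


def exceeds_cap(raw_header: str | None, cap: int | None) -> bool:
    """Return True only when a declared length is strictly greater than the cap.

    Hypothetical helper: it isolates the condition under which the guard
    would raise instead of pre-reading the error body.
    """
    if cap is None:  # opt-in: no cap configured means no check at all
        return False
    declared = parse_content_length(raw_header)
    return declared is not None and declared > cap
```

Two consequences of this shape are worth keeping in mind: a body whose declared length equals the cap is still read, and a missing or malformed `Content-Length` falls through to the unbounded read.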
+ +- [ ] **Step 3e: Bound the async stream pre-read.** Replace (around line 786): + +```python + async with _httpx2_exception_mapper(), self._httpx2_client.stream(method, url, **kwargs) as response: + if HTTPStatus.BAD_REQUEST <= response.status_code < 600: # noqa: PLR2004 — 600 is the synthetic upper bound for 5xx + await response.aread() # pre-read body so exc.response.content works + _raise_on_status_error(response) + yield response +``` + + with: + +```python + async with _httpx2_exception_mapper(), self._httpx2_client.stream(method, url, **kwargs) as response: + if HTTPStatus.BAD_REQUEST <= response.status_code < 600: # noqa: PLR2004 — 600 is the synthetic upper bound for 5xx + if self._max_error_body_bytes is not None: + content_length = _parse_content_length(response.headers.get("content-length")) + if content_length is not None and content_length > self._max_error_body_bytes: + raise ResponseTooLargeError( + status_code=response.status_code, + limit=self._max_error_body_bytes, + content_length=content_length, + ) + await response.aread() # pre-read body so exc.response.content works + _raise_on_status_error(response) + yield response +``` + +- [ ] **Step 3f: Bound the sync stream pre-read.** Replace (around line 1548) + the sync equivalent the same way, using `response.read()` (no `await`): + +```python + with _httpx2_exception_mapper_sync(), self._httpx2_client.stream(method, url, **kwargs) as response: + if HTTPStatus.BAD_REQUEST <= response.status_code < 600: # noqa: PLR2004 — 600 is the synthetic upper bound for 5xx + if self._max_error_body_bytes is not None: + content_length = _parse_content_length(response.headers.get("content-length")) + if content_length is not None and content_length > self._max_error_body_bytes: + raise ResponseTooLargeError( + status_code=response.status_code, + limit=self._max_error_body_bytes, + content_length=content_length, + ) + response.read() # pre-read body so exc.response.content works + _raise_on_status_error(response) 
+ yield response +``` + +- [ ] **Step 4: Run to verify pass** + + Run: `uv run pytest tests/test_client_stream.py tests/test_client_stream_sync.py -o addopts="" -q` + Expected: PASS. + +- [ ] **Step 5: Commit** + + ```bash + git add src/httpware/client.py tests/test_client_stream.py tests/test_client_stream_sync.py + git commit -m "feat(client): opt-in max_error_body_bytes bounds the stream() error pre-read + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 6: Documentation — architecture + deferred item + +**Files:** +- Modify: `architecture/client.md` +- Modify: `architecture/errors.md` +- Modify: `planning/deferred.md` + +Docs only; no code. (Read each file first and match its prose style.) + +- [ ] **Step 1: `architecture/client.md`** — add two short subsections: + + - **Proxy environment (`trust_env`):** "httpware wraps an + `httpx2.Client`/`httpx2.AsyncClient`, which default to `trust_env=True`: + `HTTP_PROXY`/`HTTPS_PROXY`/`NO_PROXY` and `.netrc` are honored by default. + To opt out, pass an explicit client: + `Client(httpx2_client=httpx2.Client(trust_env=False))`." + - **Bounded error bodies:** document `max_error_body_bytes` (default `None`, + opt-in), that it raises `ResponseTooLargeError` from `stream()` when a + 4xx/5xx **declares** a `Content-Length` over the cap before reading, and the + residual: a chunked error body with no declared length is still read, + because a hard mid-read cap would require httpx2's private `_content`. + +- [ ] **Step 2: `architecture/errors.md`** — add a callout: `StatusError` holds + the raw `httpx2.Response`, so secrets in **request headers** + (`Authorization`, `Cookie`, `Proxy-Authorization`) remain reachable via + `exc.response.request.headers`; httpware masks URL userinfo and known-sensitive + query values in messages/`repr`, but does not strip headers — handler authors + must redact before logging or serializing a caught error. 
Mention + `ResponseTooLargeError` in the error list/tree if the file enumerates the + hierarchy. + +- [ ] **Step 3: `planning/deferred.md`** — add an entry: "Non-streaming hard + response-body cap — for non-streaming `send()`, httpx2 buffers the whole body + before the decode seam, so a true cap needs a streaming-with-capped-accumulator + rework of the Seam-A terminal. Revisit trigger: the Seam-A terminal is next + reworked, or a concrete large-response abuse is reported. Source: 2026-06-14 + deep audit (Medium)." Match the file's existing entry format. + +- [ ] **Step 4: Commit** + + ```bash + git add architecture/client.md architecture/errors.md planning/deferred.md + git commit -m "docs: trust_env + bounded-error-body + header-reachability callouts + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 7: Full verification + +**Files:** none (gate only). + +- [ ] **Step 1: Lint** + + Run: `just lint` + Expected: eof-fixer + ruff format + ruff check + ty all clean. (If ruff + reordered imports in `errors.py`/`observability.py`/`client.py`, that's + expected; re-stage and amend the relevant task commit or add a fixup commit.) + +- [ ] **Step 2: Full suite + coverage** + + Run: `just test` + Expected: all tests pass, **coverage 100%**. If `redaction.py` or the new + error class shows a missing line, add the covering assertion (do NOT add + `# pragma: no cover`). + +- [ ] **Step 3: Grep guards** + + Run: + ```bash + grep -rn 'str(request.url)' src/httpware/middleware/ ; echo "[expect none]" + grep -rn 'httpx2\._' src/httpware/ ; echo "[expect none]" + ``` + Expected: both empty. + +- [ ] **Step 4: Final report** — summarize bucket of findings closed (3 leakage + Lows folded + the streaming Medium bounded + trust_env Nit documented), and + that the non-streaming hard cap was deferred. 
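Task 6, Step 2 leaves request-header redaction to handler authors. One way a handler might do that before logging a caught error is sketched below. `redact_headers` and `SENSITIVE_HEADERS` are hypothetical names, not part of httpware, and the header set contains only the three headers named in that callout.

```python
from collections.abc import Mapping

# Hypothetical set: only the headers the errors.md callout names.
SENSITIVE_HEADERS = frozenset({"authorization", "cookie", "proxy-authorization"})


def redact_headers(headers: Mapping[str, str]) -> dict[str, str]:
    """Return a plain-dict copy with sensitive header values masked.

    Header names are compared case-insensitively; the original mapping
    is left untouched.
    """
    return {
        name: "REDACTED" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

Assuming the request headers object behaves like a string-to-string mapping, a handler could pass `exc.response.request.headers` through this function and log the result instead of the raw headers.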
+ +--- + +## Notes for the executor + +- **TDD discipline:** every code task starts with a failing test and the exact + failure message is given — confirm you see it before implementing. +- **Sync/async parity:** Tasks 3 and 5 touch both surfaces; never land one + without its sibling test. +- **No `# pragma: no cover`** — `just test` enforces 100% line coverage; the + plan's tests are designed to reach every new line (`_parse_content_length` + branches via its direct unit test; the redaction common-path guard via the + no-op parametrized cases). +- **Helper-name reality check:** the test snippets assume helper names + (`_client`, `_SleepRecorder`, `_ResponseSequence`) from the + existing test files. Open each target test file first and match its actual + fixtures; adjust the snippet names if they differ, keeping the assertions. diff --git a/planning/deferred.md b/planning/deferred.md index 54e09b1..7ec6272 100644 --- a/planning/deferred.md +++ b/planning/deferred.md @@ -37,3 +37,5 @@ As of 0.7.0, all planned epics (3, 4, 5, 6) are closed — see the [change Index ### Documentation - **Custom-`ResponseDecoder` guide** (audit finding G6, [2026-06-13 docs audit](audits/2026-06-13-docs-audit.md)) — the decoder seam (Seam B) is a documented extension point, but unlike middleware it has no "write your own" guide. A short page would show the `can_decode(model: type) -> bool` / `decode(content: bytes, model: type[T]) -> T` protocol, how `decoders=[...]` ordering resolves a model, and a worked third-party-adapter example. Decided alongside the `2026-06-14.01` docs-UX restructure: **defer the guide, and ship no auto API reference / mkdocstrings** (prose carries the signatures). Demand-gated. Revisit trigger: someone asks how to write a custom decoder, a third-party decoder adapter ships, or the `decoders/` protocol surface changes. 
(`docs/`, `src/httpware/decoders/`) + +- **Non-streaming hard response-body cap** (2026-06-14 deep audit, Medium) — for a non-streaming `send()`, httpx2 buffers the whole body before httpware reaches the decode seam, so a true cap needs a streaming-with-capped-accumulator rework of the Seam-A terminal. The current `max_error_body_bytes` guard only applies at `stream()` entry and only when `Content-Length` is declared. Revisit trigger: the Seam-A terminal is next reworked, or a concrete large-response abuse is reported. (`src/httpware/client.py`) diff --git a/pyproject.toml b/pyproject.toml index be65b8f..d0301ca 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -97,4 +97,4 @@ asyncio_default_fixture_loop_scope = "function" [tool.coverage] run.concurrency = ["thread"] -report.exclude_also = ["if typing.TYPE_CHECKING:"] +report.exclude_also = ["if typing.TYPE_CHECKING:", 'pytest\.fail\('] diff --git a/src/httpware/__init__.py b/src/httpware/__init__.py index 746e057..69cfa61 100644 --- a/src/httpware/__init__.py +++ b/src/httpware/__init__.py @@ -17,6 +17,7 @@ NetworkError, NotFoundError, RateLimitedError, + ResponseTooLargeError, RetryBudgetExhaustedError, ServerStatusError, ServiceUnavailableError, @@ -78,6 +79,7 @@ "NotFoundError", "RateLimitedError", "ResponseDecoder", + "ResponseTooLargeError", "Retry", "RetryBudget", "RetryBudgetExhaustedError", diff --git a/src/httpware/_internal/observability.py b/src/httpware/_internal/observability.py index 310be57..726338b 100644 --- a/src/httpware/_internal/observability.py +++ b/src/httpware/_internal/observability.py @@ -13,6 +13,7 @@ import typing from httpware._internal import import_checker +from httpware._internal.redaction import redact_url def _emit_event( @@ -25,6 +26,12 @@ def _emit_event( ) -> None: """Emit one observability event to both channels. 
+ The ``url`` attribute, when present, is run through + ``redaction.redact_url`` here — at the single emission boundary — so a + request URL's userinfo and known-sensitive query/fragment secrets never + reach a log record or span event, regardless of how a caller built the + attributes dict. + 1. Always emits a structured log record at ``level`` with ``extra=attributes`` (so log aggregators that index ``extra`` see structured fields). 2. If ``opentelemetry-api`` is installed, calls @@ -39,7 +46,9 @@ def _emit_event( the optional-extras isolation invariant: ``import httpware`` must not pull ``opentelemetry`` into ``sys.modules`` when the extra is absent. """ - logger.log(level, message, extra={**attributes, "event": event_name}) + raw_url = attributes.get("url") + safe_attributes = {**attributes, "url": redact_url(raw_url)} if isinstance(raw_url, str) else attributes + logger.log(level, message, extra={**safe_attributes, "event": event_name}) if import_checker.is_otel_installed: try: from opentelemetry import trace # noqa: PLC0415 — lazy by design (optional-extras isolation) @@ -52,4 +61,4 @@ def _emit_event( # The structured log record above has already fired; CancelledError/KeyboardInterrupt # are not Exception subclasses and will still propagate. with contextlib.suppress(Exception): - trace.get_current_span().add_event(event_name, attributes=attributes) + trace.get_current_span().add_event(event_name, attributes=safe_attributes) diff --git a/src/httpware/_internal/redaction.py b/src/httpware/_internal/redaction.py new file mode 100644 index 0000000..8b4809c --- /dev/null +++ b/src/httpware/_internal/redaction.py @@ -0,0 +1,100 @@ +"""URL sanitation for logs, telemetry, and error messages. + +Strips ``user:pass@`` userinfo and masks the values of known-sensitive query +parameters so secrets embedded in URLs do not leak into observability output. +Shared by ``errors.py`` (StatusError messages) and the resilience middleware +(event attributes). 
+""" + +from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit + + +SENSITIVE_QUERY_KEYS = frozenset( + { + "api_key", + "apikey", + "access_token", + "refresh_token", + "token", + "secret", + "client_secret", + "password", + "passwd", + "pwd", + "auth", + "authorization", + "sig", + "signature", + "key", + "private_key", + "session", + "sessionid", + "x-api-key", + } +) + +_REDACTED = "REDACTED" + + +def _strip_userinfo(url: str) -> str: + if "@" not in url or "://" not in url: + return url + parts = urlsplit(url) + if parts.username is None and parts.password is None: + return url + hostname = parts.hostname or "" + if ":" in hostname: # IPv6 literal — re-wrap in brackets + hostname = f"[{hostname}]" + netloc = hostname + if parts.port is not None: + netloc = f"{netloc}:{parts.port}" + return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment)) + + +def _mask_component(component: str) -> tuple[str, bool]: + """Mask sensitive key=value pairs in a query or fragment string. + + Returns ``(masked_component, was_changed)``; when no sensitive key is + found the original string is returned unchanged (``was_changed=False``). 
+ """ + pairs = parse_qsl(component, keep_blank_values=True) + if not any(key.strip().lower() in SENSITIVE_QUERY_KEYS for key, _ in pairs): + return component, False + masked = [(key, _REDACTED if key.strip().lower() in SENSITIVE_QUERY_KEYS else value) for key, value in pairs] + return urlencode(masked), True + + +def _mask_query(url: str) -> str: + parts = urlsplit(url) + has_query = bool(parts.query) + has_fragment = bool(parts.fragment) + + if not has_query and not has_fragment: + return url + + new_query = parts.query + new_fragment = parts.fragment + changed = False + + if has_query: + new_query, q_changed = _mask_component(parts.query) + changed = changed or q_changed + + if has_fragment: + new_fragment, f_changed = _mask_component(parts.fragment) + changed = changed or f_changed + + if not changed: + return url # common-path guard: nothing sensitive, leave bytes untouched + + return urlunsplit((parts.scheme, parts.netloc, parts.path, new_query, new_fragment)) + + +def redact_url(url: str) -> str: + """Return ``url`` safe for logs/telemetry/errors. + + Userinfo is stripped and the values of known-sensitive query parameters are + replaced with ``REDACTED`` (keys preserved). URLs with no sensitive query + key are returned byte-identical to the userinfo-stripped input. 
+ """ + return _mask_query(_strip_userinfo(url)) diff --git a/src/httpware/client.py b/src/httpware/client.py index 30ed4f4..cf33ad0 100644 --- a/src/httpware/client.py +++ b/src/httpware/client.py @@ -16,7 +16,7 @@ _raise_on_status_error, ) from httpware.decoders import ResponseDecoder -from httpware.errors import DecodeError, MissingDecoderError, TransportError +from httpware.errors import DecodeError, MissingDecoderError, ResponseTooLargeError, TransportError from httpware.middleware import AsyncMiddleware, AsyncNext, Middleware, Next from httpware.middleware.chain import compose, compose_async @@ -31,6 +31,17 @@ ) +def _parse_content_length(raw: str | None) -> int | None: + """Return a non-negative int Content-Length, or None for missing/garbage. Never raises.""" + if raw is None: + return None + try: + value = int(raw) + except ValueError: + return None + return value if value >= 0 else None + + def _build_default_decoders() -> tuple[ResponseDecoder, ...]: """Construct the default decoder tuple based on installed extras. @@ -82,6 +93,7 @@ class AsyncClient: _decoders: tuple[ResponseDecoder, ...] _user_middleware: tuple[AsyncMiddleware, ...] 
_dispatch: AsyncNext + _max_error_body_bytes: int | None def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call API self, @@ -96,6 +108,7 @@ def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call httpx2_client: httpx2.AsyncClient | None = None, decoders: Sequence[ResponseDecoder] | None = None, middleware: Sequence[AsyncMiddleware] = (), + max_error_body_bytes: int | None = None, ) -> None: if httpx2_client is not None: forwarded = { @@ -133,6 +146,7 @@ def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call self._decoders = tuple(decoders) if decoders is not None else _build_default_decoders() self._user_middleware = tuple(middleware) self._dispatch = compose_async(self._user_middleware, self._terminal) + self._max_error_body_bytes = max_error_body_bytes def _dispatch_decoder(self, model: type) -> ResponseDecoder | None: """Walk `_decoders` and return the first decoder claiming `model`, or None.""" @@ -785,6 +799,14 @@ async def stream( # noqa: PLR0913, C901 — mirrors httpx2 per-method signature async with _httpx2_exception_mapper(), self._httpx2_client.stream(method, url, **kwargs) as response: if HTTPStatus.BAD_REQUEST <= response.status_code < 600: # noqa: PLR2004 — 600 is the synthetic upper bound for 5xx + if self._max_error_body_bytes is not None: + content_length = _parse_content_length(response.headers.get("content-length")) + if content_length is not None and content_length > self._max_error_body_bytes: + raise ResponseTooLargeError( + status_code=response.status_code, + limit=self._max_error_body_bytes, + content_length=content_length, + ) await response.aread() # pre-read body so exc.response.content works _raise_on_status_error(response) yield response @@ -822,6 +844,7 @@ class Client: _decoders: tuple[ResponseDecoder, ...] _user_middleware: tuple[Middleware, ...] 
_dispatch: Next + _max_error_body_bytes: int | None def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call API self, @@ -836,6 +859,7 @@ def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call httpx2_client: httpx2.Client | None = None, decoders: Sequence[ResponseDecoder] | None = None, middleware: Sequence[Middleware] = (), + max_error_body_bytes: int | None = None, ) -> None: if httpx2_client is not None: forwarded = { @@ -873,6 +897,7 @@ def __init__( # noqa: PLR0913 — wide constructor is the cost of a single-call self._decoders = tuple(decoders) if decoders is not None else _build_default_decoders() self._user_middleware = tuple(middleware) self._dispatch = compose(self._user_middleware, self._terminal) + self._max_error_body_bytes = max_error_body_bytes def _dispatch_decoder(self, model: type) -> ResponseDecoder | None: """Walk `_decoders` and return the first decoder claiming `model`, or None.""" @@ -1547,6 +1572,14 @@ def stream( # noqa: PLR0913, C901 — mirrors httpx2 per-method signatures; kwa with _httpx2_exception_mapper_sync(), self._httpx2_client.stream(method, url, **kwargs) as response: if HTTPStatus.BAD_REQUEST <= response.status_code < 600: # noqa: PLR2004 — 600 is the synthetic upper bound for 5xx + if self._max_error_body_bytes is not None: + content_length = _parse_content_length(response.headers.get("content-length")) + if content_length is not None and content_length > self._max_error_body_bytes: + raise ResponseTooLargeError( + status_code=response.status_code, + limit=self._max_error_body_bytes, + content_length=content_length, + ) response.read() # pre-read body so exc.response.content works _raise_on_status_error(response) yield response diff --git a/src/httpware/errors.py b/src/httpware/errors.py index 8502f2b..05d44e6 100644 --- a/src/httpware/errors.py +++ b/src/httpware/errors.py @@ -4,32 +4,20 @@ Unknown 4xx falls back to ClientStatusError; unknown 5xx to ServerStatusError. 
The fallback assumes 400 <= status < 600. -__repr__ and the summary message strip user:pass@ userinfo from -response.request.url to avoid leaking credentials in tracebacks. -Query-string secrets are NOT stripped here. +__repr__ and the summary message run response.request.url through +_internal.redaction.redact_url, which strips user:pass@ userinfo and masks the +values of known-sensitive query parameters. NOTE: the full request headers +(Authorization, Cookie, ...) remain reachable via exc.response.request — handler +authors must redact those before logging. """ import builtins from collections.abc import Mapping from typing import Any -from urllib.parse import urlsplit, urlunsplit import httpx2 - -def _strip_userinfo(url: str) -> str: - if "@" not in url or "://" not in url: - return url - parts = urlsplit(url) - if parts.username is None and parts.password is None: - return url - hostname = parts.hostname or "" - if ":" in hostname: # IPv6 literal — re-wrap in brackets - hostname = f"[{hostname}]" - netloc = hostname - if parts.port is not None: - netloc = f"{netloc}:{parts.port}" - return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment)) +from httpware._internal.redaction import redact_url class ClientError(Exception): @@ -76,13 +64,13 @@ def __init__(self, response: httpx2.Response) -> None: def _summary(self) -> str: method = self.response.request.method - url = _strip_userinfo(str(self.response.request.url)) + url = redact_url(str(self.response.request.url)) return f"{self.response.status_code} {method} {url}" def __repr__(self) -> str: cls_name = type(self).__name__ method = self.response.request.method - url = _strip_userinfo(str(self.response.request.url)) + url = redact_url(str(self.response.request.url)) return f"<{cls_name} status={self.response.status_code} method={method} url={url}>" def __reduce__(self) -> tuple[Any, ...]: @@ -323,3 +311,40 @@ def __init__(self, *, model: type, registered_names: tuple[str, ...]) -> None: 
def __reduce__(self) -> tuple[Any, ...]: return (_reconstruct_missing_decoder, (type(self), self.model, self.registered_names)) + + +def _reconstruct_response_too_large( + cls: "type[ResponseTooLargeError]", + status_code: int, + limit: int, + content_length: int | None, +) -> "ResponseTooLargeError": + return cls(status_code=status_code, limit=limit, content_length=content_length) + + +class ResponseTooLargeError(ClientError): + """Raised when an error response body exceeds the client's max_error_body_bytes cap. + + Fires from stream() on a 4xx/5xx whose declared Content-Length exceeds the + configured cap, BEFORE the body is read — so the oversized body is never + buffered. Only raised when max_error_body_bytes is set (opt-in). + """ + + status_code: int + limit: int + content_length: int | None + + def __init__(self, *, status_code: int, limit: int, content_length: int | None) -> None: + self.status_code = status_code + self.limit = limit + self.content_length = content_length + super().__init__( + f"error response body too large: status={status_code} " + f"content_length={content_length} exceeds max_error_body_bytes={limit}" + ) + + def __reduce__(self) -> tuple[Any, ...]: + return ( + _reconstruct_response_too_large, + (type(self), self.status_code, self.limit, self.content_length), + ) diff --git a/tests/test_client_stream.py b/tests/test_client_stream.py index 383d6e0..db71aa2 100644 --- a/tests/test_client_stream.py +++ b/tests/test_client_stream.py @@ -10,8 +10,10 @@ from httpware import ( AsyncClient, ClientStatusError, + InternalServerError, NetworkError, NotFoundError, + ResponseTooLargeError, ServerStatusError, ServiceUnavailableError, TransportError, @@ -19,6 +21,7 @@ from httpware import ( TimeoutError as HttpwareTimeoutError, ) +from httpware.client import _parse_content_length from httpware.middleware import AsyncMiddleware, AsyncNext @@ -335,3 +338,58 @@ def handler(request: httpx2.Request) -> httpx2.Response: ) as response: _ = [chunk async for 
chunk in response.aiter_bytes()]
     assert "multipart/form-data" in seen[0].headers["content-type"]
+
+
+async def test_stream_raises_response_too_large_when_over_cap() -> None:
+    body = b"x" * 200
+
+    def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
+        return httpx2.Response(500, content=body)
+
+    client = AsyncClient(
+        httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_error_body_bytes=10
+    )
+    with pytest.raises(ResponseTooLargeError) as caught:
+        async with client.stream("GET", "https://example.test/x"):
+            pytest.fail("unreachable")
+    assert caught.value.limit == 10  # noqa: PLR2004 — mirrors max_error_body_bytes above
+    assert caught.value.content_length == 200  # noqa: PLR2004 — len(body) above
+    await client.aclose()
+
+
+async def test_stream_reads_error_body_when_under_cap() -> None:
+    body = b"nope"
+
+    def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
+        return httpx2.Response(404, content=body)
+
+    client = AsyncClient(
+        httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)), max_error_body_bytes=1000
+    )
+    with pytest.raises(NotFoundError) as caught:
+        async with client.stream("GET", "https://example.test/x"):
+            pytest.fail("unreachable")
+    assert caught.value.response.content == body
+    await client.aclose()
+
+
+async def test_stream_unbounded_by_default_reads_large_error_body() -> None:
+    body = b"x" * 200
+
+    def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
+        return httpx2.Response(500, content=body)
+
+    client = AsyncClient(httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)))
+    with pytest.raises(InternalServerError) as caught:
+        async with client.stream("GET", "https://example.test/x"):
+            pytest.fail("unreachable")
+    assert caught.value.response.content == body
+    await client.aclose()
+
+
+@pytest.mark.parametrize(
+    ("raw", "expected"),
+    [(None, None), ("123", 123), ("abc", None), ("-5", None), ("0", 0)],
+)
+def test_parse_content_length(raw: str | None, expected: int | None) -> None:
+    assert _parse_content_length(raw) == expected
diff --git a/tests/test_client_stream_sync.py b/tests/test_client_stream_sync.py
index 85319a5..b9181dc 100644
--- a/tests/test_client_stream_sync.py
+++ b/tests/test_client_stream_sync.py
@@ -9,8 +9,10 @@
 from httpware import (
     Client,
     ClientStatusError,
+    InternalServerError,
     NetworkError,
     NotFoundError,
+    ResponseTooLargeError,
     ServerStatusError,
     ServiceUnavailableError,
     TransportError,
@@ -304,3 +306,43 @@ def handler(request: httpx2.Request) -> httpx2.Response:
     ) as response:
         _ = list(response.iter_bytes())
     assert "multipart/form-data" in seen[0].headers["content-type"]
+
+
+def test_stream_raises_response_too_large_when_over_cap_sync() -> None:
+    body = b"x" * 200
+
+    def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
+        return httpx2.Response(500, content=body)
+
+    client = Client(httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)), max_error_body_bytes=10)
+    with pytest.raises(ResponseTooLargeError) as caught, client.stream("GET", "https://example.test/x"):
+        pytest.fail("unreachable")
+    assert caught.value.limit == 10  # noqa: PLR2004 — mirrors max_error_body_bytes above
+    assert caught.value.content_length == 200  # noqa: PLR2004 — len(body) above
+    client.close()
+
+
+def test_stream_reads_error_body_when_under_cap_sync() -> None:
+    body = b"nope"
+
+    def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
+        return httpx2.Response(404, content=body)
+
+    client = Client(httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)), max_error_body_bytes=1000)
+    with pytest.raises(NotFoundError) as caught, client.stream("GET", "https://example.test/x"):
+        pytest.fail("unreachable")
+    assert caught.value.response.content == body
+    client.close()
+
+
+def test_stream_unbounded_by_default_reads_large_error_body_sync() -> None:
+    body = b"x" * 200
+
+    def handler(request: httpx2.Request) -> httpx2.Response:  # noqa: ARG001
+        return httpx2.Response(500, content=body)
+
+    client = Client(httpx2_client=httpx2.Client(transport=httpx2.MockTransport(handler)))
+    with pytest.raises(InternalServerError) as caught, client.stream("GET", "https://example.test/x"):
+        pytest.fail("unreachable")
+    assert caught.value.response.content == body
+    client.close()
diff --git a/tests/test_errors.py b/tests/test_errors.py
index 2e706d5..66e5c3b 100644
--- a/tests/test_errors.py
+++ b/tests/test_errors.py
@@ -23,6 +23,7 @@
     NetworkError,
     NotFoundError,
     RateLimitedError,
+    ResponseTooLargeError,
     RetryBudgetExhaustedError,
     ServerStatusError,
     ServiceUnavailableError,
@@ -394,3 +395,31 @@ def test_circuit_open_error_pickleable_with_none() -> None:
     restored = pickle.loads(pickle.dumps(exc))  # noqa: S301
     assert isinstance(restored, CircuitOpenError)
     assert restored.retry_after is None
+
+
+def test_status_error_message_masks_query_secret() -> None:
+    request = httpx2.Request("GET", "https://example.test/p?api_key=topsecret&page=2")
+    response = httpx2.Response(404, request=request)
+    exc = NotFoundError(response)
+    assert "topsecret" not in str(exc)
+    assert "api_key=REDACTED" in str(exc)
+    assert "page=2" in str(exc)
+    assert "topsecret" not in repr(exc)
+
+
+def test_response_too_large_error_fields_and_message() -> None:
+    exc = ResponseTooLargeError(status_code=500, limit=1024, content_length=2048)
+    assert exc.status_code == 500  # noqa: PLR2004 — literal mirrors construction above
+    assert exc.limit == 1024  # noqa: PLR2004 — literal mirrors construction above
+    assert exc.content_length == 2048  # noqa: PLR2004 — literal mirrors construction above
+    assert "1024" in str(exc)
+    assert "2048" in str(exc)
+
+
+def test_response_too_large_error_pickle_round_trip() -> None:
+    exc = ResponseTooLargeError(status_code=503, limit=10, content_length=None)
+    restored = pickle.loads(pickle.dumps(exc))  # noqa: S301 — round-tripping our own exception
+    assert isinstance(restored, ResponseTooLargeError)
+    assert restored.status_code == 503  # noqa: PLR2004 — literal mirrors construction above
+    assert restored.limit == 10  # noqa: PLR2004 — literal mirrors construction above
+    assert restored.content_length is None
diff --git a/tests/test_observability.py b/tests/test_observability.py
index c176ef7..e3c5bd5 100644
--- a/tests/test_observability.py
+++ b/tests/test_observability.py
@@ -32,6 +32,43 @@ def test_emit_event_logs_at_warning_with_extra_fields(caplog: pytest.LogCaptureF
     assert record.event == "test.event"  # ty: ignore[unresolved-attribute]
 
 
+def test_emit_event_redacts_url_secret_in_log_record(caplog: pytest.LogCaptureFixture) -> None:
+    """The `url` attribute is redacted at the emission boundary, before the log record fires."""
+    with caplog.at_level(logging.WARNING, logger="httpware.test.observability"):
+        _emit_event(
+            _TEST_LOGGER,
+            "test.event",
+            level=logging.WARNING,
+            message="leaky",
+            attributes={"url": "https://u:p@example.test/x?api_key=topsecret"},
+        )
+
+    record = caplog.records[0]
+    assert "topsecret" not in record.url  # ty: ignore[unresolved-attribute]
+    assert "api_key=REDACTED" in record.url  # ty: ignore[unresolved-attribute]
+    assert "u:p@" not in record.url  # ty: ignore[unresolved-attribute]
+
+
+def test_emit_event_redacts_url_secret_in_otel_event() -> None:
+    """The OTel span event receives the redacted `url`, not the raw secret."""
+    mock_span = MagicMock(name="MockSpan")
+    with (
+        patch("httpware._internal.import_checker.is_otel_installed", True),
+        patch("opentelemetry.trace.get_current_span", return_value=mock_span),
+    ):
+        _emit_event(
+            _TEST_LOGGER,
+            "test.event",
+            level=logging.WARNING,
+            message="leaky",
+            attributes={"url": "https://example.test/x?token=topsecret"},
+        )
+
+    _, kwargs = mock_span.add_event.call_args
+    assert "topsecret" not in kwargs["attributes"]["url"]
+    assert "token=REDACTED" in kwargs["attributes"]["url"]
+
+
 def test_emit_event_respects_level_parameter(caplog: pytest.LogCaptureFixture) -> None:
     """When level=DEBUG is passed, the record is at DEBUG."""
     with caplog.at_level(logging.DEBUG, logger="httpware.test.observability"):
diff --git a/tests/test_public_api.py b/tests/test_public_api.py
index 97a8b9a..c1b8f2e 100644
--- a/tests/test_public_api.py
+++ b/tests/test_public_api.py
@@ -54,6 +54,7 @@ def test_expected_exports() -> None:
         "NotFoundError",
         "RateLimitedError",
         "ResponseDecoder",
+        "ResponseTooLargeError",
         "Retry",
         "RetryBudget",
         "RetryBudgetExhaustedError",
diff --git a/tests/test_redaction.py b/tests/test_redaction.py
new file mode 100644
index 0000000..4d250c5
--- /dev/null
+++ b/tests/test_redaction.py
@@ -0,0 +1,60 @@
+"""Unit tests for the URL redaction helper."""
+
+import pytest
+
+from httpware._internal.redaction import redact_url
+
+
+@pytest.mark.parametrize(
+    ("url", "expected"),
+    [
+        # no-op cases (common-path guard: bytes unchanged)
+        ("https://example.test/path", "https://example.test/path"),
+        ("https://example.test/path?page=2&limit=10", "https://example.test/path?page=2&limit=10"),
+        ("not-a-url", "not-a-url"),
+        ("https://example.test", "https://example.test"),
+        # userinfo stripped
+        ("https://user:pass@example.test/p", "https://example.test/p"),
+        ("https://user:pass@example.test:8443/p", "https://example.test:8443/p"),
+        ("https://user:pass@[2001:db8::1]:8443/p", "https://[2001:db8::1]:8443/p"),
+        # sensitive query value masked, key + other params preserved
+        ("https://example.test/p?api_key=abc123", "https://example.test/p?api_key=REDACTED"),
+        ("https://example.test/p?page=2&access_token=xyz", "https://example.test/p?page=2&access_token=REDACTED"),
+        # case-insensitive key match
+        ("https://example.test/p?API_KEY=abc", "https://example.test/p?API_KEY=REDACTED"),
+        # userinfo AND query both handled
+        ("https://u:p@example.test/p?token=t", "https://example.test/p?token=REDACTED"),
+    ],
+)
+def test_redact_url(url: str, expected: str) -> None:
+    assert redact_url(url) == expected
+
+
+def test_redact_url_masks_repeated_sensitive_keys() -> None:
+    result = redact_url("https://example.test/p?token=a&token=b&page=1")
+    assert "token=a" not in result
+    assert "token=b" not in result
+    assert result.count("token=REDACTED") == 2  # noqa: PLR2004 — two token params above
+    assert "page=1" in result
+
+
+def test_redact_url_masks_blank_sensitive_value() -> None:
+    assert redact_url("https://example.test/p?secret=") == "https://example.test/p?secret=REDACTED"
+
+
+def test_redact_url_masks_fragment_secret() -> None:
+    result = redact_url("https://example.test/p?page=2#access_token=topsecret")
+    assert "topsecret" not in result
+    assert "access_token=REDACTED" in result
+    assert "page=2" in result
+
+
+def test_redact_url_preserves_benign_fragment() -> None:
+    assert redact_url("https://example.test/p#section-3") == "https://example.test/p#section-3"
+
+
+def test_redact_url_masks_whitespace_padded_key() -> None:
+    # a space-padded sensitive key name must still be masked
+    result = redact_url("https://example.test/p?%20api_key=topsecret")
+    assert "topsecret" not in result
+    assert "REDACTED" in result
diff --git a/tests/test_retry.py b/tests/test_retry.py
index 1cf4055..70f6790 100644
--- a/tests/test_retry.py
+++ b/tests/test_retry.py
@@ -674,6 +674,21 @@ async def test_deposit_fires_once_per_call_not_per_attempt() -> None:
     assert budget.deposit_calls == 1, f"expected 1 deposit per request, got {budget.deposit_calls}"
 
 
+async def test_retry_event_url_attribute_masks_query_secret(caplog: pytest.LogCaptureFixture) -> None:
+    """Resilience event `url` attributes must not leak query-string secrets."""
+    sleeper = _SleepRecorder()
+    handler = _ResponseSequence([HTTPStatus.SERVICE_UNAVAILABLE] * 3)
+    client = _client(handler, retry=AsyncRetry(_sleep=sleeper, max_attempts=3, base_delay=0.001, max_delay=0.002))
+
+    with caplog.at_level(logging.WARNING, logger="httpware.retry"), pytest.raises(ServiceUnavailableError):
+        await client.get("https://example.test/x?api_key=topsecret")
+
+    giving_up = [r for r in caplog.records if r.name == "httpware.retry" and r.message.startswith("retry gave up")]
+    assert len(giving_up) == 1
+    assert "topsecret" not in giving_up[0].url  # ty: ignore[unresolved-attribute]
+    assert "api_key=REDACTED" in giving_up[0].url  # ty: ignore[unresolved-attribute]
+
+
 async def test_method_ineligible_with_streaming_body_does_not_attach_streaming_note() -> None:
     """POST with a streaming body that gets a 503 raises ServiceUnavailableError WITHOUT the streaming-note.
 
diff --git a/tests/test_retry_sync.py b/tests/test_retry_sync.py
index a9f8550..fe0803f 100644
--- a/tests/test_retry_sync.py
+++ b/tests/test_retry_sync.py
@@ -548,6 +548,21 @@ def test_retry_after_equal_to_max_delay_still_retries() -> None:
     assert handler.calls == 2  # noqa: PLR2004 — initial attempt + 1 retry
 
 
+def test_retry_event_url_attribute_masks_query_secret_sync(caplog: pytest.LogCaptureFixture) -> None:
+    """Sync resilience event `url` attributes must not leak query-string secrets."""
+    sleeper = _SleepRecorder()
+    handler = _ResponseSequence([HTTPStatus.SERVICE_UNAVAILABLE] * 3)
+    client = _client(handler, retry=Retry(_sleep=sleeper, max_attempts=3, base_delay=0.001, max_delay=0.002))
+
+    with caplog.at_level(logging.WARNING, logger="httpware.retry"), pytest.raises(ServiceUnavailableError):
+        client.get("https://example.test/x?api_key=topsecret")
+
+    giving_up = [r for r in caplog.records if r.name == "httpware.retry" and r.message.startswith("retry gave up")]
+    assert len(giving_up) == 1
+    assert "topsecret" not in giving_up[0].url  # ty: ignore[unresolved-attribute]
+    assert "api_key=REDACTED" in giving_up[0].url  # ty: ignore[unresolved-attribute]
+
+
 def test_method_ineligible_with_streaming_body_does_not_attach_streaming_note() -> None:
     """POST with a streaming body that gets a 503 raises WITHOUT the streaming-note (sync).