Commit b7cc53b

lesnik512 and claude authored
docs(decoders): add a "write your own ResponseDecoder" guide (#67)
* docs(decoders): add a "write your own ResponseDecoder" guide

  Add docs/decoders.md, the Seam B extension-point guide that mirrors docs/middleware.md: the can_decode/decode protocol (no-raise obligation, auto-DecodeError wrapping), decoders=[...] list-order resolution and the MissingDecoderError pre-flight, a sync-for-both-clients callout, a runnable CSV worked example, and the "claim narrowly / two-pass is allowed" note.

  Closes deferred item G6 (custom-decoder guide) and wires the page into the mkdocs nav after Middleware. No source or public-API change.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(planning): archive the custom-decoder-guide bundle (#67)

  Ship bookkeeping for PR #67: mark the change shipped (pr: 67, outcome filled), move the bundle from changes/active/ to changes/archive/, and flip its Index line from Active to Archived.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 9297ced commit b7cc53b

5 files changed

Lines changed: 235 additions & 2 deletions

docs/decoders.md

Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
1+
# Writing a custom decoder
2+
3+
`httpware`'s typed-response extension point is the **`ResponseDecoder` protocol**. A decoder turns raw response bytes into a typed object: when you pass `response_model=` to `send` / `send_with_response`, the client walks its decoder list, picks the first one that claims your model, and hands it the body.
4+
5+
The built-in `PydanticDecoder` and `MsgspecDecoder` are themselves implementations of this protocol; nothing about them is privileged. Reach for a custom decoder when you need a body **format** the built-ins don't speak (CSV, XML, MessagePack, a bespoke binary frame) or a **type system** they don't cover (`attrs`, `marshmallow`, your own class hierarchy). If pydantic or msgspec already decodes your model, you don't need one — see [When NOT to write a decoder](#when-not-to-write-a-decoder).
6+
7+
## The protocol
8+
9+
One symbol, exported from `httpware`:
10+
11+
```python
from typing import Protocol, TypeVar, runtime_checkable

T = TypeVar("T")


@runtime_checkable
class ResponseDecoder(Protocol):
    def can_decode(self, model: type) -> bool: ...
    def decode(self, content: bytes, model: type[T]) -> T: ...
```
22+
23+
Two methods, two distinct jobs:
24+
25+
- **`can_decode(model) -> bool`** — the dispatch predicate. The client walks `decoders=[...]` in order and picks the **first** decoder that returns `True`. Claim every model you can actually handle (broad is correct — list ordering, not narrow predicates, encodes the caller's preference), but **reject another library's native types**: a CSV decoder has no business claiming a `pydantic.BaseModel`. `can_decode` **MUST NOT raise** — it runs at dispatch time, before the HTTP call and *outside* the `DecodeError` wrap that protects `decode`, so an exception here escapes `httpware`'s `ClientError` contract instead of being translated. A decoder that can't decide must return `False` (decline), not raise.
26+
- **`decode(content, model) -> T`** — the decode itself, raw response bytes in, a `model` instance out. Any exception you raise here is caught by the client and wrapped as `httpware.DecodeError` (carrying `response`, `model`, and the `original` exception). You do **not** need to raise `DecodeError` yourself — raise whatever your parser raises and let the seam translate it.
27+
28+
The protocol is `@runtime_checkable` and structural: any object with these two methods satisfies it. You do not subclass anything.
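The structural check can be seen with a plain `isinstance` test. The sketch below redeclares the protocol locally so it runs without `httpware` installed; `UpperTextDecoder` and `NotADecoder` are hypothetical names used only for illustration.

```python
from typing import Protocol, TypeVar, runtime_checkable

T = TypeVar("T")


@runtime_checkable
class ResponseDecoder(Protocol):
    # Local copy of the protocol shown above, so the example is self-contained.
    def can_decode(self, model: type) -> bool: ...
    def decode(self, content: bytes, model: type[T]) -> T: ...


class UpperTextDecoder:
    """Hypothetical decoder: no base class, just the two methods."""

    def can_decode(self, model: type) -> bool:
        return model is str

    def decode(self, content: bytes, model: type[T]) -> T:
        return model(content.decode("utf-8").upper())


class NotADecoder:
    """Has can_decode but no decode, so it does not satisfy the protocol."""

    def can_decode(self, model: type) -> bool:
        return True


is_decoder = isinstance(UpperTextDecoder(), ResponseDecoder)
is_not_decoder = isinstance(NotADecoder(), ResponseDecoder)
decoded = UpperTextDecoder().decode(b"ok", str)
```

Keep in mind that a runtime `isinstance` check against a `@runtime_checkable` protocol only verifies that the methods exist. It does not check their signatures or return types, so a static type checker is still the better guard against a mistyped `decode`.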
29+
30+
## How the client resolves a model
31+
32+
Both clients take `decoders: Sequence[ResponseDecoder] | None = None`, composed once at `__init__` and frozen for the client's lifetime.
33+
34+
- **Order is preference.** `decoders=[CsvDecoder(), PydanticDecoder()]` asks the CSV decoder first; pydantic only sees models CSV declined. List position is how you disambiguate a shape two decoders could both claim.
35+
- **`decoders=None`** resolves against installed extras: pydantic first when both are present, whichever one is installed when only one is, and an empty tuple when neither is. To *add* a decoder without losing the built-ins, list them explicitly: `decoders=[CsvDecoder(), PydanticDecoder()]`.
36+
- **No claimer is a pre-flight error.** When `response_model=` is set and no decoder claims it, the client raises `MissingDecoderError` **before** sending the request — you find out at wiring time, not after a wasted round-trip. This is distinct from `DecodeError`: `MissingDecoderError` means *nothing handles this model* (fix: install an extra or pass `decoders=[...]`); `DecodeError` means *a decoder ran and the payload was malformed* (fix: the server or the model). See [Errors](errors.md).
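The resolution rules above can be modelled in a few lines. This is an illustrative sketch, not `httpware`'s implementation: `resolve_decoder` and `decode_with` are hypothetical helpers, the two exception classes are local stand-ins for the real ones (the stand-in `DecodeError` omits the `response` attribute), and `IntDecoder` / `AnythingDecoder` exist only to show ordering.

```python
from collections.abc import Sequence
from typing import Any


class MissingDecoderError(Exception):
    """Local stand-in for httpware's error of the same name."""


class DecodeError(Exception):
    """Local stand-in that keeps the model and the original exception."""

    def __init__(self, model: type, original: Exception) -> None:
        super().__init__(f"could not decode {model!r}: {original!r}")
        self.model = model
        self.original = original


def resolve_decoder(decoders: Sequence[Any], model: type) -> Any:
    """Return the first decoder that claims `model` (hypothetical helper)."""
    for decoder in decoders:
        if decoder.can_decode(model):
            return decoder
    # Nothing claimed the model: fail before any request would be sent.
    raise MissingDecoderError(f"no decoder claims {model!r}")


def decode_with(decoder: Any, content: bytes, model: type) -> Any:
    """Run decode and translate any failure into DecodeError (hypothetical helper)."""
    try:
        return decoder.decode(content, model)
    except Exception as exc:
        raise DecodeError(model, exc) from exc


class IntDecoder:
    def can_decode(self, model: type) -> bool:
        return model is int

    def decode(self, content: bytes, model: type) -> Any:
        return int(content)


class AnythingDecoder:
    def can_decode(self, model: type) -> bool:
        return True

    def decode(self, content: bytes, model: type) -> Any:
        return content.decode("utf-8")


decoders = [IntDecoder(), AnythingDecoder()]
first = resolve_decoder(decoders, int)   # IntDecoder is asked first and claims int
second = resolve_decoder(decoders, str)  # IntDecoder declines, AnythingDecoder claims
```

Swapping the list order would let `AnythingDecoder` claim `int` as well, which is why list position, rather than a narrower predicate, expresses the caller's preference.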
37+
38+
## Decoders are sync — for both clients
39+
40+
Unlike middleware, which has separate `AsyncMiddleware` and `Middleware` flavors, there is **one** `ResponseDecoder` protocol, shared by `AsyncClient` and `Client` alike. `decode` is a synchronous method: by the time it runs, the body has already been read off the wire, so decoding is pure CPU work with nothing to await. Write one decoder and pass it to either client.
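To illustrate why one sync method is enough, here is a minimal sketch with a hypothetical `LengthDecoder` called from a sync function and from a coroutine. The `asyncio.sleep(0)` call merely stands in for awaiting the network read and is not part of any `httpware` API.

```python
import asyncio


class LengthDecoder:
    """Hypothetical decoder: turns a body into its byte length."""

    def can_decode(self, model: type) -> bool:
        return model is int

    def decode(self, content: bytes, model: type) -> int:
        return len(content)


def sync_caller(decoder: LengthDecoder, body: bytes) -> int:
    # Sync path: the body is already in memory, so decode is a plain call.
    return decoder.decode(body, int)


async def async_caller(decoder: LengthDecoder, body: bytes) -> int:
    await asyncio.sleep(0)  # stands in for awaiting the network read
    # Async path: still a plain call, because nothing is left to await.
    return decoder.decode(body, int)


decoder = LengthDecoder()
from_sync = sync_caller(decoder, b"hello")
from_async = asyncio.run(async_caller(decoder, b"hello"))
```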
41+
42+
## Worked example: a CSV decoder
43+
44+
A decoder for `text/csv` endpoints that returns a `list` of dataclass rows. Both built-ins are JSON, so this is the case they can't cover — and it shows the seam's real shape: raw bytes in, typed object out, no JSON anywhere.
45+
46+
```python
import csv
import dataclasses
import io
import typing

from httpware import AsyncClient
from httpware.decoders.pydantic import PydanticDecoder

T = typing.TypeVar("T")


class CsvDecoder:
    """Decode a text/csv body into a list of dataclass rows.

    Claims only `list[<dataclass>]`; declines everything else so the JSON
    decoders keep their models.
    """

    def can_decode(self, model: type) -> bool:
        # Bare `list` and non-list models have no `list` origin: decline.
        if typing.get_origin(model) is not list:
            return False
        args = typing.get_args(model)
        return len(args) == 1 and dataclasses.is_dataclass(args[0])

    def decode(self, content: bytes, model: type[T]) -> T:
        (row_type,) = typing.get_args(model)
        # Map each field name to its annotated type, used to coerce CSV strings.
        field_types = {f.name: f.type for f in dataclasses.fields(row_type)}
        reader = csv.DictReader(io.StringIO(content.decode("utf-8")))
        return [
            row_type(**{name: field_types[name](value) for name, value in row.items()})
            for row in reader
        ]
```
80+
81+
`can_decode` is total and never raises: a non-`list` model, a bare `list`, or `list[int]` all fall through to `False`. `decode` coerces each CSV cell with its field's type (CSV values arrive as strings) — a real decoder would handle optionals, dates, and missing columns; this is where your domain logic goes. Wire it ahead of the built-ins so it gets first refusal on `list[...]` models while pydantic still handles everything else:
82+
83+
```python
@dataclasses.dataclass
class Sale:
    id: int
    amount: float
    region: str


async def main() -> None:
    async with AsyncClient(
        base_url="https://reports.example.com",
        decoders=[CsvDecoder(), PydanticDecoder()],
    ) as client:
        sales = await client.send(
            client.build_request("GET", "/sales.csv"),
            response_model=list[Sale],
        )
        # sales: list[Sale]
```
102+
103+
The same decoder instance works with a sync `Client(decoders=[CsvDecoder(), PydanticDecoder()])`.
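Because a decoder is just an object with two methods, it can be exercised without a client or a network call. The sketch below repeats the `CsvDecoder` logic in compact form so it runs on its own; the sample bytes are made up for illustration.

```python
import csv
import dataclasses
import io
import typing


class CsvDecoder:
    # Same logic as the CsvDecoder above, repeated so this snippet runs alone.
    def can_decode(self, model):
        if typing.get_origin(model) is not list:
            return False
        args = typing.get_args(model)
        return len(args) == 1 and dataclasses.is_dataclass(args[0])

    def decode(self, content, model):
        (row_type,) = typing.get_args(model)
        field_types = {f.name: f.type for f in dataclasses.fields(row_type)}
        reader = csv.DictReader(io.StringIO(content.decode("utf-8")))
        return [
            row_type(**{name: field_types[name](value) for name, value in row.items()})
            for row in reader
        ]


@dataclasses.dataclass
class Sale:
    id: int
    amount: float
    region: str


decoder = CsvDecoder()
body = b"id,amount,region\n1,9.5,EMEA\n2,12.0,APAC\n"

# Only list[<dataclass>] is claimed; the other three models are declined.
claims = [decoder.can_decode(m) for m in (list[Sale], list[int], list, Sale)]
sales = decoder.decode(body, list[Sale])

# A cell that cannot be coerced raises the parser's own exception.
try:
    decoder.decode(b"id,amount,region\nx,1.0,EMEA\n", list[Sale])
except ValueError as exc:
    failure = exc
```

The `ValueError` raised for the bad `id` cell is the kind of exception that, per the protocol section, the client catches and wraps as `DecodeError`; the decoder itself does not need to translate it.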
104+
105+
## A note on claiming the right models
106+
107+
`can_decode` is a contract with the *rest of the list*. Claim too broadly and you steal models from decoders behind you; claim too narrowly and your decoder never runs. The rule of thumb: claim exactly the types you natively own, and reject another library's. An adapter for a third-party type system narrows its claim to that system — for example, a [`cattrs`](https://catt.rs)-backed decoder for `attrs` classes:
108+
109+
```python
import json

import attrs


class CattrsDecoder:
    def __init__(self, converter):  # a configured cattrs.Converter
        self._converter = converter

    def can_decode(self, model: type) -> bool:
        return attrs.has(model)  # only attrs classes; everything else declines

    def decode(self, content, model):
        return self._converter.structure(json.loads(content), model)
```
125+
126+
Note this decoder is **two-pass** (`json.loads`, then `structure`). The built-in adapters deliberately decode in a single bytes-in pass (`TypeAdapter.validate_json`, `msgspec.json.Decoder.decode`) to skip the intermediate `dict` allocation — but that's a *performance choice for the built-ins*, not a protocol obligation. A custom decoder may go two-pass when its underlying library only structures from native Python objects; you pay one extra allocation, nothing more.
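The two-pass shape can also be sketched with the standard library alone. The example below assumes a hypothetical in-house convention in which model classes expose a `from_dict` classmethod; `FromDictDecoder` and `Money` are illustrative names, not part of `httpware`.

```python
import json
from typing import Any


class FromDictDecoder:
    """Hypothetical two-pass decoder for classes exposing a `from_dict` classmethod."""

    def can_decode(self, model: type) -> bool:
        # getattr with a default does not raise when the attribute is missing.
        return callable(getattr(model, "from_dict", None))

    def decode(self, content: bytes, model: type) -> Any:
        data = json.loads(content)    # pass 1: bytes -> plain Python objects
        return model.from_dict(data)  # pass 2: plain objects -> typed instance


class Money:
    def __init__(self, amount: int, currency: str) -> None:
        self.amount = amount
        self.currency = currency

    @classmethod
    def from_dict(cls, data: dict) -> "Money":
        return cls(amount=int(data["amount"]), currency=str(data["currency"]))


decoder = FromDictDecoder()
price = decoder.decode(b'{"amount": 5, "currency": "EUR"}', Money)
```

The claim stays narrow: only classes that opt in by defining `from_dict` are accepted, so pydantic models, msgspec structs, and plain built-in types fall through to the decoders listed after it.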
127+
128+
## When NOT to write a decoder
129+
130+
- **Your model is JSON.** Dataclasses, `TypedDict`s, primitives, pydantic models, and msgspec `Struct`s are all covered by the built-in `PydanticDecoder` / `MsgspecDecoder`. Install the extra (`httpware[pydantic]` or `httpware[msgspec]`) instead of writing a decoder.
131+
- **You only want raw bytes or text.** Don't pass `response_model=` at all — call `send` (or a verb method) without it and read `response.content` / `response.text` directly. Decoders are for *typed* bodies.
132+
- **The transform is per-call, not per-type.** If the shaping depends on the request rather than the model, it's a [middleware](middleware.md) concern, not a decoder.
133+
134+
## See also
135+
136+
- **[`architecture/decoders.md`](https://github.com/modern-python/httpware/blob/main/architecture/decoders.md) (Seam B)** — the formal protocol contract: dispatch order, the `can_decode` no-raise obligation, the single-pass rule, and the per-instance adapter cache.
137+
- **`src/httpware/decoders/pydantic.py` and `msgspec.py`** — the built-in adapters as reference implementations, including how they memoize a `can_decode` verdict and cache the underlying parser per model.
138+
- **[Quick-Start: typed responses](index.md)** — composing `response_model=` with the default decoder list.

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ edit_uri: edit/main/docs/
 nav:
   - Quick-Start: index.md
   - Middleware: middleware.md
+  - Decoders: decoders.md
   - Resilience: resilience.md
   - Errors: errors.md
   - Testing: testing.md

planning/README.md

Lines changed: 1 addition & 0 deletions
@@ -74,6 +74,7 @@ _None._
 
 ### Archived (shipped)
 
+- **[custom-decoder-guide](changes/archive/2026-06-15.01-custom-decoder-guide/change.md)** (#67, 2026-06-15) — Docs: a "write your own `ResponseDecoder`" guide for Seam B, mirroring `docs/middleware.md`. Closed deferred item G6.
 - **[audit-doc-fixes](changes/archive/2026-06-14.06-audit-doc-fixes/change.md)** (#66, 2026-06-14) — Closed the [deep-audit](audits/2026-06-14-deep-audit.md) doc-accuracy findings: `Client.stream()` docs, terminal-call attribution, the four auto-raise sites, the pydantic upper bound, and root import paths.
 - **[audit-test-quality](changes/archive/2026-06-14.05-audit-test-quality/change.md)** (#65, 2026-06-14) — Closed 11 [deep-audit](audits/2026-06-14-deep-audit.md) test-quality findings: sync-terminal + CookieConflict coverage, the `StatusError.__init__` invariant, missing status constructions, sync mirrors, typing overloads, a deterministic bulkhead barrier, a pinned budget clock, an observability assertion, and the `TimeoutError` circuit trigger.
 - **[audit-correctness](changes/archive/2026-06-14.04-audit-correctness/change.md)** (#64, 2026-06-14) — Closed 8 [deep-audit](audits/2026-06-14-deep-audit.md) correctness + public-API findings: RetryBudget token ordering, two `OverflowError` crashes, the redaction triple-slash, the msgspec guard, streaming-body symmetry, the RetryBudget docstring caveat, and `middleware/__all__`.

planning/changes/archive/2026-06-15.01-custom-decoder-guide/change.md

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
1+
---
2+
status: shipped
3+
date: 2026-06-15
4+
slug: custom-decoder-guide
5+
supersedes: null
6+
superseded_by: null
7+
pr: 67
8+
outcome: Shipped docs/decoders.md (the Seam B "write your own ResponseDecoder" guide); closed deferred item G6.
9+
---
10+
11+
# Change: Add a "Writing a custom decoder" guide
12+
13+
**Lane:** lightweight — docs-only. New page + one-line nav edit + an
14+
`architecture/decoders.md` cross-link on ship. No source change, no public-API
15+
change. Mirrors the existing `docs/middleware.md` extension-seam guide.
16+
17+
Closes deferred item **G6** (custom-`ResponseDecoder` guide), the
18+
[2026-06-13 docs audit](../../../audits/2026-06-13-docs-audit.md) finding parked
19+
in [`deferred.md`](../../../deferred.md). Revisit trigger now met: the guide was
20+
explicitly requested.
21+
22+
## Goal
23+
24+
Seam B (`ResponseDecoder`) is a documented extension point, but unlike
25+
middleware it has no "write your own" guide. Add `docs/decoders.md` showing the
26+
`can_decode` / `decode` protocol, how `decoders=[...]` ordering resolves a
27+
model, and a worked custom-decoder example. Prose carries the signatures — no
28+
mkdocstrings / auto API reference (per the `2026-06-14.01` docs-UX decision).
29+
30+
## Approach
31+
32+
A prose + code-block page modeled on `docs/middleware.md`, the sibling
33+
"write your own" guide for Seam A. Sections, scaled to complexity:
34+
35+
1. **Intro / when to write one** — Seam B in a paragraph; reach for a custom
36+
decoder when you need a body *format* (non-JSON) or a *type system* the
37+
pydantic/msgspec built-ins don't cover.
38+
2. **The protocol** — the `ResponseDecoder` Protocol verbatim from
39+
`src/httpware/decoders/__init__.py`; `can_decode(model) -> bool`
40+
(first-match dispatch, claim broadly but reject other libraries' native
41+
types, **MUST NOT raise** — runs outside the `DecodeError` wrap, decline by
42+
returning False) and `decode(content: bytes, model) -> T` (raw bytes in;
43+
any exception is auto-wrapped as `DecodeError`, so don't raise it yourself).
44+
3. **How the client resolves a model** — `decoders=[...]` order = preference,
45+
first claimer wins; `decoders=None` default is pydantic-first;
46+
`MissingDecoderError` fires *before* the HTTP call when nothing claims; the
47+
`MissingDecoderError` (no decoder) vs `DecodeError` (decoder ran, payload
48+
bad) distinction and their distinct corrective actions.
49+
4. **Sync, not async** (callout) — one sync protocol shared by `Client` *and*
50+
`AsyncClient`; there is no async `decode`, in contrast to middleware's two
51+
flavors. `decode` runs synchronously after the body is read.
52+
5. **Worked example: a CSV decoder** — `text/csv` bytes → `list[<dataclass>]`.
53+
Chosen because both built-ins are JSON, so the highest-value lesson is that
54+
the seam is raw-bytes-in / typed-object-out and **not** JSON-bound. Stdlib
55+
`csv` only (a reader runs it with zero extra installs), naturally
56+
single-pass. `can_decode` claims `list[<dataclass>]` and rejects everything
57+
else; wired as `decoders=[CsvDecoder(), PydanticDecoder()]`.
58+
6. **A note on claiming the right models** — the `can_decode` discrimination
59+
obligation (claiming too broadly steals models from later decoders in the
60+
list); how an adapter for another type system (e.g. cattrs/attrs) narrows
61+
its claim to its own types; and that the single-pass rule is a *built-in
62+
performance choice*, not a hard protocol obligation — a custom decoder may
63+
go two-pass (`json.loads` → structure) at the cost of one extra allocation.
64+
7. **When NOT to write a decoder** — the built-ins already cover
65+
pydantic/msgspec/dataclasses/primitives; if you only want raw bytes or text,
66+
use `response.content` / `response.text` without `response_model=`.
67+
8. **See also** — `architecture/decoders.md` (Seam B, the formal contract),
68+
the built-in adapters (`decoders/pydantic.py`, `decoders/msgspec.py`) as
69+
reference implementations, the Quick-Start typed-response example.
70+
71+
Truth home: [`architecture/decoders.md`](../../../../architecture/decoders.md)
72+
— Seam B's contract does not move; on ship, add a cross-link from it to the new
73+
guide.
74+
75+
## Files
76+
77+
- `docs/decoders.md` — new guide (the work).
78+
- `mkdocs.yml` — add `- Decoders: decoders.md` after the Middleware nav entry.
79+
- `planning/deferred.md` — remove the G6 entry (closed).
80+
81+
No `architecture/decoders.md` cross-link: `architecture/middleware.md` does not
82+
link to its `docs/middleware.md` guide either, so adding one only for decoders
83+
would break that symmetry. Seam B's contract is unchanged, so `architecture/`
84+
needs no promotion edit.
85+
86+
## Verification
87+
88+
- [ ] Every code block in the guide is runnable as written — the CSV
89+
`can_decode` predicate and `decode` body type-check and execute against
90+
the real `ResponseDecoder` protocol (manually exercised, not a doctest).
91+
- [ ] `uv run mkdocs build --strict` — clean (no broken internal links, nav
92+
resolves).
93+
- [ ] `just lint` — clean (eof-fixer / formatting on the new markdown).
94+
- [ ] Cross-references resolve: links to `architecture/decoders.md`,
95+
`middleware.md`, and `index.md` are valid.

planning/deferred.md

Lines changed: 0 additions & 2 deletions
@@ -36,6 +36,4 @@ As of 0.7.0, all planned epics (3, 4, 5, 6) are closed — see the [change Index
 
 ### Documentation
 
-- **Custom-`ResponseDecoder` guide** (audit finding G6, [2026-06-13 docs audit](audits/2026-06-13-docs-audit.md)) — the decoder seam (Seam B) is a documented extension point, but unlike middleware it has no "write your own" guide. A short page would show the `can_decode(model: type) -> bool` / `decode(content: bytes, model: type[T]) -> T` protocol, how `decoders=[...]` ordering resolves a model, and a worked third-party-adapter example. Decided alongside the `2026-06-14.01` docs-UX restructure: **defer the guide, and ship no auto API reference / mkdocstrings** (prose carries the signatures). Demand-gated. Revisit trigger: someone asks how to write a custom decoder, a third-party decoder adapter ships, or the `decoders/` protocol surface changes. (`docs/`, `src/httpware/decoders/`)
-
 - **Non-streaming hard response-body cap** (2026-06-14 deep audit, Medium) — for a non-streaming `send()`, httpx2 buffers the whole body before httpware reaches the decode seam, so a true cap needs a streaming-with-capped-accumulator rework of the Seam-A terminal. The current `max_error_body_bytes` guard only applies at `stream()` entry and only when `Content-Length` is declared. Revisit trigger: the Seam-A terminal is next reworked, or a concrete large-response abuse is reported. (`src/httpware/client.py`)
