Commit 1d2a0e5

Merge pull request #42 from modern-python/refactor/decoder-instance-cache
refactor(decoders): per-instance cache replaces module lru_cache
2 parents bed715f + 70f4136 commit 1d2a0e5

7 files changed

Lines changed: 866 additions & 65 deletions

planning/engineering.md

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ The 0.1.0 seams numbered 1 (Middleware↔Transport) and 4 (Transport↔httpx2) h
 - `decode(content: bytes, model: type[T]) -> T` — the decode itself. Any exception is wrapped by `Client.send` / `AsyncClient.send` (when `response_model=` is set) and `Client.send_with_response` / `AsyncClient.send_with_response` into `httpware.DecodeError` (a `ClientError` subclass carrying `response`, `model`, `original`). Decoder implementers do not need to raise `DecodeError` directly.
 - **Pre-flight check:** when `response_model=` is set and no decoder claims it, `send` / `send_with_response` raise `MissingDecoderError(model=..., registered_names=...)` BEFORE the HTTP call. Distinct from `DecodeError` (which means the decoder ran and the payload was malformed); distinct corrective actions (install an extra or pass `decoders=[...]`).
 - **Default list:** `decoders=None` resolves via `client.py:_build_default_decoders()` against installed extras — pydantic-first when both are present, either-only when only one is installed, empty tuple when neither. `AsyncClient()` / `Client()` never raise on missing extras; failure surfaces only at the first `response_model=` use site.
-- **Rule:** the decoder must operate on raw bytes in a single parse pass. Two-pass decoding (`json.loads` then `validate_python`) is rejected: a single bytes-in / typed-object-out pass avoids the redundant intermediate `dict` allocation and parses faster. The Pydantic adapter implements this as `TypeAdapter(model).validate_json(content)`, with the `TypeAdapter` itself memoized via `@functools.lru_cache(maxsize=1024)` on a module-level `_get_adapter(model)` factory; the msgspec adapter mirrors the pattern with a cached `msgspec.json.Decoder(model)`.
+- **Rule:** the decoder must operate on raw bytes in a single parse pass. Two-pass decoding (`json.loads` then `validate_python`) is rejected: a single bytes-in / typed-object-out pass avoids the redundant intermediate `dict` allocation and parses faster. The Pydantic adapter implements this as `TypeAdapter(model).validate_json(content)`, with the `TypeAdapter` cached per-instance on `PydanticDecoder._adapters: dict[type, TypeAdapter]` (populated lazily on first `_get_adapter()` call); the msgspec adapter mirrors the pattern with `MsgspecDecoder._msgspec_decoders: dict[type, msgspec.json.Decoder]`. Cache lifetime matches the decoder/client, not the process — no module-level state, no autouse cache-clear fixtures in tests.

 ### Seam C: `httpware ↔ optional extras`
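The lifetime difference stated in the **Rule** bullet can be illustrated with a minimal, stdlib-only sketch. Everything named here (`build`, `module_cached`, `InstanceCached`, `demo`) is a hypothetical stand-in and not part of httpware; `build` only plays the role of an expensive constructor such as `TypeAdapter(model)`.

```python
import functools

BUILD_LOG: list[str] = []  # every call to the stand-in constructor is recorded here


def build(model: type, owner: str) -> str:
    """Hypothetical stand-in for an expensive constructor such as TypeAdapter(model)."""
    BUILD_LOG.append(f"{owner}:{model.__name__}")
    return f"adapter<{model.__name__}>"


@functools.lru_cache(maxsize=1024)
def module_cached(model: type) -> str:
    # Process-wide: the entry survives until cache_clear() or interpreter exit.
    return build(model, "module")


class InstanceCached:
    """Owns its cache, so the cache is garbage-collected with the instance."""

    def __init__(self) -> None:
        self._cache: dict[type, str] = {}

    def get(self, model: type) -> str:
        cached = self._cache.get(model)
        if cached is None:
            cached = build(model, "instance")
            self._cache[model] = cached
        return cached


def demo() -> list[str]:
    BUILD_LOG.clear()
    module_cached.cache_clear()  # the kind of reset an autouse fixture had to perform
    module_cached(int)
    module_cached(int)  # cache hit: no second build
    first, second = InstanceCached(), InstanceCached()
    first.get(int)
    first.get(int)  # cache hit on `first`
    second.get(int)  # `second` starts empty, so it builds its own entry
    return list(BUILD_LOG)
```

In this sketch a fresh `InstanceCached()` always starts with an empty cache, which is the property that lets tests drop their cache-clearing fixtures. The cost is the extra build performed by `second`.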

planning/plans/2026-06-10-decoder-instance-cache-plan.md

Lines changed: 514 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 294 additions & 0 deletions
@@ -0,0 +1,294 @@
+# Spec: decoder per-instance cache — drop module-level `@lru_cache`
+
+**Date:** 2026-06-10
+**Topic slug:** `decoder-instance-cache`
+**Status:** drafted, awaiting user review
+**Target release:** folded into `0.9.0` (not yet tagged) — public API unchanged; internal refactor only.
+
+## Purpose
+
+Both built-in decoders cache per-model construction via module-level `@functools.lru_cache(maxsize=1024)`:
+
+- `httpware.decoders.pydantic._get_adapter(model) -> TypeAdapter`
+- `httpware.decoders.msgspec._get_msgspec_decoder(model) -> msgspec.json.Decoder`
+
+The lifecycle of these caches is *process-wide*, while the natural owner of the cache — the decoder instance — has a much narrower lifetime (one per `AsyncClient` / `Client`). The mismatch shows up in three places:
+
+1. **Test fixture overhead.** Two autouse `cache_clear()` fixtures live in `tests/test_decoders_pydantic.py` and `tests/test_decoders_msgspec.py` to prevent cross-test pollution. The msgspec one was just added in PR #41 (Task 1 review-loop fix) to mirror the pydantic one.
+2. **Hidden global state.** `functools.lru_cache` internals are opaque; debugging "why is this adapter sticking around?" is harder than `decoder._adapters` would be.
+3. **`maxsize=1024` is a safety net for a problem that doesn't exist here.** Adapter counts are bounded by the number of `response_model=` types a client decodes, which is bounded by the application's surface area. Per-instance dicts grow with the decoder's lifetime and die with it — no bound needed.
+
+This spec replaces both caches with per-instance `dict[type, ...]` attributes on each decoder. No public API change. Hot-path performance is preserved for the common case (one client per process). It regresses slightly in the rare case where several clients decode the same models, because each decoder instance builds its own adapter for a model once instead of sharing one process-wide.
+
+## Non-goals
+
+- **Cross-decoder cache sharing.** Out of scope. An opt-in `cache=` kwarg was considered and rejected (YAGNI — no documented use case).
+- **Changing the unhashable-model fallback.** `decode()` keeps the existing `try/except TypeError → uncached TypeAdapter(model)` pattern; dicts raise TypeError on unhashable keys, same as `lru_cache`.
+- **`MissingDecoderError`, `_dispatch_decoder`, default-decoder resolution, the `can_decode` contract.** All unchanged.
+- **The `msgspec.inspect.type_info` + `CustomType` filter in `MsgspecDecoder.can_decode`.** Stays exactly as-is — the principled deviation documented in `multi_decoder_routing_shipped` memory is orthogonal to cache mechanics.
+
+## Architecture
+
+### `PydanticDecoder` (`src/httpware/decoders/pydantic.py`)
+
+Replace the module-level `_get_adapter` and the `@functools.lru_cache` decorator with a per-instance dict:
+
+```python
+"""PydanticDecoder — ResponseDecoder backed by per-instance TypeAdapter cache.
+
+Requires the `pydantic` extra: `pip install httpware[pydantic]`. Constructing
+`PydanticDecoder()` directly when pydantic is not installed raises ImportError.
+The default-decoder path in `client.py:_build_default_decoders()` skips this
+class entirely when `is_pydantic_installed` is False, so `AsyncClient()` does
+not trip the ImportError when the user is not using `response_model=`.
+"""
+
+import typing
+from typing import TypeVar
+
+from pydantic import TypeAdapter
+
+from httpware._internal import import_checker
+
+
+MISSING_DEPENDENCY_MESSAGE = (
+    "PydanticDecoder requires the 'pydantic' extra. Install with: pip install httpware[pydantic]"
+)
+
+T = TypeVar("T")
+
+
+class PydanticDecoder:
+    """Decode raw response bytes into `model` via a per-instance cached `pydantic.TypeAdapter`."""
+
+    _adapters: dict[type, TypeAdapter[typing.Any]]
+
+    def __init__(self) -> None:
+        if not import_checker.is_pydantic_installed:
+            raise ImportError(MISSING_DEPENDENCY_MESSAGE)
+        self._adapters = {}
+
+    def _get_adapter(self, model: type[T]) -> TypeAdapter[T]:
+        adapter = self._adapters.get(model)
+        if adapter is None:
+            adapter = TypeAdapter(model)
+            self._adapters[model] = adapter
+        return adapter
+
+    def can_decode(self, model: type) -> bool:
+        """True iff pydantic can build a schema for `model`.
+
+        Probes via `_get_adapter`; subsequent calls (including `decode`) reuse
+        the cached `TypeAdapter`. Rejects `msgspec.Struct` subclasses —
+        pydantic raises `PydanticSchemaGenerationError` (a `TypeError`) when
+        building a schema for them.
+        """
+        try:
+            self._get_adapter(model)
+        except Exception:  # noqa: BLE001 — can_decode is a probe; any failure means no
+            return False
+        return True
+
+    def decode(self, content: bytes, model: type[T]) -> T:
+        """Validate `content` as JSON against `model` in a single parse pass."""
+        try:
+            adapter = self._get_adapter(model)
+        except TypeError:
+            adapter = TypeAdapter(model)
+        return adapter.validate_json(content)
+```
+
+**Removals:**
+- `import functools`
+- Module-level `_get_adapter` function and its `@functools.lru_cache(maxsize=1024)` decorator.
+
+**Type annotation:** `_adapters: dict[type, TypeAdapter[typing.Any]]` at class level (mirrors how the client class annotates `_decoders`). The `TypeAdapter[typing.Any]` is necessary because the dict stores adapters for many different `T` types; the per-method `T` narrowing happens through the `_get_adapter` signature.
+
+### `MsgspecDecoder` (`src/httpware/decoders/msgspec.py`)
+
+Same shape:
+
+```python
+"""MsgspecDecoder — opt-in ResponseDecoder backed by a per-instance msgspec.json.Decoder cache."""
+
+import typing
+from typing import TypeVar
+
+from httpware._internal import import_checker
+
+
+if import_checker.is_msgspec_installed:
+    import msgspec
+
+
+MISSING_DEPENDENCY_MESSAGE = "MsgspecDecoder requires the 'msgspec' extra. Install with: pip install httpware[msgspec]"
+
+T = TypeVar("T")
+
+
+class MsgspecDecoder:
+    """Decode raw response bytes via a per-instance cached `msgspec.json.Decoder(model)`.
+
+    Requires the `msgspec` extra: `pip install httpware[msgspec]`. Importing
+    this module without the extra works (the `msgspec` import is guarded by a
+    `find_spec` check), but instantiating the decoder raises `ImportError`.
+    """
+
+    _msgspec_decoders: dict[type, "msgspec.json.Decoder[typing.Any]"]
+
+    def __init__(self) -> None:
+        if not import_checker.is_msgspec_installed:
+            raise ImportError(MISSING_DEPENDENCY_MESSAGE)
+        self._msgspec_decoders = {}
+
+    def _get_msgspec_decoder(self, model: type[T]) -> "msgspec.json.Decoder[T]":
+        decoder = self._msgspec_decoders.get(model)
+        if decoder is None:
+            decoder = msgspec.json.Decoder(model)
+            self._msgspec_decoders[model] = decoder
+        return decoder
+
+    def can_decode(self, model: type) -> bool:
+        """True iff msgspec natively understands `model`.
+
+        msgspec builds a Decoder for almost any class via a generic CustomType
+        fallback; the Decoder constructor itself does NOT raise on unsupported
+        types (e.g. pydantic.BaseModel). We use msgspec.inspect.type_info
+        to detect the fallback and reject CustomType results explicitly.
+        """
+        try:
+            info = msgspec.inspect.type_info(model)
+        except Exception:  # noqa: BLE001 — can_decode is a probe; any failure means no
+            return False
+        if isinstance(info, msgspec.inspect.CustomType):
+            return False
+        try:
+            self._get_msgspec_decoder(model)
+        except Exception:  # noqa: BLE001 — can_decode is a probe; any failure means no
+            return False
+        return True
+
+    def decode(self, content: bytes, model: type[T]) -> T:
+        """Validate `content` as JSON against `model` in a single parse pass."""
+        try:
+            decoder = self._get_msgspec_decoder(model)
+        except TypeError:
+            decoder = msgspec.json.Decoder(model)
+        return decoder.decode(content)
+```
+
+**Removals:**
+- `import functools`
+- Module-level `_get_msgspec_decoder` function and its `@functools.lru_cache(maxsize=1024)` decorator.
+
+**Attribute name** is `_msgspec_decoders` (not `_decoders`) to avoid visual collision with `AsyncClient._decoders` / `Client._decoders` (which is the decoder *list*, not the per-model cache). Two attributes with the same name doing different things in adjacent files is a recipe for misreading.
+
+### `PydanticDecoder.decode` and `MsgspecDecoder.decode` semantics
+
+Unchanged. The `try/except TypeError` fallback to an uncached construction still covers:
+
+- Unhashable `model` (e.g., `Annotated[int, some_unhashable_metadata]`) — `dict.get(model)` raises `TypeError` for unhashable keys, same as `lru_cache.__call__`.
+- Any failure inside the underlying constructor that the user wants to surface via `pydantic.ValidationError` / `msgspec.DecodeError` at the actual decode site, not as a `TypeError` during cache lookup.
+
+## Tests
+
+The 100% coverage gate is in force throughout (`pyproject.toml:93`, `--cov-fail-under=100`).
+
+### Files touched
+
+- `tests/test_decoders_pydantic.py`
+- `tests/test_decoders_msgspec.py`
+
+No new test files. No deleted test files (the existing cache-invariance suite adapts mechanically).
+
+### Removals
+
+In `tests/test_decoders_pydantic.py`:
+- The autouse fixture `_clear_adapter_cache` (currently at lines 30-33). Gone — each test that needs a fresh cache constructs a fresh `PydanticDecoder()`, which has its own `_adapters` dict.
+- The `from httpware.decoders.pydantic import _get_adapter` import. Replaced with `PydanticDecoder` only.
+
+In `tests/test_decoders_msgspec.py`:
+- The autouse fixture `_clear_msgspec_cache` (currently after the model definitions).
+- The `_get_msgspec_decoder` import. Replaced with `MsgspecDecoder` only.
+
+### Migrations
+
+For each existing cache-invariance test, the pattern shifts from "patch the module-level factory and assert spy count" to "construct a decoder, drive it, inspect `decoder._adapters` length OR patch `TypeAdapter` itself and count spy calls."
+
+Concrete example. The old test:
+
+```python
+def test_cache_invariance_single_model() -> None:
+    _get_adapter.cache_clear()
+    with patch("httpware.decoders.pydantic.TypeAdapter", wraps=pydantic.TypeAdapter) as spy:
+        decoder = PydanticDecoder()
+        for _ in range(1000):
+            decoder.decode(b'{"id": 1, "name": "Ada"}', User)
+    assert spy.call_count == 1
+```
+
+The new test (identical body — the spy on `pydantic.TypeAdapter` is on the underlying constructor, not on the deleted module-level factory; the decoder instance is fresh so the cache starts empty):
+
+```python
+def test_cache_invariance_single_model() -> None:
+    with patch("httpware.decoders.pydantic.TypeAdapter", wraps=pydantic.TypeAdapter) as spy:
+        decoder = PydanticDecoder()
+        for _ in range(1000):
+            decoder.decode(b'{"id": 1, "name": "Ada"}', User)
+    assert spy.call_count == 1
+```
+
+Drop the `_get_adapter.cache_clear()` line; the per-instance dict starts empty. Everything else is identical. Same for `test_cache_invariance_two_distinct_models`, `test_cache_invariance_concurrent_first_calls`, `test_cache_invariance_concurrent_first_calls_threadpool`.
+
+The `test_unhashable_model_falls_back_to_uncached_adapter` test changes shape slightly. It currently patches `httpware.decoders.pydantic._get_adapter` to raise `TypeError`. After this spec, `_get_adapter` is a method of `PydanticDecoder`, so the test patches it on a constructed instance via `patch.object(decoder, "_get_adapter", ...)`:
+
+```python
+def test_unhashable_model_falls_back_to_uncached_adapter() -> None:
+    decoder = PydanticDecoder()
+    with patch.object(decoder, "_get_adapter", side_effect=TypeError("unhashable type")):
+        result = decoder.decode(b"42", int)
+        assert result == 42
+
+        with pytest.raises(pydantic.ValidationError):
+            decoder.decode(b'"not-an-int"', int)
+```
+
+(Construct one decoder, patch its method, drive it twice.)
+
+The `test_pydantic_can_decode_uses_cache` test (added in PR #41 Task 1) currently asserts `_get_adapter.cache_info().hits >= 1`. After this spec, the assertion becomes "the same TypeAdapter instance is returned both times" OR "the decoder's `_adapters` dict has exactly one entry after two probes":
+
+```python
+def test_pydantic_can_decode_uses_cache() -> None:
+    decoder = PydanticDecoder()
+    decoder.can_decode(User)
+    decoder.can_decode(User)
+    assert len(decoder._adapters) == 1
+    assert User in decoder._adapters
+```
+
+Same for `test_msgspec_can_decode_uses_cache`.
+
+### Net test count
+
+No new tests, no deleted tests. The cache-invariance test count stays the same; the autouse fixtures are removed (-2 lines × 2 files).
+
+## Net diff estimate
+
+- `src/httpware/decoders/pydantic.py`: ~-10 / +12 LOC.
+- `src/httpware/decoders/msgspec.py`: ~-12 / +14 LOC.
+- `tests/test_decoders_pydantic.py`: ~-6 / +3 LOC.
+- `tests/test_decoders_msgspec.py`: ~-6 / +3 LOC.
+
+Total: ~50 LOC churn, no public API surface change, no behavior change for end users.
+
+## Release impact
+
+Folded into `0.9.0` (not yet tagged; multi-decoder routing PR #41 is the headline of that release, but the tag hasn't been cut). This spec is internal refactor only — no release notes line; no `!` commit subject; commit message `refactor(decoders): per-instance cache replaces module-level lru_cache`.
+
+If 0.9.0 has already been tagged by the time this lands, retag the patch as `0.9.1` and surface "per-instance decoder cache" as a brief internal-cleanup note. No user-facing migration.
+
+## Memory updates after merge
+
+The [[msgspec_basemodel_customtype_quirk]] memory's code reference (`src/httpware/decoders/msgspec.py:can_decode`) stays valid — the `type_info` + `CustomType` filter survives this refactor verbatim. No update needed there.
+
+The [[multi_decoder_routing_shipped]] memory mentions "cached `msgspec.json.Decoder(model)`" in passing under headline changes — should be amended to "per-instance cached" if a future reader cares; minor.
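The two properties the migrated tests rely on, one construction per hashable model and an uncached fallback for unhashable ones, can be sketched without pydantic or msgspec installed. This is an illustration under stated assumptions, not the repository's test code: `CachingDecoder` and `make_parser` are hypothetical stand-ins, and a plain list plays the part of an unhashable model.

```python
from typing import Any, Callable
from unittest.mock import Mock

Parser = Callable[[bytes], Any]


def make_parser(model: Any) -> Parser:
    """Hypothetical stand-in for TypeAdapter(model) / msgspec.json.Decoder(model)."""
    return lambda content: int(content)


class CachingDecoder:
    """Illustrative only: mirrors the cache-then-fallback shape described in the spec."""

    def __init__(self, factory: Callable[[Any], Parser]) -> None:
        self._factory = factory
        self._cache: dict[Any, Parser] = {}

    def _get(self, model: Any) -> Parser:
        parser = self._cache.get(model)  # TypeError here if `model` is unhashable
        if parser is None:
            parser = self._factory(model)
            self._cache[model] = parser
        return parser

    def decode(self, content: bytes, model: Any) -> Any:
        try:
            parser = self._get(model)
        except TypeError:
            parser = self._factory(model)  # uncached fallback, rebuilt on every call
        return parser(content)


spy = Mock(wraps=make_parser)  # counts constructions while delegating to make_parser
decoder = CachingDecoder(spy)

for _ in range(1000):
    decoder.decode(b"42", int)
hashable_builds = spy.call_count  # the cache absorbed 999 of the 1000 calls

unhashable_model = [int]  # a list is unhashable, standing in for an unhashable annotation
results = [decoder.decode(b"7", unhashable_model) for _ in range(2)]
total_builds = spy.call_count
```

Because the spy wraps the constructor rather than the cache lookup, the same assertion style works whether the cache lives in a module-level `lru_cache` or in an instance dict, which is why the migrated test bodies barely change.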

src/httpware/decoders/msgspec.py

Lines changed: 19 additions & 16 deletions
@@ -1,6 +1,6 @@
-"""MsgspecDecoder — opt-in ResponseDecoder backed by a cached msgspec.json.Decoder."""
+"""MsgspecDecoder — opt-in ResponseDecoder backed by a per-instance msgspec.json.Decoder cache."""

-import functools
+import typing
 from typing import TypeVar

 from httpware._internal import import_checker
@@ -15,32 +15,35 @@
 T = TypeVar("T")


-@functools.lru_cache(maxsize=1024)
-def _get_msgspec_decoder(model: type[T]) -> "msgspec.json.Decoder[T]":
-    return msgspec.json.Decoder(model)
-
-
 class MsgspecDecoder:
-    """Decode raw response bytes via a cached `msgspec.json.Decoder(model)`.
+    """Decode raw response bytes via a per-instance cached `msgspec.json.Decoder(model)`.

     Requires the `msgspec` extra: `pip install httpware[msgspec]`. Importing
     this module without the extra works (the `msgspec` import is guarded by a
     `find_spec` check), but instantiating the decoder raises `ImportError`.
     """

+    _msgspec_decoders: dict[type, "msgspec.json.Decoder[typing.Any]"]
+
     def __init__(self) -> None:
         if not import_checker.is_msgspec_installed:
             raise ImportError(MISSING_DEPENDENCY_MESSAGE)
+        self._msgspec_decoders = {}
+
+    def _get_msgspec_decoder(self, model: type[T]) -> "msgspec.json.Decoder[T]":
+        decoder = self._msgspec_decoders.get(model)
+        if decoder is None:
+            decoder = msgspec.json.Decoder(model)
+            self._msgspec_decoders[model] = decoder
+        return decoder

     def can_decode(self, model: type) -> bool:
         """Return True iff msgspec natively understands `model`.

-        Cached via `_get_msgspec_decoder`; subsequent calls reuse the same
-        Decoder instance. Rejects `pydantic.BaseModel` subclasses — msgspec
-        will *build* a Decoder for them (falling back to a generic
-        `CustomType`) but cannot actually decode them without a `dec_hook`,
-        so we use `msgspec.inspect.type_info` to detect the fallback and
-        refuse to claim the model.
+        msgspec builds a Decoder for almost any class via a generic CustomType
+        fallback; the Decoder constructor itself does NOT raise on unsupported
+        types (e.g. pydantic.BaseModel). We use msgspec.inspect.type_info
+        to detect the fallback and reject CustomType results explicitly.
         """
         try:
             info = msgspec.inspect.type_info(model)
@@ -49,15 +52,15 @@ def can_decode(self, model: type) -> bool:
         if isinstance(info, msgspec.inspect.CustomType):
             return False
         try:
-            _get_msgspec_decoder(model)
+            self._get_msgspec_decoder(model)
         except Exception:  # noqa: BLE001 — can_decode is a probe; any failure means no
             return False
         return True

     def decode(self, content: bytes, model: type[T]) -> T:
         """Validate `content` as JSON against `model` in a single parse pass."""
         try:
-            decoder = _get_msgspec_decoder(model)
+            decoder = self._get_msgspec_decoder(model)
         except TypeError:
             decoder = msgspec.json.Decoder(model)
         return decoder.decode(content)
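The diff annotates `_msgspec_decoders` on the class but assigns the dict inside `__init__`. A minimal sketch of why the assignment location matters follows; `SharedByMistake` and `PerInstance` are hypothetical classes written for this illustration, with strings standing in for real decoder objects.

```python
class SharedByMistake:
    """Anti-pattern: the dict literal is evaluated once, at class creation."""

    cache: dict[type, str] = {}


class PerInstance:
    """Shape used in the diff: annotate on the class, assign in __init__."""

    cache: dict[type, str]  # annotation only, no class attribute is created

    def __init__(self) -> None:
        self.cache = {}


a, b = SharedByMistake(), SharedByMistake()
a.cache[int] = "decoder<int>"
leaked = int in b.cache  # True: both instances see the same dict

c, d = PerInstance(), PerInstance()
c.cache[int] = "decoder<int>"
isolated = int not in d.cache  # True: each instance owns its dict
```

A class-level `= {}` default would quietly reintroduce process-wide state, so the bare annotation plus the `__init__` assignment is what gives each decoder its own cache.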
