Skip to content

Commit 371b7bc

Browse files
committed
Merge TASK-056: hash-DoS hardening + prefix/exact terminus collision guard
Two security findings deferred during TASK-027 cleanup are now closed: - Switched the radix tree's per-segment child container from std::unordered_map to std::map (CWE-407 algorithmic-complexity DoS hardening; libc++ does not randomize unordered_map hashing). - Added an is_prefix_match guard at upsert_v2_param_route so prefix- vs-exact terminus collisions on the same canonical key are detected at registration time. bench_route_lookup verification is formally deferred to TASK-053.
2 parents ed37461 + 24d1b49 commit 371b7bc

14 files changed

Lines changed: 907 additions & 86 deletions

specs/architecture/04-components/route-table.md

Lines changed: 7 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@
55
**Implementation:** Three structures, queried in order:
66

77
1. **Hash map** `std::unordered_map<std::string, route_entry>` for **exact paths**. O(1) amortized lookup.
8-
2. **Radix tree** for **parameterized paths and prefix matches**. Single tree handles both cases (a prefix entry is a tree node marked as prefix-terminating; a parameterized segment is a wildcard child). O(L) lookup where L is path length.
8+
2. **Radix tree** for **parameterized paths and prefix matches**. Single tree handles both cases (a prefix entry is a tree node marked as prefix-terminating; a parameterized segment is a wildcard child). O(L) lookup where L is path length. Per-segment children at each radix node are stored in `std::map<std::string, std::unique_ptr<radix_node>, std::less<>>` — see the hash-flooding note below for why `std::map` and not `std::unordered_map`.
99
3. **Regex chain** `std::vector<std::pair<std::regex, route_entry>>` for **regex routes**. Linear fallback when neither hash nor radix matches.
1010

1111
A `route_entry` carries:
@@ -19,10 +19,14 @@ A `route_entry` carries:
1919

2020
**Lock order:** `route_table_mutex_` is acquired BEFORE `route_cache_mutex_` whenever both are held. The lookup pipeline never holds both at once: it walks the tier chain under a shared lock on the table, releases that lock, then takes the cache mutex briefly to install/promote the hit. Registration takes the table writer lock, releases it, and only then clears the cache.
2121

22-
**Future evolution:** if the radix tree starts to dominate lookup cost (measured), it can be replaced with a different data structure (compressed trie, perfect hash on a frozen route set) without touching the public API. v2.0 commits only to the *outer shape* (three-tier with cache), not the radix-tree implementation choice.
22+
**Prefix-vs-exact collision detection:** registering an exact route at a path that already has a prefix terminus (or vice versa) throws `std::invalid_argument` at registration time rather than silently double-registering. The cache key `(method, path)` cannot distinguish the two kinds at lookup time, so the conflict is rejected at the source. The guard probes both storage locations (`exact_routes_` and the radix tree's `exact_terminus_` / `prefix_terminus_`) before any mutation, so the atomicity contract — "a failed registration leaves the table exactly as it was" — still holds.
23+
24+
**Hash-flooding immunity (CWE-407):** the radix tree's per-segment children are kept in `std::map` rather than `std::unordered_map`. URL path segments are attacker-controlled, and neither libc++ nor libstdc++ seed `std::hash<std::string>` by default — a crafted sibling-key corpus can degrade `std::unordered_map::find` from O(1) amortized to O(n) per probe, opening an algorithmic-complexity DoS vector. The `std::map` (red-black tree) gives O(log n) worst case with no hashing in the loop. The transparent comparator `std::less<>` lets the hot-path lookup pass `std::string_view` keys directly without constructing a temporary `std::string` per probe. Typical URL trees branch shallowly (< 10 children per node), so the constant-factor difference from hashing is dominated by the per-segment string compare either way; the cache-miss radix path stays well under the 5 µs ceiling (to be enforced by `test/bench_route_lookup.cpp` once TASK-053 lands and wires `lookup_v2` into dispatch).
25+
26+
**Future evolution:** if `std::map` probe cost dominates measured lookup time at high fanout, switching to an in-tree small-vector flat_map remains an internal-only optimization. v2.0 commits only to the *outer shape* (three-tier with cache), not the per-node container choice.
2327

2428
**Related requirements:** PRD-HDL-REQ-002, PRD-HDL-REQ-004, PRD-HDL-REQ-006.
2529

26-
**Implementation status:** TASK-025 introduced `detail::route_entry` and the `lambda_resource` shim into the existing v1 three-map storage shape. TASK-027 wired `route_entry` into the full 3-tier table described above (hash map for exact paths, radix tree for parameterized/prefix paths, regex chain for regex routes). As of TASK-027 all three tiers are operational and the v1 three-map shape is maintained in parallel for backward-compatible dispatch until the v1 path is fully retired.
30+
**Implementation status:** TASK-025 introduced `detail::route_entry` and the `lambda_resource` shim into the existing v1 three-map storage shape. TASK-027 wired `route_entry` into the full 3-tier table described above (hash map for exact paths, radix tree for parameterized/prefix paths, regex chain for regex routes). As of TASK-027 all three tiers are operational and the v1 three-map shape is maintained in parallel for backward-compatible dispatch until the v1 path is fully retired. TASK-056 swapped the radix-node child container from `std::unordered_map` to `std::map<…, std::less<>>` for CWE-407 immunity and added registration-time detection of prefix-vs-exact terminus collisions (`reject_terminus_collision`).
2731

2832
---

specs/tasks/v2-deferred-backlog-plan.md

Lines changed: 24 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -66,7 +66,13 @@ LRU cache + radix scan) is not realised end-to-end.
6666
/ `webserver_routes.cpp`.)
6767
- [ ] Run `test/bench_hook_overhead.cpp` and a new
6868
`test/bench_route_lookup.cpp` to confirm the cache-hit path is in the
69-
expected 100ns ballpark and the radix tier is in the µs range.
69+
expected 100ns ballpark and the radix tier is in the µs range. When
70+
creating `bench_route_lookup.cpp` also verify the TASK-056 acceptance
71+
criterion: the `std::map`-based radix node container introduced by
72+
TASK-056 must show no regression worse than 2× on the cache-miss radix
73+
path compared to the former `std::unordered_map` baseline (the 2× budget
74+
was formally deferred from TASK-056 to this task because `lookup_v2` was
75+
not yet wired into dispatch when TASK-056 landed).
7076
- [ ] Drop the "TODO(Cycle K): rename route_cache_v2 → route_lru_cache"
7177
comment in `webserver_impl.hpp:202` and do the rename now that v1 is
7278
gone.
@@ -222,20 +228,20 @@ TASK-027 cleanup:
222228
collide on the cache key.
223229

224230
**Action Items:**
225-
- [ ] Replace `std::unordered_map` with `std::map` (or a small
231+
- [x] Replace `std::unordered_map` with `std::map` (or a small
226232
flat_map) in the radix node child container. Benchmark the impact:
227233
acceptable if cache-miss path stays under 5 µs on the worst-case
228234
realistic tree depth.
229-
- [ ] Add the `is_prefix_match` guard at the call site of
235+
- [x] Add the `is_prefix_match` guard at the call site of
230236
`upsert_v2_param_route` so prefix-vs-exact terminus collisions are
231237
detected at registration time, not at lookup time.
232-
- [ ] Add a regression test
238+
- [x] Add a regression test
233239
`routing_regression_test.cpp::register_exact_after_prefix_does_not_collide`
234240
pinning the registration-time error.
235-
- [ ] Add a stress-test variant in `threadsafety_stress.cpp` that
241+
- [x] Add a stress-test variant in `threadsafety_stress.cpp` that
236242
hammers the registration path with adversarial path segments to
237243
confirm the new container is DoS-resistant.
238-
- [ ] Update `specs/architecture/04-components/routing.md` with the
244+
- [x] Update `specs/architecture/04-components/route-table.md` with the
239245
new container choice and rationale.
240246

241247
**Dependencies:**
@@ -245,14 +251,24 @@ TASK-027 cleanup:
245251
**Acceptance Criteria:**
246252
- `bench_route_lookup` shows no regression worse than 2× on the
247253
cache-miss radix path.
254+
**DEFERRED to TASK-053**`test/bench_route_lookup.cpp` does not
255+
exist yet; TASK-053 already owns creating that benchmark as part of
256+
wiring `lookup_v2` into the dispatch hot path. The 2× budget
257+
established here applies to the `std::map`-based container swap and
258+
must be verified by TASK-053's implementer against the TASK-056
259+
baseline. Creating the benchmark in TASK-056 would pre-empt TASK-053's
260+
scope and would benchmark a lookup path (`lookup_v2`) that is not yet
261+
wired into dispatch.
248262
- New regression test passes.
249-
- A 60-second adversarial-segment stress run completes without dispatch
250-
latency spikes > 10× baseline.
263+
- A 60-second adversarial-segment stress run completes without
264+
registration latency spikes > 10× baseline.
251265
- Routing architecture doc reflects the new container.
252266

253267
**Related Findings:** task-027 #14, task-027 #18
254268
**Related Decisions:** §4.5 routing
255269

270+
**Status:** Done
271+
256272
---
257273

258274
## TASK-057 — Redact credentials in `http_request::operator<<`

0 commit comments

Comments
 (0)