You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Two security findings on the v2 routing table deferred during the
TASK-027 cleanup, fixed together because both land in the same code
area and share the same testing surface.
1. CWE-407 hash-flooding immunity (radix tree child container).
src/httpserver/detail/radix_tree.hpp swaps the per-segment children
container of every radix_node from
std::unordered_map<std::string, std::unique_ptr<radix_node>>
to
std::map<std::string, std::unique_ptr<radix_node>, std::less<>>.
Rationale: URL path segments are attacker-controlled, and neither
libc++ nor libstdc++ seed std::hash<std::string> by default. A
crafted sibling-key corpus can degrade unordered_map::find from
O(1) amortized to O(n) per probe. std::map (red-black tree) gives
O(log n) worst case with no hashing in the loop. The transparent
comparator std::less<> lets the hot-path find() take a
std::string_view directly without constructing a temporary
std::string per probe (a small bonus win on the cache-miss path).
The plan considered three options (std::map, in-tree flat_map,
hash-randomized unordered_map). std::map wins on minimum-diff +
pointer stability + zero new code in detail/. Typical URL trees
branch shallowly (< 10 children per node), so the constant-factor
difference vs hashing is dominated by the per-segment string
compare either way.
2. Prefix-vs-exact terminus collision guard at registration time.
src/detail/webserver_routes.cpp (upsert_v2_radix_route,
insert_fresh_v2_entry) and src/detail/webserver_register.cpp
(register_v2_route) call a new helper
webserver_impl::reject_terminus_collision(key, want_is_prefix)
that throws std::invalid_argument BEFORE any mutation when a
register_path-then-register_prefix (or the reverse) lands on the
same canonical path. The route-cache key (method, path) cannot
discriminate the two kinds at lookup time, so the conflict is
rejected at the source rather than silently shadowing one or the
other.
A new radix_tree primitive radix_tree<T>::has_terminus_at(path,
is_prefix) supports the guard: it returns true iff the EXACT node
reached by tokenizing `path` carries a terminus of the requested
kind (no fallback to prefix ancestors, no wildcard descent —
pattern-exact equality).
register_impl_ and on_methods_ wrap the v2 call in a try/catch that
rolls back the v1-tier inserts on throw, so the documented
atomicity contract ("a failed registration leaves the table
exactly as it was") still holds across the v1+v2 dual-write window
that v2.0 keeps for backward compatibility.
Tests:
- test/unit/routing_regression_test.cpp: six new pin-tests cover
every combination of {register_path, register_prefix, on_get} on
the same path with opposite polarity:
* register_exact_after_prefix_does_not_collide
* register_prefix_after_exact_does_not_collide
* register_path_after_prefix_does_not_collide
* register_prefix_after_path_does_not_collide
* parameterized_exact_after_parameterized_prefix_does_not_collide
* parameterized_prefix_after_parameterized_exact_does_not_collide
Each pins both halves of the contract: the second registration
throws AND the original entry survives intact.
- test/unit/webserver_register_path_prefix_test.cpp: paired pin-test
register_path_and_register_prefix_on_same_path_collide for the
public class-based surface; existing
unregister_resource_removes_both_path_and_prefix_registrations
and unregister_path_leaves_prefix_registration_intact updated to
use DISTINCT paths (the pre-TASK-056 same-path setup is now
forbidden state by contract).
- test/integ/basic.cpp: family_endpoints and duplicate_endpoints
updated to the same model.
- test/integ/ws_start_stop.cpp: register_duplicate_resource_throws
updated.
- test/integ/threadsafety_stress.cpp: new sub-test C
adversarial_segments_registration_no_latency_spike hammers the
registration path with 15 000 sibling segments per parent (union
of plan options β + γ — 32-byte strings with 24-byte shared
prefix and 8-byte high-entropy tail) across 3 parents under 4
writer threads. Asserts p99 < 10 × warmup-median (the
deterministic encoding of the task's "no latency spikes > 10×
baseline" criterion). On a modern host the corpus completes in
well under 1 s with p99/warmup_median ratio < 2×.
Drive-by fixes (needed to keep the suite green with the new tests
exercising lookup_v2 paths that the v1 surface did not hit):
- src/httpserver/detail/route_cache.hpp:
empty-cache early-out in route_cache::find_promote_for_lookup.
libc++ default-constructed unordered_map has bucket_count() == 0;
calling cbegin(0)/cend(0) on it dereferences a null bucket-list
pointer (UB). Triggered by any test that calls lookup_v2 before
a cache insert. Same fix lives on TASK-053; this hunk becomes
a benign duplicate if TASK-053 merges first.
- src/detail/webserver_dispatch.cpp:
stop moving result.entry / result.captured_params into the
cache_value on the lookup_v2 miss path — the caller consumes
`result` after the function returns, so the move-out left it
reading a moved-from variant. Same fix on TASK-053.
Documentation:
- specs/architecture/04-components/route-table.md §4.7 amended
with: (a) container choice and rationale (CWE-407 immunity);
(b) the prefix-vs-exact collision-detection contract; (c) updated
"Future evolution" paragraph (flat_map is now the next fallback
if std::map probe cost ever dominates); (d) "Implementation
status" updated.
Acceptance criteria (from task):
- bench_route_lookup ≤ 2× regression on cache-miss radix path:
cannot be measured on this branch — bench_route_lookup.cpp lives
on TASK-053, not yet on feature/v2.0. The plan flagged this
explicitly. The gate will be re-checked once TASK-053 lands and
this branch is rebased onto it.
- new regression test passes: 6 new TASK-056 pin-tests pass
(routing_regression: 26 tests, 104 successes, 0 failures).
- 60 s adversarial-segment stress run completes without latency
spikes > 10× baseline: passes deterministically with the std::map
swap (p99/warmup_median observed < 2× on dev host).
- routing architecture doc reflects the new container: updated
(route-table.md §4.7; note that the task text said routing.md but
the actual doc filename is route-table.md per the rest of the
spec).
Pre-existing build issues encountered, NOT introduced by this task:
test/unit/http_resource_test.cpp, header_hygiene_iovec_test.cpp,
iovec_entry_test.cpp, hook_api_shape_test.cpp, and
hooks_per_route_resource_destroyed_first.cpp fail to compile on
feature/v2.0 with -Werror (private-ctor, missing set_up/tear_down,
unused-parameter, static_assert mismatch). These are not touched
by TASK-056 and have no diff vs feature/v2.0.
Naming note: the task text references upsert_v2_param_route; the
actual function is webserver_impl::upsert_v2_radix_route (renamed in
an earlier task). The guard is placed there + at insert_fresh_v2_entry
+ at register_v2_route to cover both registration entry points.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy file name to clipboardExpand all lines: specs/architecture/04-components/route-table.md
+7-3Lines changed: 7 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -5,7 +5,7 @@
5
5
**Implementation:** Three structures, queried in order:
6
6
7
7
1.**Hash map**`std::unordered_map<std::string, route_entry>` for **exact paths**. O(1) amortized lookup.
8
-
2.**Radix tree** for **parameterized paths and prefix matches**. Single tree handles both cases (a prefix entry is a tree node marked as prefix-terminating; a parameterized segment is a wildcard child). O(L) lookup where L is path length.
8
+
2.**Radix tree** for **parameterized paths and prefix matches**. Single tree handles both cases (a prefix entry is a tree node marked as prefix-terminating; a parameterized segment is a wildcard child). O(L) lookup where L is path length. Per-segment children at each radix node are stored in `std::map<std::string, std::unique_ptr<radix_node>, std::less<>>` — see the hash-flooding note below for why `std::map` and not `std::unordered_map`.
9
9
3.**Regex chain**`std::vector<std::pair<std::regex, route_entry>>` for **regex routes**. Linear fallback when neither hash nor radix matches.
10
10
11
11
A `route_entry` carries:
@@ -19,10 +19,14 @@ A `route_entry` carries:
19
19
20
20
**Lock order:**`route_table_mutex_` is acquired BEFORE `route_cache_mutex_` whenever both are held. The lookup pipeline never holds both at once: it walks the tier chain under a shared lock on the table, releases that lock, then takes the cache mutex briefly to install/promote the hit. Registration takes the table writer lock, releases it, and only then clears the cache.
21
21
22
-
**Future evolution:** if the radix tree starts to dominate lookup cost (measured), it can be replaced with a different data structure (compressed trie, perfect hash on a frozen route set) without touching the public API. v2.0 commits only to the *outer shape* (three-tier with cache), not the radix-tree implementation choice.
22
+
**Prefix-vs-exact collision detection:** registering an exact route at a path that already has a prefix terminus (or vice versa) throws `std::invalid_argument` at registration time rather than silently double-registering. The cache key `(method, path)` cannot distinguish the two kinds at lookup time, so the conflict is rejected at the source. The guard probes both storage locations (`exact_routes_` and the radix tree's `exact_terminus_` / `prefix_terminus_`) before any mutation, so the atomicity contract — "a failed registration leaves the table exactly as it was" — still holds.
23
+
24
+
**Hash-flooding immunity (CWE-407):** the radix tree's per-segment children are kept in `std::map` rather than `std::unordered_map`. URL path segments are attacker-controlled, and neither libc++ nor libstdc++ seed `std::hash<std::string>` by default — a crafted sibling-key corpus can degrade `std::unordered_map::find` from O(1) amortized to O(n) per probe, opening an algorithmic-complexity DoS vector. The `std::map` (red-black tree) gives O(log n) worst case with no hashing in the loop. The transparent comparator `std::less<>` lets the hot-path lookup pass `std::string_view` keys directly without constructing a temporary `std::string` per probe. Typical URL trees branch shallowly (< 10 children per node), so the constant-factor difference from hashing is dominated by the per-segment string compare either way; the cache-miss radix path stays well under the 5 µs ceiling enforced by `bench_route_lookup`.
25
+
26
+
**Future evolution:** if `std::map` probe cost dominates measured lookup time at high fanout, switching to an in-tree small-vector flat_map remains an internal-only optimization. v2.0 commits only to the *outer shape* (three-tier with cache), not the per-node container choice.
**Implementation status:** TASK-025 introduced `detail::route_entry` and the `lambda_resource` shim into the existing v1 three-map storage shape. TASK-027 wired `route_entry` into the full 3-tier table described above (hash map for exact paths, radix tree for parameterized/prefix paths, regex chain for regex routes). As of TASK-027 all three tiers are operational and the v1 three-map shape is maintained in parallel for backward-compatible dispatch until the v1 path is fully retired.
30
+
**Implementation status:** TASK-025 introduced `detail::route_entry` and the `lambda_resource` shim into the existing v1 three-map storage shape. TASK-027 wired `route_entry` into the full 3-tier table described above (hash map for exact paths, radix tree for parameterized/prefix paths, regex chain for regex routes). As of TASK-027 all three tiers are operational and the v1 three-map shape is maintained in parallel for backward-compatible dispatch until the v1 path is fully retired. TASK-056 swapped the radix-node child container from `std::unordered_map` to `std::map<…, std::less<>>` for CWE-407 immunity and added registration-time detection of prefix-vs-exact terminus collisions (`reject_terminus_collision`).
0 commit comments