Skip to content

Commit 36416af

Browse files
committed
Merge TASK-058: hot-path allocation pass
Three perf cleanups deferred from the manual-validation sweep: step 1 — canonicalize_lookup_path returns string_view (zero-alloc fast path; scratch buffer for non-canonical inputs). step 2 — auth_skip_paths normalized at webserver construction, plus empty-list early-out (~620 ns -> ~3 ns per request in the production-typical case). Bonus: non-canonical skip-list entries now match canonical request paths. step 3 — lazy Allow-header cache on http_resource, invalidated implicitly via mask snapshot; std::shared_mutex so the warm path scales across worker threads. Plus a bench_warm_path harness (step 0) capturing before/after deltas for the three measured spots, and a TASK-058 validation iter1 pass folding in security-reviewer-iter1 #1/#2/#3 and performance-reviewer-iter1 #1.
2 parents 60eba6f + 2328eeb commit 36416af

17 files changed

Lines changed: 1353 additions & 59 deletions

specs/tasks/v2-deferred-backlog-plan.md

Lines changed: 34 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -336,6 +336,7 @@ aggregation pipeline.
336336
**Milestone:** post-v2.0 (polish; not a release blocker)
337337
**Component:** `webserver_dispatch` / `webserver_request`
338338
**Estimate:** L (broken into ~4 sub-items)
339+
**Status:** Done
339340

340341
**Goal:**
341342
Three perf items deferred during the manual-validation sweep all share
@@ -344,21 +345,50 @@ the same shape — a `std::string` is rebuilt on every request when a
344345
release-blocking but together they're worth 5–15% on the warm path.
345346

346347
**Action Items:**
347-
- [ ] **canonicalize_lookup_path → string_view return.** Refactor
348+
- [x] **canonicalize_lookup_path → string_view return.** Refactor
348349
`src/detail/webserver_dispatch.cpp::canonicalize_lookup_path` to
349350
return a `string_view` into a caller-owned scratch buffer (or the
350351
request arena) instead of allocating a `std::string` on every call.
351-
- [ ] **normalize_path at registration time.** Move the
352+
Done as step 1 (commit `e2f730d`): canonical-input fast path returns
353+
the input view unchanged (zero allocation); non-canonical inputs
354+
write into a caller-owned scratch `std::string`. Direct unit pin in
355+
`test/unit/route_lookup_canonicalize_test.cpp`.
356+
- [x] **normalize_path at registration time.** Move the
352357
`normalize_path` call from `webserver_request.cpp::should_skip_auth`
353358
(where it runs per request) to the registration site, storing the
354359
normalized form in `route_entry`. Update the lookup to use the
355360
pre-normalized form.
356-
- [ ] **serialize_allow_methods caching.** `webserver_dispatch.cpp:363`
361+
Done as step 2 (commit `51b5a37`). Interpretive deviation noted in
362+
the commit message: the brief's `route_entry` framing doesn't map
363+
onto `should_skip_auth`, which compares the request path against a
364+
webserver-level `auth_skip_paths` list (not a per-route attribute);
365+
the optimisation moves to the data's actual home —
366+
`auth_skip_paths_normalized_` populated once at webserver
367+
construction — plus an empty-list early-out for the production-
368+
typical case. The empty-list early-out drops the per-request cost
369+
from ~620 ns to ~3 ns. Bug-fix bonus: non-canonical skip-list
370+
entries (`"/public/"`, `"/a/../b"`) now match canonical request paths.
371+
Pinned in `test/unit/auth_skip_normalize_test.cpp`.
372+
- [x] **serialize_allow_methods caching.** `webserver_dispatch.cpp`
357373
rebuilds the Allow header value on every 405 response. Cache the
358374
serialized string on `route_entry` and invalidate on
359375
add/remove-method.
360-
- [ ] Add `test/bench_warm_path.cpp` measuring before/after per-request
376+
Done as step 3 (commit `b22b0dd`). Interpretive deviation noted in
377+
the commit message: the brief's `route_entry` framing would go stale
378+
on `set_allowing` because `route_entry::methods` is the *registered*
379+
mask, not the resource's runtime mask; the cache attaches to
380+
`http_resource` instead (the same object that owns the mask) and
381+
invalidates implicitly by comparing the live mask against a snapshot
382+
at cache-fill time. Per-resource `std::mutex`. Pinned in
383+
`test/unit/http_resource_allow_cache_test.cpp` (correctness +
384+
identity / cache-hit-returns-same-buffer pin).
385+
- [x] Add `test/bench_warm_path.cpp` measuring before/after per-request
361386
cost so the next reviewer can verify the gain.
387+
Done as step 0 (commit `18693a5`). Wired into `bench_targets` in
388+
`test/Makefile.am` (not part of `make check`). Four measurements:
389+
`canonicalize`, `should_skip_auth` (non-empty + empty), and
390+
`serialize_allow_405`. Baseline numbers captured in the step-0
391+
commit body; step-3 commit body holds the after-each-step deltas.
362392

363393
**Dependencies:**
364394
- Blocked by: TASK-053 (must be the only caller of `canonicalize_lookup_path`)

src/detail/webserver_dispatch.cpp

Lines changed: 54 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -123,25 +123,57 @@ void webserver_impl::invalidate_route_cache() {
123123
// would never share an entry. Matches the v1 dispatch path, which
124124
// constructs a non-registration http_endpoint at lookup time and so
125125
// gets the same normalization for free.
126-
static std::string canonicalize_lookup_path(const std::string& path) {
126+
//
127+
// TASK-058 step 1: return a string_view so the happy path (input
128+
// already canonical) allocates zero heap memory. On the rewrite
129+
// path the canonicalised form is written into the caller-owned
130+
// @p scratch buffer and a view into that buffer is returned.
131+
//
132+
// LIFETIME: the returned view is valid for the duration of the
133+
// call chain only; it points at either the immutable "/" literal,
134+
// the caller's @p path argument, or the caller's @p scratch buffer.
135+
// Any caller storing the view must copy it into an owning string
136+
// first. The cache layer (cache_key constructed below) already
137+
// copies via std::string, so the contract is naturally respected.
138+
static std::string_view canonicalize_lookup_path(
139+
std::string_view path, std::string& scratch) {
127140
if (path.empty()) {
128-
return "/";
141+
// Immutable canonical root -- no scratch usage.
142+
return std::string_view{"/", 1};
143+
}
144+
const bool has_leading_slash = (path.front() == '/');
145+
const bool has_trailing_slash = (path.size() > 1 && path.back() == '/');
146+
if (has_leading_slash && !has_trailing_slash) {
147+
// Already canonical: return the caller's view directly.
148+
return path;
129149
}
130-
std::string out = path;
131-
if (out.front() != '/') {
132-
out.insert(out.begin(), '/');
150+
// Rewrite path: write the canonicalised form into the caller's
151+
// scratch buffer and return a view into it.
152+
scratch.clear();
153+
scratch.reserve(path.size() + (has_leading_slash ? 0 : 1));
154+
if (!has_leading_slash) {
155+
scratch.push_back('/');
133156
}
134-
if (out.size() > 1 && out.back() == '/') {
135-
out.pop_back();
157+
if (has_trailing_slash) {
158+
scratch.append(path.data(), path.size() - 1);
159+
} else {
160+
scratch.append(path.data(), path.size());
136161
}
137-
return out;
162+
return std::string_view{scratch};
138163
}
139164

140165
webserver_impl::lookup_result
141166
webserver_impl::lookup_v2(http_method method, const std::string& path) {
142167
lookup_result result;
143168

144-
std::string lookup_path = canonicalize_lookup_path(path);
169+
// TASK-058 step 1: canonicalize_lookup_path returns a string_view
170+
// into either @p path (already-canonical happy path -- no
171+
// allocation), the immutable "/" literal (empty-input case), or
172+
// @p canonicalize_scratch (rewrite case -- single allocation,
173+
// bounded by the input size).
174+
std::string canonicalize_scratch;
175+
std::string_view lookup_path =
176+
canonicalize_lookup_path(path, canonicalize_scratch);
145177

146178
// Step 1: cache. Cache under the canonical key so /foo and /foo/
147179
// share an entry. Use find_by_view to avoid copying lookup_path
@@ -155,10 +187,11 @@ webserver_impl::lookup_v2(http_method method, const std::string& path) {
155187
result.captured_params = std::move(cached.captured_params);
156188
return result;
157189
}
158-
// Construct key by moving lookup_path into the cache_key, avoiding a
159-
// second heap allocation. All subsequent tier lookups use key.path,
160-
// which is the same std::string now owned by the key.
161-
cache_key key{method, std::move(lookup_path)};
190+
// Construct key by copying lookup_path into the cache_key. On the
191+
// miss path we pay one std::string construction; the warm-cache
192+
// path never reaches this line. All subsequent tier lookups use
193+
// key.path, which is the std::string now owned by the key.
194+
cache_key key{method, std::string(lookup_path)};
162195

163196
// Step 2: walk the tiers under a shared lock.
164197
{
@@ -420,9 +453,15 @@ void webserver_impl::dispatch_resource_handler(detail::modded_request* mr,
420453
}
421454
return;
422455
}
423-
// Method not allowed: serialize the allow-mask into a header.
456+
// Method not allowed: emit the Allow header. TASK-058 step 3:
457+
// read the resource's lazily-cached Allow header value instead
458+
// of rebuilding the string from the mask on every 405. The
459+
// cache lives on http_resource (the same object that owns the
460+
// mask), regenerates implicitly when the mask differs from the
461+
// last-cached snapshot, and is thread-safe under a per-resource
462+
// mutex held only across the cache-fill path.
424463
mr->response.emplace(method_not_allowed_page(mr));
425-
std::string header_value = serialize_allow_methods(hrm->get_allowed_methods());
464+
const std::string& header_value = hrm->get_allow_header();
426465
if (!header_value.empty()) {
427466
mr->response->with_header(http_utils::http_header_allow, header_value);
428467
}

src/detail/webserver_request.cpp

Lines changed: 85 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -67,6 +67,7 @@
6767
#include "httpserver/detail/http_endpoint.hpp"
6868
#include "httpserver/detail/lambda_resource.hpp"
6969
#include "httpserver/detail/modded_request.hpp"
70+
#include "httpserver/detail/path_normalize.hpp"
7071
#include "httpserver/detail/resource_hook_table.hpp"
7172
#include "httpserver/http_request.hpp"
7273
#include "httpserver/http_resource.hpp"
@@ -136,16 +137,95 @@ std::string normalize_path(const std::string& path) {
136137

137138
} // namespace
138139

140+
// TASK-058 step 2: pre-normalize each auth_skip_paths entry once at
141+
// webserver construction time. Entries ending in "/*" keep their
142+
// wildcard suffix; the prefix before the wildcard is normalized.
143+
// Callers (webserver::webserver) pass the raw config-bag list and
144+
// store the result on the webserver instance as a sibling to the
145+
// original `auth_skip_paths` list. Pre-TASK-058 the skip list was
146+
// matched verbatim against a normalized request path, so non-
147+
// canonical inputs (e.g. "/public/", "/a/../b") silently never
148+
// matched -- the on-the-wire bug this normalization closes.
149+
//
150+
// security-reviewer-iter1-3: entries containing '%' are rejected with
151+
// std::invalid_argument. Skip-path entries must be provided in
152+
// decoded form (the same form as the request path after MHD's
153+
// unescaper runs). A '%'-encoded entry would never match a decoded
154+
// request path and would silently bypass auth for no route -- a
155+
// misconfiguration hazard caught early here.
156+
std::vector<std::string> normalize_auth_skip_paths(
157+
const std::vector<std::string>& raw) {
158+
std::vector<std::string> out;
159+
out.reserve(raw.size());
160+
for (const auto& entry : raw) {
161+
// Reject percent-encoded entries: skip-path entries must be
162+
// provided in decoded form. A '%' in the entry indicates a
163+
// URL-encoded sequence that would never match the decoded
164+
// request path produced by MHD's unescaper.
165+
if (entry.find('%') != std::string::npos) {
166+
throw std::invalid_argument(
167+
"auth_skip_paths entry contains a percent-encoded "
168+
"sequence ('" + entry + "'). "
169+
"Skip-path entries must be provided in decoded form "
170+
"(e.g. '/public/test', not '/public%2Ftest').");
171+
}
172+
// Wildcard suffix: strip the trailing "/*", normalize the
173+
// prefix, then re-append "/*". The special case "/*" (size
174+
// == 2) means "match every path" and is stored as-is so
175+
// should_skip_auth can recognise it with the >= 2 guard.
176+
if (entry.size() >= 2 && entry.back() == '*' &&
177+
entry[entry.size() - 2] == '/') {
178+
if (entry.size() == 2) {
179+
// "/*" -- global wildcard: matches every path.
180+
out.push_back("/*");
181+
} else {
182+
std::string prefix = entry.substr(0, entry.size() - 2);
183+
std::string normalized_prefix = normalize_path(prefix);
184+
if (normalized_prefix == "/") {
185+
// Prefix collapsed to root -- treat as "/*".
186+
out.push_back("/*");
187+
} else {
188+
out.push_back(normalized_prefix + "/*");
189+
}
190+
}
191+
continue;
192+
}
193+
out.push_back(normalize_path(entry));
194+
}
195+
return out;
196+
}
197+
139198
bool webserver_impl::should_skip_auth(const std::string& path) const {
199+
// TASK-058 step 2: empty-list early-out. Servers with no
200+
// auth_skip_paths configured pay zero normalization cost. This
201+
// is the production-typical case for any server whose auth
202+
// surface either covers every route or has no auth_handler at
203+
// all.
204+
if (parent->auth_skip_paths_normalized.empty()) {
205+
return false;
206+
}
207+
208+
// TASK-058 step 2: compare against the pre-normalized list (built
209+
// once at construction time) instead of re-normalizing skip-list
210+
// entries on every request. The per-request normalize_path call
211+
// on @p path remains -- the inbound URL is per-request data and
212+
// cannot be pre-normalized.
140213
std::string normalized = normalize_path(path);
141214

142-
for (const auto& skip_path : parent->auth_skip_paths) {
215+
for (const auto& skip_path : parent->auth_skip_paths_normalized) {
143216
if (skip_path == normalized) return true;
144-
// Support wildcard suffix (e.g., "/public/*")
145-
if (skip_path.size() > 2 && skip_path.back() == '*' &&
217+
// Support wildcard suffix (e.g., "/public/*").
218+
// security-reviewer-iter1-1: use >= 2 (not > 2) so the global
219+
// wildcard "/*" (size == 2) is handled. When skip_path is "/*"
220+
// the prefix is "/" and every normalized path starts with "/",
221+
// so we return true immediately for any request.
222+
if (skip_path.size() >= 2 && skip_path.back() == '*' &&
146223
skip_path[skip_path.size() - 2] == '/') {
147-
std::string prefix = skip_path.substr(0, skip_path.size() - 1);
148-
if (normalized.compare(0, prefix.size(), prefix) == 0) return true;
224+
std::string_view prefix(skip_path.data(), skip_path.size() - 1);
225+
if (normalized.compare(0, prefix.size(), prefix.data(),
226+
prefix.size()) == 0) {
227+
return true;
228+
}
149229
}
150230
}
151231
return false;

src/http_resource.cpp

Lines changed: 100 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -28,10 +28,12 @@
2828
#include <atomic>
2929
#include <functional>
3030
#include <memory>
31+
#include <shared_mutex>
3132
#include <stdexcept>
3233
#include <string>
3334
#include <utility>
3435

36+
#include "httpserver/detail/method_utils.hpp"
3537
#include "httpserver/detail/resource_hook_table.hpp"
3638
#include "httpserver/hook_action.hpp"
3739
#include "httpserver/hook_context.hpp"
@@ -97,6 +99,55 @@ ensure_table(std::shared_ptr<detail::resource_hook_table>& slot) {
9799

98100
} // namespace
99101

102+
// ---- copy / move special members ----------------------------------------
103+
//
104+
// TASK-058 step 3 added std::shared_mutex cached_allow_mutex_, which
105+
// has no copy or move. The defaulted special members on the public
106+
// class declaration would therefore be implicitly deleted. Implement
107+
// them here by-hand, skipping the mutex. The copy / move target starts
108+
// with a fresh, default-constructed mutex and an invalidated Allow-
109+
// header cache (cached_allow_valid_ = false), which the next call to
110+
// get_allow_header() repopulates lazily.
111+
112+
http_resource::http_resource(const http_resource& b) noexcept
113+
: methods_allowed_(b.methods_allowed_),
114+
hook_table_(b.hook_table_) {
115+
// cached_allow_mutex_ default-constructs.
116+
// cached_allow_header_ / cached_allow_mask_ default-construct.
117+
// cached_allow_valid_ stays false (member default).
118+
}
119+
120+
http_resource::http_resource(http_resource&& b) noexcept
121+
: methods_allowed_(b.methods_allowed_),
122+
hook_table_(std::move(b.hook_table_)) {
123+
// Same rationale as the copy constructor: cache state is per-
124+
// instance and is not transferred. std::shared_mutex has no move.
125+
}
126+
127+
http_resource& http_resource::operator=(const http_resource& b) noexcept {
128+
if (this != &b) {
129+
hook_table_ = b.hook_table_;
130+
methods_allowed_ = b.methods_allowed_;
131+
// Invalidate the local cache; do NOT touch the mutex (still
132+
// owned by *this).
133+
std::unique_lock<std::shared_mutex> lock(cached_allow_mutex_);
134+
cached_allow_valid_ = false;
135+
cached_allow_header_.clear();
136+
}
137+
return *this;
138+
}
139+
140+
http_resource& http_resource::operator=(http_resource&& b) noexcept {
141+
if (this != &b) {
142+
hook_table_ = std::move(b.hook_table_);
143+
methods_allowed_ = b.methods_allowed_;
144+
std::unique_lock<std::shared_mutex> lock(cached_allow_mutex_);
145+
cached_allow_valid_ = false;
146+
cached_allow_header_.clear();
147+
}
148+
return *this;
149+
}
150+
100151
// ---- add_hook overloads -------------------------------------------------
101152

102153
hook_handle http_resource::add_hook(hook_phase phase,
@@ -149,4 +200,53 @@ hook_handle http_resource::add_hook(hook_phase phase,
149200
hook_phase::request_completed, id};
150201
}
151202

203+
// ---- get_allow_header ---------------------------------------------------
204+
//
205+
// TASK-058 step 3: lazy Allow-header cache. See the header-side
206+
// declaration for the contract. Implementation:
207+
//
208+
// Warm path (cache hit, the common case after the first 405):
209+
// 1. Take a shared lock.
210+
// 2. Read methods_allowed_ INSIDE the lock (eliminates the TOCTOU
211+
// data race flagged in security-reviewer-iter1-2).
212+
// 3. Compare with the cached snapshot; if they match, return the
213+
// cached string by reference while still holding the shared lock.
214+
//
215+
// Cold / stale path (cache miss or mask changed):
216+
// 1. Release the shared lock.
217+
// 2. Take an exclusive (unique) lock.
218+
// 3. Re-check (double-checked locking) because another thread may
219+
// have filled the cache between the two lock acquisitions.
220+
// 4. Rebuild via detail::format_allow_header and update the snapshot.
221+
// 5. Return the cached string by reference.
222+
//
223+
// Using std::shared_mutex allows N concurrent warm-path readers without
224+
// serialisation; only the cold fill path serialises
225+
// (performance-reviewer-iter1-1).
226+
//
227+
// The returned reference is stable until the next mask mutation; the
228+
// caller (dispatch_resource_handler) consumes it synchronously within
229+
// the current dispatch, so the reference outlives use.
230+
const std::string& http_resource::get_allow_header() const {
231+
// Warm path: shared (read) lock.
232+
{
233+
std::shared_lock<std::shared_mutex> rlock(cached_allow_mutex_);
234+
const method_set live = methods_allowed_; // read inside lock
235+
if (cached_allow_valid_ && cached_allow_mask_ == live) {
236+
return cached_allow_header_;
237+
}
238+
}
239+
// Cold / stale path: exclusive (write) lock.
240+
std::unique_lock<std::shared_mutex> wlock(cached_allow_mutex_);
241+
// Double-checked: another thread may have filled the cache while we
242+
// waited for the exclusive lock.
243+
const method_set live = methods_allowed_; // re-read inside write lock
244+
if (!cached_allow_valid_ || cached_allow_mask_ != live) {
245+
cached_allow_header_ = detail::format_allow_header(live);
246+
cached_allow_mask_ = live;
247+
cached_allow_valid_ = true;
248+
}
249+
return cached_allow_header_;
250+
}
251+
152252
} // namespace httpserver

0 commit comments

Comments
 (0)