Skip to content

Commit 2328eeb

Browse files
etrclaude
andcommitted
TASK-058 validation iter1: address reviewer findings
- should_skip_auth (src/detail/webserver_request.cpp): switch the wildcard guard from size() > 2 to size() >= 2 so the global "/*" entry (size == 2) actually matches every path; also rewrite the prefix compare to use string_view to avoid a per-iteration substr allocation (security-reviewer-iter1-1). - normalize_auth_skip_paths: reject entries containing '%' with std::invalid_argument. Skip-path entries must be provided in decoded form -- a percent-encoded entry would never match MHD's decoded request path and silently bypass nothing, a quiet misconfiguration hazard (security-reviewer-iter1-3). The wildcard-prefix branch grows a special case for the literal "/*" so canonicalisation does not collapse it to "//*" or drop it. - get_allow_header (src/http_resource.cpp + .hpp): take the methods_allowed_ snapshot inside the mutex instead of before it to eliminate the TOCTOU data race (security-reviewer-iter1-2), and upgrade std::mutex to std::shared_mutex so concurrent warm- path readers no longer serialise -- only the cold fill path takes a unique_lock (performance-reviewer-iter1-1). Double- checked lock pattern: re-read methods_allowed_ under the unique lock so a fill that races a refill is idempotent. - bench_sizeof_http_resource: update the size-envelope algebra and comments for std::shared_mutex (~168 B on libc++, ~56 B on libstdc++). The static_assert ceiling is recomputed. - http_resource hpp / cpp comment blocks: refresh the cache contract description and the by-hand copy / move rationale now that the lock type is std::shared_mutex. - New unit pins: auth_skip_normalize_test: global_wildcard_skip_path_matches_any_path covers the size() >= 2 fix; percent_encoded_skip_path_throws_invalid_argument covers the misconfiguration-rejection contract. http_resource_allow_cache_test: concurrent_reads_all_return_correct_value exercises 8 threads x 1000 iterations to surface a TSAN race if methods_allowed_ ever drifts back outside the lock or the warm path is taken without a shared_lock. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 37ee732 commit 2328eeb

6 files changed

Lines changed: 211 additions & 74 deletions

File tree

src/detail/webserver_request.cpp

Lines changed: 43 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -146,26 +146,47 @@ std::string normalize_path(const std::string& path) {
146146
// matched verbatim against a normalized request path, so non-
147147
// canonical inputs (e.g. "/public/", "/a/../b") silently never
148148
// matched -- the on-the-wire bug this normalization closes.
149+
//
150+
// security-reviewer-iter1-3: entries containing '%' are rejected with
151+
// std::invalid_argument. Skip-path entries must be provided in
152+
// decoded form (the same form as the request path after MHD's
153+
// unescaper runs). A '%'-encoded entry would never match a decoded
154+
// request path and would silently bypass auth for no route -- a
155+
// misconfiguration hazard caught early here.
149156
std::vector<std::string> normalize_auth_skip_paths(
150157
const std::vector<std::string>& raw) {
151158
std::vector<std::string> out;
152159
out.reserve(raw.size());
153160
for (const auto& entry : raw) {
161+
// Reject percent-encoded entries: skip-path entries must be
162+
// provided in decoded form. A '%' in the entry indicates a
163+
// URL-encoded sequence that would never match the decoded
164+
// request path produced by MHD's unescaper.
165+
if (entry.find('%') != std::string::npos) {
166+
throw std::invalid_argument(
167+
"auth_skip_paths entry contains a percent-encoded "
168+
"sequence ('" + entry + "'). "
169+
"Skip-path entries must be provided in decoded form "
170+
"(e.g. '/public/test', not '/public%2Ftest').");
171+
}
154172
// Wildcard suffix: strip the trailing "/*", normalize the
155-
// prefix, then re-append "/*". Empty prefix (just "/*")
156-
// canonicalises to "/" + "/*" -> "//*", which would never
157-
// have matched anything sensible pre-fix either, so we keep
158-
// the literal "/*" unchanged (matches the v1 wildcard
159-
// behaviour at the dispatch site).
160-
if (entry.size() > 2 && entry.back() == '*' &&
173+
// prefix, then re-append "/*". The special case "/*" (size
174+
// == 2) means "match every path" and is stored as-is so
175+
// should_skip_auth can recognise it with the >= 2 guard.
176+
if (entry.size() >= 2 && entry.back() == '*' &&
161177
entry[entry.size() - 2] == '/') {
162-
std::string prefix = entry.substr(0, entry.size() - 2);
163-
std::string normalized_prefix = normalize_path(prefix);
164-
if (normalized_prefix == "/") {
165-
// "/" + "/*" -> "/*" (keep literal).
178+
if (entry.size() == 2) {
179+
// "/*" -- global wildcard: matches every path.
166180
out.push_back("/*");
167181
} else {
168-
out.push_back(normalized_prefix + "/*");
182+
std::string prefix = entry.substr(0, entry.size() - 2);
183+
std::string normalized_prefix = normalize_path(prefix);
184+
if (normalized_prefix == "/") {
185+
// Prefix collapsed to root -- treat as "/*".
186+
out.push_back("/*");
187+
} else {
188+
out.push_back(normalized_prefix + "/*");
189+
}
169190
}
170191
continue;
171192
}
@@ -193,11 +214,18 @@ bool webserver_impl::should_skip_auth(const std::string& path) const {
193214

194215
for (const auto& skip_path : parent->auth_skip_paths_normalized) {
195216
if (skip_path == normalized) return true;
196-
// Support wildcard suffix (e.g., "/public/*")
197-
if (skip_path.size() > 2 && skip_path.back() == '*' &&
217+
// Support wildcard suffix (e.g., "/public/*").
218+
// security-reviewer-iter1-1: use >= 2 (not > 2) so the global
219+
// wildcard "/*" (size == 2) is handled. When skip_path is "/*"
220+
// the prefix is "/" and every normalized path starts with "/",
221+
// so we return true immediately for any request.
222+
if (skip_path.size() >= 2 && skip_path.back() == '*' &&
198223
skip_path[skip_path.size() - 2] == '/') {
199-
std::string prefix = skip_path.substr(0, skip_path.size() - 1);
200-
if (normalized.compare(0, prefix.size(), prefix) == 0) return true;
224+
std::string_view prefix(skip_path.data(), skip_path.size() - 1);
225+
if (normalized.compare(0, prefix.size(), prefix.data(),
226+
prefix.size()) == 0) {
227+
return true;
228+
}
201229
}
202230
}
203231
return false;

src/http_resource.cpp

Lines changed: 39 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,7 @@
2828
#include <atomic>
2929
#include <functional>
3030
#include <memory>
31+
#include <shared_mutex>
3132
#include <stdexcept>
3233
#include <string>
3334
#include <utility>
@@ -100,10 +101,10 @@ ensure_table(std::shared_ptr<detail::resource_hook_table>& slot) {
100101

101102
// ---- copy / move special members ----------------------------------------
102103
//
103-
// TASK-058 step 3 added std::mutex cached_allow_mutex_, which has no
104-
// copy or move. The defaulted special members on the public class
105-
// declaration would therefore be implicitly deleted. Implement them
106-
// here by-hand, skipping the mutex. The copy / move target starts
104+
// TASK-058 step 3 added std::shared_mutex cached_allow_mutex_, which
105+
// has no copy or move. The defaulted special members on the public
106+
// class declaration would therefore be implicitly deleted. Implement
107+
// them here by-hand, skipping the mutex. The copy / move target starts
107108
// with a fresh, default-constructed mutex and an invalidated Allow-
108109
// header cache (cached_allow_valid_ = false), which the next call to
109110
// get_allow_header() repopulates lazily.
@@ -120,7 +121,7 @@ http_resource::http_resource(http_resource&& b) noexcept
120121
: methods_allowed_(b.methods_allowed_),
121122
hook_table_(std::move(b.hook_table_)) {
122123
// Same rationale as the copy constructor: cache state is per-
123-
// instance and is not transferred. std::mutex has no move.
124+
// instance and is not transferred. std::shared_mutex has no move.
124125
}
125126

126127
http_resource& http_resource::operator=(const http_resource& b) noexcept {
@@ -129,7 +130,7 @@ http_resource& http_resource::operator=(const http_resource& b) noexcept {
129130
methods_allowed_ = b.methods_allowed_;
130131
// Invalidate the local cache; do NOT touch the mutex (still
131132
// owned by *this).
132-
std::lock_guard<std::mutex> lock(cached_allow_mutex_);
133+
std::unique_lock<std::shared_mutex> lock(cached_allow_mutex_);
133134
cached_allow_valid_ = false;
134135
cached_allow_header_.clear();
135136
}
@@ -140,7 +141,7 @@ http_resource& http_resource::operator=(http_resource&& b) noexcept {
140141
if (this != &b) {
141142
hook_table_ = std::move(b.hook_table_);
142143
methods_allowed_ = b.methods_allowed_;
143-
std::lock_guard<std::mutex> lock(cached_allow_mutex_);
144+
std::unique_lock<std::shared_mutex> lock(cached_allow_mutex_);
144145
cached_allow_valid_ = false;
145146
cached_allow_header_.clear();
146147
}
@@ -204,22 +205,42 @@ hook_handle http_resource::add_hook(hook_phase phase,
204205
// TASK-058 step 3: lazy Allow-header cache. See the header-side
205206
// declaration for the contract. Implementation:
206207
//
207-
// 1. Snapshot the live mask once -- subsequent comparisons run against
208-
// this local copy (avoids re-loading methods_allowed_ inside the
209-
// critical section).
210-
// 2. Take the per-resource mutex. Under contention this serialises
211-
// cache-fill; under the warm-path (hit) case the lock is held for
212-
// a single integer compare + reference return.
213-
// 3. If the cached snapshot matches the live mask, return the cached
214-
// string by reference. Otherwise rebuild via detail::format_allow_header
215-
// (the same routine the pre-TASK-058 path used) and update the snapshot.
208+
// Warm path (cache hit, the common case after the first 405):
209+
// 1. Take a shared lock.
210+
// 2. Read methods_allowed_ INSIDE the lock (eliminates the TOCTOU
211+
// data race flagged in security-reviewer-iter1-2).
212+
// 3. Compare with the cached snapshot; if they match, return the
213+
// cached string by reference while still holding the shared lock.
214+
//
215+
// Cold / stale path (cache miss or mask changed):
216+
// 1. Release the shared lock.
217+
// 2. Take an exclusive (unique) lock.
218+
// 3. Re-check (double-checked locking) because another thread may
219+
// have filled the cache between the two lock acquisitions.
220+
// 4. Rebuild via detail::format_allow_header and update the snapshot.
221+
// 5. Return the cached string by reference.
222+
//
223+
// Using std::shared_mutex allows N concurrent warm-path readers without
224+
// serialisation; only the cold fill path serialises
225+
// (performance-reviewer-iter1-1).
216226
//
217227
// The returned reference is stable until the next mask mutation; the
218228
// caller (dispatch_resource_handler) consumes it synchronously within
219229
// the current dispatch, so the reference outlives use.
220230
const std::string& http_resource::get_allow_header() const {
221-
const method_set live = methods_allowed_;
222-
std::lock_guard<std::mutex> lock(cached_allow_mutex_);
231+
// Warm path: shared (read) lock.
232+
{
233+
std::shared_lock<std::shared_mutex> rlock(cached_allow_mutex_);
234+
const method_set live = methods_allowed_; // read inside lock
235+
if (cached_allow_valid_ && cached_allow_mask_ == live) {
236+
return cached_allow_header_;
237+
}
238+
}
239+
// Cold / stale path: exclusive (write) lock.
240+
std::unique_lock<std::shared_mutex> wlock(cached_allow_mutex_);
241+
// Double-checked: another thread may have filled the cache while we
242+
// waited for the exclusive lock.
243+
const method_set live = methods_allowed_; // re-read inside write lock
223244
if (!cached_allow_valid_ || cached_allow_mask_ != live) {
224245
cached_allow_header_ = detail::format_allow_header(live);
225246
cached_allow_mask_ = live;

src/httpserver/http_resource.hpp

Lines changed: 32 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,7 @@
2828
#include <functional>
2929
#include <memory>
3030
#include <mutex>
31+
#include <shared_mutex>
3132
#include <string>
3233

3334
// TASK-036: render_* virtuals now return http_response by value; the
@@ -234,9 +235,12 @@ class http_resource {
234235
* disallow_all, and allow_all all invalidate the cache implicitly,
235236
* without any explicit dependency on the mutation API.
236237
*
237-
* Thread-safe: a per-resource std::mutex serialises the cache
238-
* fill / read. Once the cache is warm, the lock is uncontended
239-
* on every subsequent call.
238+
* Thread-safe: a per-resource std::shared_mutex allows concurrent
239+
* warm-path reads (std::shared_lock) while the cache-fill path
240+
* takes an exclusive std::unique_lock. Under multi-threaded 405
241+
* floods the warm path is fully concurrent; only a stale-mask
242+
* fill serialises. The methods_allowed_ snapshot is taken inside
243+
* the lock to eliminate the TOCTOU data race.
240244
*
241245
* The returned reference is valid until the next mask mutation;
242246
* callers must not store it beyond a single dispatch invocation.
@@ -308,14 +312,15 @@ class http_resource {
308312
* are needed, register hooks on the copy separately after construction.
309313
*
310314
* TASK-058 step 3: cached_allow_mutex_ is non-copyable / non-movable
311-
* (std::mutex has no copy or move). The copy / move special members
312-
* are therefore written by hand and skip the mutex member entirely --
313-
* the copy / move target gets a freshly default-constructed mutex
314-
* and an invalidated cache (cached_allow_valid_ = false), so the
315-
* next get_allow_header() call on the target rebuilds the cache
316-
* under its own (fresh) lock. This is the correct semantic: copy /
317-
* move is a structural operation; cache state is local to each
318-
* instance and must not be shared by reference.
315+
* (std::shared_mutex has no copy or move). The copy / move special
316+
* members are therefore written by hand and skip the mutex member
317+
* entirely -- the copy / move target gets a freshly
318+
* default-constructed mutex and an invalidated cache
319+
* (cached_allow_valid_ = false), so the next get_allow_header()
320+
* call on the target rebuilds the cache under its own (fresh) lock.
321+
* This is the correct semantic: copy / move is a structural
322+
* operation; cache state is local to each instance and must not be
323+
* shared by reference.
319324
**/
320325
http_resource(const http_resource& b) noexcept;
321326
http_resource(http_resource&& b) noexcept;
@@ -361,14 +366,17 @@ class http_resource {
361366
//
362367
// Thread-safety: the dispatch path can call get_allow_header()
363368
// concurrently from multiple MHD worker threads against the same
364-
// resource. The mutex serialises the cache fill / read; once the
365-
// cache is warm, the lock is uncontended on every subsequent 405.
369+
// resource. std::shared_mutex allows concurrent warm-path reads
370+
// (std::shared_lock) while the cache-fill (miss/invalidate) path
371+
// takes an exclusive std::unique_lock. The methods_allowed_
372+
// snapshot is taken INSIDE the lock (not before it) to eliminate
373+
// the TOCTOU data race noted in security-reviewer-iter1-2.
366374
//
367375
// `mutable` for the same reason as hook_table_: the dispatch path
368376
// calls get_allow_header() on a `const http_resource&` (the cache
369377
// is logically const-observable -- the visible result is a stable
370378
// function of methods_allowed_).
371-
mutable std::mutex cached_allow_mutex_;
379+
mutable std::shared_mutex cached_allow_mutex_;
372380
mutable std::string cached_allow_header_;
373381
mutable method_set cached_allow_mask_{};
374382
mutable bool cached_allow_valid_ = false;
@@ -399,21 +407,21 @@ class http_resource {
399407
// table PIMPL (shared_ptr<resource_hook_table>) was added later, growing the
400408
// cap to vptr + shared_ptr + method_set + padding.
401409
//
402-
// TASK-058 step 3 added a lazy Allow-header cache: std::mutex +
403-
// std::string + method_set + bool. std::mutex is a sizeof~64 storage on
404-
// pthread platforms (libc++/macOS) and ~40 on libstdc++/Linux; std::string
405-
// SBO adds ~24-32; the method_set / bool fields slot into existing padding.
406-
// The new ceiling is the old (3*void* + 2*method_set) plus the new cache
407-
// payload, padded for alignment. See test/bench_sizeof_http_resource.cpp
408-
// for the v1-anchored algebra; the value here documents the
409-
// post-TASK-058 layout shape.
410+
// TASK-058 step 3 added a lazy Allow-header cache: std::shared_mutex +
411+
// std::string + method_set + bool. std::shared_mutex is a sizeof~56
412+
// storage on pthread platforms (libc++/macOS) and ~56 on libstdc++/Linux
413+
// (similar to std::mutex); std::string SBO adds ~24-32; the method_set /
414+
// bool fields slot into existing padding. The new ceiling is the old
415+
// (3*void* + 2*method_set) plus the new cache payload, padded for
416+
// alignment. See test/bench_sizeof_http_resource.cpp for the v1-anchored
417+
// algebra; the value here documents the post-TASK-058 layout shape.
410418
static_assert(sizeof(http_resource) <=
411419
sizeof(void*) * 3 + sizeof(method_set) * 2
412-
+ sizeof(std::mutex) + sizeof(std::string)
420+
+ sizeof(std::shared_mutex) + sizeof(std::string)
413421
+ sizeof(method_set) + sizeof(bool) * 8,
414422
"http_resource should be approximately vptr + shared_ptr + "
415423
"method_set (TASK-051) + Allow-header cache "
416-
"(mutex + string + method_set + bool) after TASK-058 step 3");
424+
"(shared_mutex + string + method_set + bool) after TASK-058 step 3");
417425

418426
} // namespace httpserver
419427
#endif // SRC_HTTPSERVER_HTTP_RESOURCE_HPP_

test/bench_sizeof_http_resource.cpp

Lines changed: 24 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@
3232

3333
#include <cstddef>
3434
#include <mutex>
35+
#include <shared_mutex>
3536
#include <string>
3637

3738
#include "httpserver/http_method.hpp"
@@ -53,24 +54,29 @@ using httpserver::v1_baseline::V1_STD_MAP_STRING_BOOL_SIZEOF;
5354
// reduction := V1_HTTP_RESOURCE_SIZEOF - sizeof(http_resource)
5455
// must satisfy
5556
// reduction + sizeof(method_set) * 2 + sizeof(void*) * 2
56-
// + sizeof(std::mutex) + sizeof(std::string)
57+
// + sizeof(std::shared_mutex) + sizeof(std::string)
5758
// + sizeof(method_set) + sizeof(bool) * 8
5859
// >= V1_STD_MAP_STRING_BOOL_SIZEOF
5960
//
6061
// Terms:
61-
// sizeof(method_set) * 2 -- own_size + worst-case alignment padding
62-
// for the methods_allowed_ member.
63-
// sizeof(void*) * 2 -- shared_ptr<resource_hook_table> hook_table_
64-
// member (TASK-051 hook PIMPL slot).
65-
// sizeof(std::mutex) -- TASK-058 step 3: per-resource Allow-header
66-
// cache lock. pthread_mutex_t on macOS
67-
// (libc++) is ~64 bytes; libstdc++ is ~40.
68-
// sizeof(std::string) -- TASK-058 step 3: cached Allow header
69-
// storage. SBO header is ~24-32 bytes.
70-
// sizeof(method_set) -- TASK-058 step 3: cached mask snapshot.
71-
// sizeof(bool) * 8 -- TASK-058 step 3: cached_allow_valid_ + worst-
72-
// case alignment padding (8 bytes; mutex
73-
// forces 8-byte alignment after a single bool).
62+
// sizeof(method_set) * 2 -- own_size + worst-case alignment padding
63+
// for the methods_allowed_ member.
64+
// sizeof(void*) * 2 -- shared_ptr<resource_hook_table> hook_table_
65+
// member (TASK-051 hook PIMPL slot).
66+
// sizeof(std::shared_mutex) -- TASK-058 step 3 / security-reviewer-iter1-2
67+
// / performance-reviewer-iter1-1: per-resource
68+
// Allow-header cache lock upgraded from
69+
// std::mutex to std::shared_mutex to allow
70+
// concurrent warm-path reads.
71+
// macOS (libc++): ~168 bytes.
72+
// Linux (libstdc++): ~56 bytes.
73+
// sizeof(std::string) -- TASK-058 step 3: cached Allow header
74+
// storage. SBO header is ~24-32 bytes.
75+
// sizeof(method_set) -- TASK-058 step 3: cached mask snapshot.
76+
// sizeof(bool) * 8 -- TASK-058 step 3: cached_allow_valid_ +
77+
// worst-case alignment padding (8 bytes;
78+
// mutex forces 8-byte alignment after a
79+
// single bool).
7480
//
7581
// The V1_* constants are selected at compile time by the detected
7682
// C++ standard library (libc++ vs. libstdc++) in v1_constants.hpp,
@@ -84,18 +90,19 @@ using httpserver::v1_baseline::V1_STD_MAP_STRING_BOOL_SIZEOF;
8490
static_assert(sizeof(http_resource) + V1_STD_MAP_STRING_BOOL_SIZEOF
8591
<= V1_HTTP_RESOURCE_SIZEOF + sizeof(method_set) * 2
8692
+ sizeof(void*) * 2
87-
+ sizeof(std::mutex) + sizeof(std::string)
93+
+ sizeof(std::shared_mutex) + sizeof(std::string)
8894
+ sizeof(method_set) + sizeof(bool) * 8,
8995
"http_resource grew beyond the documented PRD-REQ-REQ-003 / "
9096
"TASK-051 / TASK-058 step 3 envelope (method_set + per-route "
9197
"hook PIMPL slot + Allow header cache)");
9298

9399
// Belt-and-suspenders: regardless of v1 anchoring, v2.0 must be no
94100
// larger than v1 + the documented hook PIMPL slot + the TASK-058 step 3
95-
// Allow-header cache payload.
101+
// Allow-header cache payload (upgraded to std::shared_mutex for
102+
// security-reviewer-iter1-2 / performance-reviewer-iter1-1 fix).
96103
static_assert(sizeof(http_resource) <=
97104
V1_HTTP_RESOURCE_SIZEOF + sizeof(void*) * 2
98-
+ sizeof(std::mutex) + sizeof(std::string)
105+
+ sizeof(std::shared_mutex) + sizeof(std::string)
99106
+ sizeof(method_set) + sizeof(bool) * 8,
100107
"http_resource grew beyond v1 + TASK-051 hook PIMPL slot + "
101108
"TASK-058 step 3 Allow cache payload");

0 commit comments

Comments
 (0)