Skip to content

Commit ca11b1a

Browse files
etrclaude
andcommitted
TASK-032 review cleanup: address 5 majors + 18 minors from unworked review
Major fixes (all 5 addressed): - Replaced bare catch(...) on unregister_path with catch(invalid_argument) + catch(exception) with stderr logging — surface unexpected exceptions. - Added 5 ms inter-request sleep to curl client loop — rate-limits to ~200 req/s under TSan, keeping wall-clock within CI budget. - Added null check after curl_easy_init() — skip thread gracefully on resource exhaustion rather than crash. - Added early return after fork() failure (LT_CHECK(child >= 0)) — prevents waitpid(-1,...) from reaping unrelated processes. - TASK-032.md status already Done; assertions already individual (TASK-052). Minor fixes (cosmetic + docs): - Added kDynSlots, kIpRange, kExitCodeStopReturned, kExitCodeCurlCompleted named constants; replaced all hex magic literals (0x7, 0xff, 42, 43). - Reduced CURLOPT_TIMEOUT_MS in stop-from-handler child to 3000L (< parent's 5 s window) so timeout ordering is consistent. - Renamed deadline → child_deadline in stop_from_handler test. - Replaced std::string(run) with std::string_view(run) for env-var check. - Added upper bound (v <= 3600) to stress_seconds() (CWE-1284). - Added comment documenting why register_prefix is intentionally excluded. - Added @see stop() cross-reference to ~webserver() Doxygen. - Updated specs/architecture/09-testing.md §9 item 6 to use v2 API names. - Corrected DR-008.md Verification: exit codes are regression sentinels, not positive observations; WIFSIGNALED paths are the positive observations. - Added ~webserver() note to DR-008.md Verification section. - Added PRD-CON-REQ-001/002 EARS concurrency requirements to product_specs.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 17bfe16 commit ca11b1a

5 files changed

Lines changed: 60 additions & 22 deletions

File tree

specs/architecture/09-testing.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,6 @@ The architecture itself does not prescribe test frameworks (out of architecture
77
3. **Move semantics on `http_response`** (DR-5): sanitizer-clean tests for inline↔inline, inline↔heap, heap↔inline, heap↔heap on both move-construct and move-assign.
88
4. **SBO size invariant** (DR-5): `static_assert(sizeof(detail::deferred_body) <= http_response::body_buf_size, ...)` at the end of `detail/body.hpp`. Compile-time guarantee.
99
5. **Routing semantics preservation** (DR-7): the v1 routing-test corpus runs against v2.0 unchanged. Any regression is treated as a release-blocker.
10-
6. **Thread-safety contract** (DR-8): a stress test exercises `register_resource` / `block_ip` from within handlers, verifies no deadlock except for the documented `stop()` case.
10+
6. **Thread-safety contract** (DR-8): a stress test exercises `register_path` / `unregister_path` / `block_ip` / `unblock_ip` from within handlers, verifies no deadlock except for the documented `stop()` case. An opt-in negative test (`stop_from_handler_deadlocks_as_documented`, enabled via `HTTPSERVER_RUN_STOP_FROM_HANDLER=1`) confirms the deadlock contract by forking a child that calls `stop()` from inside a handler.
1111

1212
---

specs/architecture/11-decisions/DR-008.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,6 @@
2222
**Verification (TASK-032):**
2323
- `test/integ/threadsafety_stress.cpp` — stress test binary `threadsafety_stress` runs 16 concurrent curl clients for 60 seconds (override via `HTTPSERVER_STRESS_SECONDS=N`), each request randomly invoking `register_path`, `unregister_path`, `block_ip`, and `unblock_ip` against the running `webserver` from inside a handler thread. A duplicate-registration from a handler throws `std::invalid_argument` rather than causing a data race.
2424
- CI coverage: the `build-type: tsan` matrix entry in `.github/workflows/verify-build.yml` compiles with `-fsanitize=thread` and runs `make check`, which automatically picks up `threadsafety_stress` as a registered `check_PROGRAMS` entry — no separate workflow wiring is needed.
25-
- Opt-in negative test `stop_from_handler_deadlocks_as_documented` (enabled via `HTTPSERVER_RUN_STOP_FROM_HANDLER=1`) forks a child process that calls `stop()` from inside a handler. A non-zero child exit (libmicrohttpd self-join abort, exit code 42) or a 5-second parent timeout (exit code 43) both count as positive observation of the documented deadlock contract. These exit codes serve as regression sentinels.
25+
- Opt-in negative test `stop_from_handler_deadlocks_as_documented` (enabled via `HTTPSERVER_RUN_STOP_FROM_HANDLER=1`) forks a child process that calls `stop()` from inside a handler. Positive observations of the contract are signal-terminated (libmicrohttpd self-join abort, `WIFSIGNALED`) or SIGKILL-from-parent after a 5-second timeout (silent deadlock, also `WIFSIGNALED`). Two sentinel exit codes mark regressions: the unreachable-after-`stop()` line inside the child (`stop()` returned from a handler — regression) and the post-curl line (`stop()` did not block at all — regression). A zero child exit means `stop()` returned cleanly, which is also a regression against DR-008. `~webserver()` carries the same threading constraint as `stop()` but is not separately exercised because destructor invocation from a running handler is inherently process-unsafe; the `stop()` test is considered representative.
2626

2727
---

specs/product_specs.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,9 @@
3636
- **Hot-path performance:** Per-request getters shall not allocate or copy containers; they return `const&` or `string_view`.
3737
- **Naming:** All public method names shall be snake_case; one canonical verb per concept.
3838
- **Documentation:** v2.0 ships with a rewritten `README` and an updated examples set. A short `RELEASE_NOTES.md` summarizes the API changes for users porting from v1; it is informational, not a compatibility commitment.
39+
- **Concurrency (DR-008):**
40+
- `PRD-CON-REQ-001` When a caller invokes any `webserver` public method from a handler thread (except `stop()` or `~webserver()`), the system shall complete the call without deadlock or data race.
41+
- `PRD-CON-REQ-002` When a caller invokes `stop()` or `~webserver()` from inside a handler thread, the system shall either deadlock (join waits indefinitely) or abort (libmicrohttpd self-join detection); it shall not return successfully.
3942

4043
---
4144

src/httpserver/webserver.hpp

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -162,6 +162,8 @@ class webserver {
162162
* Destroy the webserver from the thread that constructed it.
163163
*
164164
* See specs/architecture/11-decisions/DR-008.md and §5.1.
165+
*
166+
* @see stop() for the threading constraints that apply equally here.
165167
**/
166168
~webserver();
167169
// PIMPL-owned: copy/move would slice the backing impl object.

test/integ/threadsafety_stress.cpp

Lines changed: 53 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -81,6 +81,16 @@ namespace ht = httpserver;
8181

8282
namespace {
8383

84+
// Named constants — replaces hex magic literals throughout.
85+
// kDynSlots: number of competing dynamic path slots (slot = i & (kDynSlots-1)).
86+
// kIpRange: number of distinct test IPs (ip suffix = i & (kIpRange-1)).
87+
// kExitCodeStopReturned: child exit when stop() returns from a handler (regression).
88+
// kExitCodeCurlCompleted: child exit when curl completes without abort (regression).
89+
constexpr int kDynSlots = 8;
90+
constexpr int kIpRange = 256;
91+
constexpr int kExitCodeStopReturned = 42; // stop() returned — should not happen
92+
constexpr int kExitCodeCurlCompleted = 43; // curl completed — stop() did not block
93+
8494
// Discard libcurl response bodies — we only care about completing the
8595
// round-trip, not the body content.
8696
size_t discard_write(char* /*ptr*/, size_t size, size_t nmemb,
@@ -206,15 +216,19 @@ ht::http_response driver_body(const ht::http_request& req,
206216
// exercise.
207217
}
208218

209-
const int slot = i & 0x7;
219+
const int slot = i & (kDynSlots - 1);
210220
const std::string dyn_path =
211221
"/dyn/" + std::to_string(slot);
212222
const std::string ip =
213-
"198.51.100." + std::to_string(i & 0xff);
223+
"198.51.100." + std::to_string(i & (kIpRange - 1));
214224

215225
// Six ops total: 0..3 are the TASK-032 route-table / ban-list
216226
// mutators; 4..5 are TASK-052's hook bus churn. `op % 6` keeps
217227
// the distribution roughly uniform.
228+
// NOTE: register_prefix / unregister_prefix are intentionally not
229+
// exercised here because they share the same lock path as
230+
// register_path / unregister_path (register_impl_ with family=true
231+
// vs false); the existing cases already cover the mutex contention.
218232
switch (op % 6) {
219233
case 0:
220234
try {
@@ -231,7 +245,13 @@ ht::http_response driver_body(const ht::http_request& req,
231245
ws->unregister_path(dyn_path);
232246
counters->unregister_ok.fetch_add(
233247
1, std::memory_order_relaxed);
234-
} catch (...) {
248+
} catch (const std::invalid_argument&) {
249+
// Path not registered yet — expected race, not a bug.
250+
} catch (const std::exception& e) {
251+
// Surface unexpected exceptions so they are visible in
252+
// test logs (e.g. bad_alloc, logic_error).
253+
std::cerr << "[stress] unexpected unregister_path exception: "
254+
<< e.what() << '\n';
235255
}
236256
break;
237257
case 2:
@@ -290,11 +310,12 @@ ht::http_response driver_body(const ht::http_request& req,
290310

291311
// Stress duration: default 60 s (acceptance criterion), overridable
292312
// via HTTPSERVER_STRESS_SECONDS for fast local iteration.
313+
// Capped at 3600 s to prevent runaway in CI (CWE-1284).
293314
int stress_seconds() {
294315
if (const char* s = std::getenv("HTTPSERVER_STRESS_SECONDS")) {
295316
try {
296317
int v = std::stoi(s);
297-
if (v > 0) return v;
318+
if (v > 0 && v <= 3600) return v;
298319
} catch (...) {
299320
}
300321
}
@@ -349,6 +370,7 @@ LT_BEGIN_AUTO_TEST(threadsafety_stress_suite,
349370
// Per-thread curl handle: curl_easy_* is per-handle
350371
// thread-safe; each thread owns its handle.
351372
CURL* curl = curl_easy_init();
373+
if (!curl) return; // resource exhaustion — skip this thread
352374
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, discard_write);
353375
// 198.51.100.0/24 is TEST-NET-2 (RFC 5737) — block/unblock
354376
// ops on these IPs cannot blacklist the loopback driver
@@ -361,12 +383,17 @@ LT_BEGIN_AUTO_TEST(threadsafety_stress_suite,
361383
while (!stop.load(std::memory_order_relaxed)) {
362384
// Six ops: 0..3 route-table/ban; 4..5 hook bus (TASK-052).
363385
const int op = static_cast<int>(rng() % 6u);
364-
const int i = static_cast<int>(rng() & 0xff);
386+
const int i = static_cast<int>(rng() & (kIpRange - 1));
365387
const std::string url =
366388
base + "?op=" + std::to_string(op) +
367389
"&i=" + std::to_string(i);
368390
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
369391
curl_easy_perform(curl);
392+
// 5 ms inter-request sleep to rate-limit each thread to
393+
// ~200 req/s. Under TSan (5–20× slower) this keeps total
394+
// lock pressure and shadow-memory churn within the CI
395+
// wall-clock budget without reducing lock-path coverage.
396+
std::this_thread::sleep_for(std::chrono::milliseconds(5));
370397
}
371398
curl_easy_cleanup(curl);
372399
});
@@ -415,7 +442,7 @@ LT_BEGIN_AUTO_TEST(threadsafety_stress_suite,
415442
// and is opt-in because reproducing the deadlock requires _Exit()
416443
// to escape the wedged process.
417444
const char* run = std::getenv("HTTPSERVER_RUN_STOP_FROM_HANDLER");
418-
if (run == nullptr || std::string(run) != "1") {
445+
if (run == nullptr || std::string_view(run) != "1") {
419446
std::cout << "[SKIP] stop_from_handler_deadlocks_as_documented"
420447
" — set HTTPSERVER_RUN_STOP_FROM_HANDLER=1 to run\n";
421448
return;
@@ -442,6 +469,7 @@ LT_BEGIN_AUTO_TEST(threadsafety_stress_suite,
442469
// a handler, contradicting the documented contract.
443470
const pid_t child = fork();
444471
LT_CHECK(child >= 0);
472+
if (child < 0) return; // fork failed; waitpid(-1,...) would reap unrelated processes
445473
if (child == 0) {
446474
// Child: trigger the contract violation. We do not care about
447475
// the child's stdout — silence it so the test log stays
@@ -459,9 +487,9 @@ LT_BEGIN_AUTO_TEST(threadsafety_stress_suite,
459487
// thread → DR-008 unsafe path.
460488
ws.stop();
461489
// Below is unreachable. If reached, the contract is
462-
// broken — exit with a distinctive code (42) so the
463-
// parent can flag the regression.
464-
std::_Exit(42);
490+
// broken — exit with a distinctive code so the parent
491+
// can flag the regression.
492+
std::_Exit(kExitCodeStopReturned);
465493
return ht::http_response::string("unreachable");
466494
});
467495

@@ -473,26 +501,28 @@ LT_BEGIN_AUTO_TEST(threadsafety_stress_suite,
473501
CURL* curl = curl_easy_init();
474502
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
475503
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, discard_write);
476-
curl_easy_setopt(curl, CURLOPT_TIMEOUT_MS, 10000L);
504+
// 3 s < parent's 5 s deadline: curl's window expires before the
505+
// parent SIGKILLs the child, keeping the two timeouts ordered.
506+
curl_easy_setopt(curl, CURLOPT_TIMEOUT_MS, 3000L);
477507
curl_easy_perform(curl);
478508
curl_easy_cleanup(curl);
479509

480510
// If we ever reach here, stop()-from-handler did NOT abort
481-
// or deadlock as documented → regression. Exit 43 so the
482-
// parent flags it distinctly from the unreachable-after-
483-
// stop() case above.
484-
std::_Exit(43);
511+
// or deadlock as documented → regression. Use a distinct
512+
// sentinel so the parent can flag it separately from the
513+
// unreachable-after-stop() case above.
514+
std::_Exit(kExitCodeCurlCompleted);
485515
}
486516

487517
// Parent: bounded wait on the child. SIGKILL it after 5 s if it
488518
// is still running (the silent-deadlock branch of DR-008). Any
489519
// outcome OTHER than a zero exit is a positive observation of
490-
// the contract; a zero exit (or codes 42/43) is a regression.
520+
// the contract; a zero exit (or sentinel codes) is a regression.
491521
int status = 0;
492-
auto deadline =
522+
auto child_deadline =
493523
std::chrono::steady_clock::now() + std::chrono::seconds(5);
494524
bool reaped = false;
495-
while (std::chrono::steady_clock::now() < deadline) {
525+
while (std::chrono::steady_clock::now() < child_deadline) {
496526
pid_t r = ::waitpid(child, &status, WNOHANG);
497527
if (r == child) {
498528
reaped = true;
@@ -513,9 +543,12 @@ LT_BEGIN_AUTO_TEST(threadsafety_stress_suite,
513543
<< " (MHD self-join detection)\n";
514544
} else if (WIFEXITED(status)) {
515545
const int code = WEXITSTATUS(status);
516-
// Codes 42/43 mark regressions seeded inside the child.
517-
// Zero exit means stop() returned cleanly from a handler.
518-
LT_CHECK(code != 0 && code != 42 && code != 43);
546+
// kExitCodeStopReturned / kExitCodeCurlCompleted mark
547+
// regressions seeded inside the child. Zero exit means
548+
// stop() returned cleanly from a handler.
549+
LT_CHECK(code != 0 &&
550+
code != kExitCodeStopReturned &&
551+
code != kExitCodeCurlCompleted);
519552
std::cout << "[OK] stop_from_handler — child exited with "
520553
"code " << code
521554
<< " (non-zero exit on contract violation)\n";

0 commit comments

Comments
 (0)