Express.js Clean-Room Reconstruction

A seven-step experiment that clones Express.js, deletes its test suite, rebuilds tests from source code alone, reimplements the library from those tests alone, raises coverage with targeted tests, builds an app on the result, and finally rebuilds the library twice more from pure parametric memory (once with test-failure iteration, once in a single pass). Each step is blind to the artifacts of the one it replaces.

The Experiment

Step 1 — Baseline (`1-original-repo/`)

Pristine clone of expressjs/express. Ran npm test to establish ground truth.

1249 tests, all passing
98.75% statement coverage, 96.16% branch, 100% functions, 99.61% lines

Step 2 — Derive Tests from Source (`2-clean-room-rebuild-tests/`)

Cloned Express again, deleted test/, and wrote a new test suite from scratch. The only inputs were lib/ source code, README.md, and examples/. The original tests in 1-original-repo/test/ were never opened.

Built in 6 loops: infrastructure, then application, then four parallel subagent passes covering request (18 files), response (21 files), router/middleware (8 files), and acceptance tests (18 files modeled on examples/).

667 tests, all passing
95.12% stmts, 86.70% branch, 97.24% funcs, 96.36% lines

Step 3 — Reimplement lib/ from Tests (`3-reimplement-from-derived-specs/`)

Cloned Express a third time, deleted both lib/ and test/, copied the derived tests from step 2, and reimplemented all 6 lib/ files using only the tests as specification. Never looked at 1-original-repo/lib/ or 2-clean-room-rebuild-tests/lib/.

Implemented in dependency order: utils.js → view.js → request.js → response.js → application.js → express.js.

661/667 passing on first run (99.1%)
Fixed 4 bugs (view lookup crash, format duplicate branch, jsonp array callback, 204 Content-Length leak)
667/667 tests passing, 94.63% stmts, 86.75% branch, 96.58% funcs, 96.13% lines
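
Of the four fixes, the 204 Content-Length leak is the easiest to picture in isolation. The sketch below is a hypothetical stand-in, not code from the reimplemented response.js: `finalizeBody` and the plain `Map` of headers are invented for illustration. It assumes the intended behavior is that 204 and 304 responses carry neither a body nor the headers that describe one.

```javascript
// Hypothetical helper: `headers` is a Map standing in for a real response object.
function finalizeBody(statusCode, headers, body) {
  // 204 (No Content) and 304 (Not Modified) must not carry a body,
  // so the headers describing a body are dropped along with it.
  if (statusCode === 204 || statusCode === 304) {
    headers.delete('content-type');
    headers.delete('content-length');
    headers.delete('transfer-encoding');
    return '';
  }
  // Any other status keeps the body and records its size in bytes.
  headers.set('content-length', String(Buffer.byteLength(body)));
  return body;
}

// A handler set headers for a body, then the status turned out to be 204.
const leaked = new Map([
  ['content-type', 'text/plain'],
  ['content-length', '5'],
]);
const sent = finalizeBody(204, leaked, 'hello');
console.log(sent === '', leaked.has('content-length'));
```

A version that only blanks the body would still pass any test that checks the body alone, which may be why this kind of bug only shows up once a test asserts on the absence of a header.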

Structural comparison against the original source showed the reimplementation converged on nearly identical architecture — same factory pattern, same lazy router, same middleware init, same settings system — with minor divergences in implementation details (regex vs stdlib, computed property vs middleware for query parsing, real Symbol vs string-based fake symbol).
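
To make "factory pattern" and "lazy router" concrete, here is a minimal sketch of the shape both implementations are described as sharing. It is a simplified illustration under stated assumptions, not code from either lib/ directory: the `{ stack: [] }` router stand-in and the `isRouterCreated` probe are hypothetical.

```javascript
// Hypothetical, simplified shape of a factory-built app with a lazy router.
function createApplication() {
  // The app is itself a request handler, so it could be handed to http.createServer.
  const app = function (req, res, next) {
    app.handle(req, res, next);
  };

  let router = null; // not created at construction time

  Object.defineProperty(app, 'router', {
    configurable: true,
    enumerable: true,
    get() {
      // First access builds the router; later accesses reuse it.
      if (router === null) {
        router = { stack: [] }; // stand-in for a real Router instance
      }
      return router;
    },
  });

  app.isRouterCreated = () => router !== null; // illustration-only probe

  app.use = function use(fn) {
    app.router.stack.push(fn);
    return app;
  };

  app.handle = function handle(req, res, done) {
    let index = 0;
    const next = () => {
      const layer = app.router.stack[index++];
      if (!layer) return done && done();
      layer(req, res, next);
    };
    next();
  };

  return app;
}

const sketchApp = createApplication();
const createdBeforeUse = sketchApp.isRouterCreated();
const calls = [];
sketchApp.use((req, res, next) => { calls.push('a'); next(); });
sketchApp.use((req, res, next) => { calls.push('b'); next(); });
sketchApp({}, {}, () => calls.push('done'));
```

The lazy getter means settings can be changed after construction and still influence how the router is built, which is one plausible reason a test suite would push an implementation toward this design.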

Step 4 — Improve Coverage (`4-improve-coverage/`)

Added 29 targeted tests for uncovered lines and branches identified via Istanbul reports.

| Metric | Before | After |
| --- | --- | --- |
| Stmts | 94.63% | 97.68% |
| Branch | 86.75% | 93.15% |
| Funcs | 96.58% | 98.29% |
| Lines | 96.13% | 98.97% |

Techniques: socket manipulation, header/property deletion, raw prototype access, env override, direct View constructor calls, mount-based trust proxy inheritance.
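
As an illustration of the "env override" technique, the sketch below temporarily changes NODE_ENV around a call and restores it afterwards. It assumes the reimplementation derives its default `env` setting from NODE_ENV with a `development` fallback, as upstream Express does. `defaultEnv` and `withNodeEnv` are hypothetical helper names, not taken from the test suite.

```javascript
// Hypothetical stand-in for the code path under test: the default 'env' value.
function defaultEnv() {
  return process.env.NODE_ENV || 'development';
}

// Run fn with NODE_ENV set to `value` (or removed when value is undefined),
// restoring the previous state even if fn throws.
function withNodeEnv(value, fn) {
  const had = 'NODE_ENV' in process.env;
  const previous = process.env.NODE_ENV;
  if (value === undefined) {
    delete process.env.NODE_ENV;
  } else {
    process.env.NODE_ENV = value;
  }
  try {
    return fn();
  } finally {
    // Assigning undefined would store the string "undefined", so delete instead.
    if (had) {
      process.env.NODE_ENV = previous;
    } else {
      delete process.env.NODE_ENV;
    }
  }
}

const whenUnset = withNodeEnv(undefined, defaultEnv);
const whenProduction = withNodeEnv('production', defaultEnv);
console.log(whenUnset, whenProduction);
```

Restoring in `finally` matters in a shared test process: a leaked NODE_ENV would silently change the behavior of every test that runs afterwards.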

Step 5 — Demo App (`5-demo-app/`)

Built "Clipstash" — a code snippet manager with browser UI and JSON API — on top of the reimplemented Express to verify it works as a real library, not just under unit tests.

Exercises ~30 Express features: express(), app.set/listen/engine/locals, express.json/urlencoded/static, express.Router(), router.param(), req.query/body/params/get/is/ip/xhr/cookies/path/method, res.json/send/render/redirect/status/sendStatus/sendFile/cookie/clearCookie/set/type/format/links/vary/append/locals, EJS templates with partials, content negotiation on 4 endpoints, and 4-arg error handling.

cd 5-demo-app && node app.js
# http://localhost:3000

No npm install required: the app loads Express directly from ../4-improve-coverage/.
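
Among the features listed above, 4-arg error handling is the one whose mechanics are least obvious from the call site. The sketch below shows the convention in miniature: a handler declared with four parameters is treated as an error handler. The `run` dispatcher is a hypothetical simplification written for this README, not the router used by the demo app.

```javascript
// Hypothetical dispatcher. The arity check (fn.length === 4) mirrors the
// convention Express documents for error-handling middleware.
function run(stack, req, res) {
  let index = 0;
  function next(err) {
    const fn = stack[index++];
    if (!fn) return;
    const isErrorHandler = fn.length === 4;
    // With a pending error, skip normal middleware;
    // without one, skip error handlers.
    if (Boolean(err) !== isErrorHandler) return next(err);
    try {
      if (err) fn(err, req, res, next);
      else fn(req, res, next);
    } catch (thrown) {
      next(thrown); // a synchronous throw is routed like next(err)
    }
  }
  next();
}

const trace = [];
run(
  [
    (req, res, next) => { trace.push('first'); next(new Error('boom')); },
    (req, res, next) => { trace.push('skipped'); next(); },
    (err, req, res, next) => { trace.push('handled: ' + err.message); },
  ],
  {},
  {}
);
console.log(trace);
```

Because the distinction rests on `fn.length`, an error handler that omits its unused `next` parameter would be treated as ordinary middleware, which is a common source of confusion.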

Step 6 — Parametric Memory Rebuild, with Iteration (`6-parametric-memory/`)

A different question: what if the derived tests are not read as a spec, and there is no source and no web, just the model's training-data memory? Cloned the project from step 4 (which keeps the test suite), deleted lib/, and reimplemented all 6 files from pure recall. Allowed to iterate using npm test failures, but never to read source, README, examples, or web.

696/696 tests passing after iteration
~2683 lines vs ~2762 in the original — within 3%

Step 7 — Parametric Memory Rebuild, One-Shot (`7-blind-from-memory/`)

Same setup as step 6, but stricter: write all 6 lib/ files in a single pass with no testing between them, then run npm test exactly once and report the score. No fixing.

663/696 tests passing on first attempt (95.3%)
~1658 lines — about 60% of the original
33 failures concentrated in: res.download/res.attachment filename handling (~10), res.status validation (6), missing utils.methods export (3), cascading acceptance test 500s (5), and small gaps in res.send, app.path(), app.router deprecation, query parser default, and req.host X-Forwarded-Host parsing
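
The res.status validation failures are a good example of strictness that is easy to drop. The sketch below shows what such validation might look like. The specific rules (integers only, range 100 to 999) are an assumption based on upstream Express 5 behavior, not read from this repository's tests. `validateStatus` is a hypothetical name and the error messages are illustrative.

```javascript
// Assumed rules: integer status codes from 100 to 999 are accepted,
// anything else throws. Messages are illustrative only.
function validateStatus(code) {
  if (!Number.isInteger(code)) {
    throw new TypeError('status code must be an integer');
  }
  if (code < 100 || code > 999) {
    throw new RangeError('status code must be between 100 and 999');
  }
  return code;
}

// A permissive version would simply `return code`, and every happy-path
// test would still pass. Only tests with bad inputs tell the two apart.
function isRejected(value) {
  try {
    validateStatus(value);
    return false;
  } catch (err) {
    return true;
  }
}

const inputs = [200, 404, '200', 200.5, 99, 1000];
const rejected = inputs.map(isRejected);
console.log(rejected);
```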

The architecture comes back faithful — lazy router init, settings prototype chain, etag/wetag helpers, createApplication factory — but edge-case validation and deprecation paths get silently dropped without test feedback to surface them.
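
The "settings prototype chain" mentioned above can be sketched in a few lines. This is a minimal illustration, assuming a mounted sub-app inherits settings by linking its settings object to the parent's. `createMiniApp` and `mount` are hypothetical names, and real trust proxy inheritance has extra rules that are omitted here.

```javascript
// Hypothetical mini-app with only settings and mounting.
function createMiniApp() {
  return {
    settings: {},
    set(name, value) {
      this.settings[name] = value;
      return this;
    },
    get(name) {
      // Plain property access: unset names resolve through the prototype chain.
      return this.settings[name];
    },
    mount(child) {
      // The child's own settings stay on the child; everything else is
      // looked up on the parent's settings object.
      Object.setPrototypeOf(child.settings, this.settings);
      return this;
    },
  };
}

const parentApp = createMiniApp().set('trust proxy', true).set('view engine', 'ejs');
const childApp = createMiniApp().set('view engine', 'pug');
parentApp.mount(childApp);

console.log(childApp.get('trust proxy'), childApp.get('view engine'));
```

Inheritance here is live rather than copied: a setting added to the parent after mounting is visible to the child on its next lookup.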

Results Summary

| Step | Artifact | Tests | Statement coverage |
| --- | --- | --- | --- |
| 1 | Original Express | 1249 | 98.75% |
| 2 | Derived test suite | 667 | 95.12% |
| 3 | Reimplemented lib/ | 667 | 94.63% |
| 4 | + coverage tests | 696 | 97.68% |
| 5 | Demo app | n/a | n/a (works in browser + curl) |
| 6 | Parametric rebuild (iterating) | 696/696 | n/a |
| 7 | Parametric rebuild (one-shot) | 663/696 | n/a |
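
The percentages quoted in the step write-ups can be rechecked from the raw counts. The snippet below only redoes that arithmetic. The inputs are the figures reported in steps 3, 6, and 7, and `percent` is a helper defined here for the purpose.

```javascript
// Round a ratio to one decimal place, as the write-ups do.
const percent = (part, whole) => Number(((part / whole) * 100).toFixed(1));

const firstRunStep3 = percent(661, 667);  // step 3: passing on first run
const oneShotStep7 = percent(663, 696);   // step 7: passing on the single attempt
const sizeStep6 = percent(2683, 2762);    // step 6: line count relative to the original
const sizeStep7 = percent(1658, 2762);    // step 7: line count relative to the original

console.log({ firstRunStep3, oneShotStep7, sizeStep6, sizeStep7 });
```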

Key Findings

Tests-as-spec works. A test suite derived from source code was sufficient to reimplement Express from scratch with a 99.1% first-run pass rate. The reimplementation converged on the same architecture without ever seeing the original code.

What tests capture: Public API contracts, routing behavior, middleware composition, content negotiation, error propagation, header semantics, cookie handling, template rendering, trust proxy logic.

What tests miss: Deprecation warnings, error message wording, internal implementation choices (regex vs net.isIP()), race condition handling (onFinished in sendfile), dead code (View.prototype.resolve()).
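
The regex vs net.isIP() item shows how two implementations can be indistinguishable under a test suite yet still differ. The sketch below compares a deliberately loose IPv4 regex with Node's `net.isIP`. The regex is an example written for this README, not the one used in either implementation, and the hostnames are arbitrary samples.

```javascript
const net = require('node:net');

// Two interchangeable-looking IPv4 checks.
const looksLikeIPv4 = (host) => /^(\d{1,3}\.){3}\d{1,3}$/.test(host);
const isIPv4 = (host) => net.isIP(host) === 4;

// Inputs a behavior-focused test suite is likely to use: both checks agree.
const typical = ['127.0.0.1', '10.0.0.1', 'example.com', 'tobi.ferrets.example.com'];
const agreeOnTypical = typical.every((h) => looksLikeIPv4(h) === isIPv4(h));

// An input nobody wrote a test for: the loose regex accepts it, net.isIP does not.
const odd = '999.1.1.1';
const diverge = looksLikeIPv4(odd) !== isIPv4(odd);

console.log({ agreeOnTypical, diverge });
```

Unless a test exercises an input like the last one, the suite cannot specify which check the implementation should use, so the choice falls to whoever writes the code.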

Tests can drive better design. The reimplementation exposes query as a computed getter (vs middleware that parses once) and resolves settings by walking the prototype chain (vs direct lookup). Both are arguably improvements, and both emerged from what the tests require rather than from how the original happened to implement them.
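
A computed getter for query can be sketched as follows. This is a minimal illustration of the getter side of the comparison only. `requestProto` and `parseCount` are hypothetical, and `URLSearchParams` stands in for whatever parser the reimplementation actually uses.

```javascript
// Hypothetical request prototype; a real one would extend http.IncomingMessage.
const requestProto = {};
let parseCount = 0; // illustration-only counter

Object.defineProperty(requestProto, 'query', {
  configurable: true,
  enumerable: true,
  get() {
    parseCount += 1;
    const queryStart = this.url.indexOf('?');
    const raw = queryStart === -1 ? '' : this.url.slice(queryStart + 1);
    // URLSearchParams is a simple stand-in parser; a configurable
    // "query parser" setting is out of scope for this sketch.
    return Object.fromEntries(new URLSearchParams(raw));
  },
});

const sketchReq = Object.create(requestProto);
sketchReq.url = '/snippets?lang=js&page=2';
const firstQuery = sketchReq.query;

sketchReq.url = '/snippets?lang=go'; // e.g. after an internal URL rewrite
const secondQuery = sketchReq.query;

console.log(firstQuery, secondQuery, parseCount);
```

The trade-off is visible in `parseCount`: the getter always reflects the current URL and needs no middleware to be registered, at the cost of parsing again on every access unless the result is cached.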

Parametric memory gets you 95% of the way; iteration closes the rest. Steps 6 and 7 ask whether the model can rebuild Express from training data alone. One-shot from memory passes 663/696 (95.3%) — the architecture is faithful but ~5% of behavior (validation strictness, deprecation paths, a few edge cases) gets silently dropped. With test-failure iteration the same blind setup converges on 696/696. The signal isn't memory or tests in isolation; it's a feedback loop that grounds memory.

Directory Layout

full-ralph/
├── 1-original-repo/          # Pristine Express clone (baseline)
├── 2-clean-room-rebuild-tests/  # Express + derived test suite
├── 3-reimplement-from-derived-specs/  # Tests + reimplemented lib/
├── 4-improve-coverage/        # Above + 29 targeted coverage tests
├── 5-demo-app/                # Clipstash app on the reimplementation
│   ├── app.js
│   ├── routes/
│   │   ├── snippets.js        # HTML CRUD + router.param()
│   │   └── api.js             # JSON API + content negotiation
│   ├── views/                 # EJS templates
│   └── public/                # Static assets
├── 6-parametric-memory/       # lib/ rebuilt from model memory (with iteration)
├── 7-blind-from-memory/       # lib/ rebuilt from model memory (one-shot, no iteration)
├── blind-rebuild.md           # Prompt for step 6
├── blind-from-memory.md       # Prompt for step 7
├── CLAUDE.md                  # Detailed step-by-step log
└── README.md                  # This file

Running

# Step 1: Original baseline
cd 1-original-repo && npm test

# Step 2: Derived tests against original lib
cd 2-clean-room-rebuild-tests && npm test

# Step 3: Derived tests against reimplemented lib
cd 3-reimplement-from-derived-specs && npm test

# Step 4: Extended tests against reimplemented lib
cd 4-improve-coverage && npm test

# Step 4 with coverage
cd 4-improve-coverage && npm run test-cov

# Step 5: Demo app
cd 5-demo-app && node app.js
# Then: curl http://localhost:3000/api/snippets
#   or: open http://localhost:3000 in a browser

# Step 6: Parametric memory rebuild (iterated to 696/696)
cd 6-parametric-memory && npm test

# Step 7: Parametric memory rebuild (one-shot, 663/696)
cd 7-blind-from-memory && npm test

Tooling

Test framework: Mocha + supertest + assert
Coverage: Istanbul (nyc)
Template engine: EJS
Built with: Claude Code (claude-opus-4-6) using ralph-loop for multi-step orchestration and parallel subagents for test generation
