Migrated from Method backlog
This issue was created from a legacy filesystem backlog card. GitHub Issues are now the live work tracker; repository docs remain Method evidence.
Source backlog: docs/method/backlog/cool-ideas/PLATFORM_wesley-gen-test-loop-speedup.md
Original lane: cool-ideas
Original legend: PLATFORM
Original backlog card
wesley-gen test loop speedup
Legend: PLATFORM
What hurts
Round-trip on a single small echo-wesley-gen integration change today
looks like this:
- A single integration test (tests/generation.rs::test_*) routinely
  takes 60–120s before it even reports pass/fail.
- A workspace cargo build from cold ran 7m41s in the most recent
  code-lawyer review pass.
- A single git commit ran the pre-commit hook for 3m23s before
  prettier triggered an abort-and-restage cycle, doubling the wait.
The compound effect: small mechanical PRs (one-line code review fixes)
take many minutes between intent and confirmation. The verification
loop is the bottleneck, not the code.
Where the time goes
Profile of a single tests/generation.rs::test_no_std_id_list_field_…
run:
- The test calls
  Command::new("cargo").args(["run", "-p", "echo-wesley-gen", …])
  once per case. cargo run re-checks freshness across the whole
  dependency graph on every invocation, even when the binary is current.
- write_basic_generated_crate writes the generated module to
  target/echo-wesley-gen-basic-smoke/<PID>/<label>/. The PID
  segment guarantees the per-test crate cannot share build artifacts
  with any other test, so echo-wasm-abi, echo-registry-api, and
  serde are recompiled from scratch every single time.
- assert_generated_crate_checks runs cargo check against that
  per-PID crate. Because the crate has its own [workspace] block,
  it cannot reuse the parent workspace's target directory either.
Pre-commit hook profile (most recent run):
- verify-local cargo check -p echo-wesley-gen ran in 4m53s on a
  cold cache. This re-checks the whole transitive dep graph for the
  crate, not the staged subset.
- prettier runs over the full set of staged markdown files. When it
  has any reformatting to do, the hook aborts and the user must
  git add -A and re-commit, paying the cargo cost twice.
Concrete wins (rank-ordered by yield/effort)
1. Share a target dir for generated consumer crates
Drop the <PID> segment from
target/echo-wesley-gen-basic-smoke/<PID>/<label>/ and set
CARGO_TARGET_DIR for the generated crate's cargo check to a shared
cache (e.g. target/echo-wesley-gen-basic-smoke-shared-target). The
generated crate keeps its own [workspace] block, so it stays excluded
from the parent workspace. echo-wasm-abi, echo-registry-api, and serde
then compile once across all integration tests instead of once per test.
Expected delta: 60–120s per test → 5–15s per test on warm cache.
Risk: tests that mutate the same <label> directory must be serialized
or use unique labels. The existing labels are already unique by test
name, so this is mostly a matter of dropping the PID segment safely.
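A minimal sketch of the layout described above, assuming the harness derives its paths from a single target root. The helper names generated_crate_dir, shared_target_dir, and check_command are hypothetical and are not the existing harness functions; CARGO_TARGET_DIR is cargo's standard environment variable for overriding the target directory.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Directory for one generated consumer crate, keyed only by the test
/// label (no PID segment), so repeated runs land in the same place.
/// Hypothetical helper, not the current harness API.
fn generated_crate_dir(target_root: &Path, label: &str) -> PathBuf {
    target_root.join("echo-wesley-gen-basic-smoke").join(label)
}

/// Shared cargo target directory reused by every generated crate.
fn shared_target_dir(target_root: &Path) -> PathBuf {
    target_root.join("echo-wesley-gen-basic-smoke-shared-target")
}

/// Build (but do not run) the `cargo check` command for one generated crate.
/// All generated crates point at the same target directory, so common
/// dependencies only need to be compiled once.
fn check_command(target_root: &Path, label: &str) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("check")
        .current_dir(generated_crate_dir(target_root, label))
        .env("CARGO_TARGET_DIR", shared_target_dir(target_root));
    cmd
}
```

Because the crate directory is now keyed by label alone, two tests that pass the same label would write to the same directory, which is the risk noted above.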
2. Pre-build the echo-wesley-gen binary once per test binary
Replace each test's Command::new("cargo").args(["run", "-p", "echo-wesley-gen", …]) with a OnceLock<PathBuf> that runs
cargo build -p echo-wesley-gen --release (or --profile dev) on
first use, captures the binary path, then invokes that binary
directly. Subsequent tests skip the freshness re-check entirely.
Expected delta: ~3–5s saved per test invocation, across 20+
integration tests.
Risk: low. The cargo build on first use brings the binary up to date
with the current source; later tests in the same test binary reuse the
cached path from the OnceLock. Each cargo test run starts a fresh
process, so a source change between runs is picked up by that run's
first-use build.
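The caching step can be sketched as follows. This is an illustration under assumptions, not the harness's actual code: wesley_gen_binary and build_wesley_gen are hypothetical names, and build_wesley_gen assumes the dev profile, the default target directory under the workspace root, and a binary named after the package.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;
use std::sync::OnceLock;

/// Path to the pre-built generator binary, resolved at most once per test binary.
static GEN_BINARY: OnceLock<PathBuf> = OnceLock::new();

/// Return the cached binary path, running `build` only on the first call.
/// `build` is a parameter so the caching behaviour can be exercised
/// without invoking cargo.
fn wesley_gen_binary(build: impl FnOnce() -> PathBuf) -> &'static Path {
    GEN_BINARY.get_or_init(build).as_path()
}

/// Hypothetical builder: runs `cargo build -p echo-wesley-gen` and returns
/// the expected executable path. Assumes the dev profile and the default
/// `target/debug` location under `workspace_root`.
fn build_wesley_gen(workspace_root: &Path) -> PathBuf {
    let status = Command::new("cargo")
        .args(["build", "-p", "echo-wesley-gen"])
        .current_dir(workspace_root)
        .status()
        .expect("failed to spawn cargo");
    assert!(status.success(), "cargo build -p echo-wesley-gen failed");
    workspace_root
        .join("target")
        .join("debug")
        .join(format!("echo-wesley-gen{}", std::env::consts::EXE_SUFFIX))
}
```

A test would then call something like Command::new(wesley_gen_binary(|| build_wesley_gen(root))) in place of cargo run. If the integration tests live in the same package as the binary, cargo also exposes the built path at compile time through the CARGO_BIN_EXE_<name> environment variable, which may make the explicit build step unnecessary.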
3. Scope the pre-commit hook to staged crates
The hook currently runs cargo check -p echo-wesley-gen regardless of
what's staged. It should:
- Parse git diff --cached --name-only for paths under crates/<X>/.
- For each unique <X>, run cargo check -p <X> (not the whole
  workspace), and only if any source file in that crate is staged.
- Skip cargo entirely if only docs/markdown is staged.
Expected delta: 4m53s pre-commit → 0–60s for typical small PRs.
Risk: low for the obvious cases (single-crate edits, docs-only edits).
Cross-crate change sets still pay the full cost; that's correct.
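The selection logic in the steps above can be sketched as a pure function over the staged path list. The hook itself is a script, so this is only an illustration of the rule. Two assumptions are baked in and would need checking against the repository: each directory name under crates/ matches its package name, and "source file" means a .rs file or a Cargo.toml. The names is_source_file and crates_to_check are hypothetical.

```rust
use std::collections::BTreeSet;

/// Assumption for this sketch: Rust sources and manifests count as
/// "source files"; everything else (markdown, fixtures) does not.
fn is_source_file(path_in_crate: &str) -> bool {
    path_in_crate.ends_with(".rs") || path_in_crate.ends_with("Cargo.toml")
}

/// Given the lines printed by `git diff --cached --name-only`, return the
/// unique crate directory names under `crates/` that have at least one
/// staged source file. An empty set means cargo can be skipped entirely.
fn crates_to_check<'a>(
    staged_paths: impl IntoIterator<Item = &'a str>,
) -> BTreeSet<&'a str> {
    staged_paths
        .into_iter()
        // Keep only paths of the form crates/<X>/<rest>.
        .filter_map(|path| path.strip_prefix("crates/"))
        .filter_map(|rest| rest.split_once('/'))
        .filter(|(_, file)| is_source_file(file))
        .map(|(krate, _)| krate)
        .collect()
}
```

The hook would then run cargo check -p <X> once per returned name. Using a BTreeSet deduplicates crates and keeps the order stable between runs.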
4. cargo-nextest for integration tests
cargo-nextest runs each #[test] in its own process and parallelizes
them aggressively. The wesley-gen integration suite is structurally
parallel (each test compiles into its own consumer crate, no shared
mutable state if win #1 lands with disciplined labels), so nextest
should drop wall-clock from sequential to roughly total / cores.
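Expected delta: up to linear speedup with available cores on the suite
that currently dominates wall-clock time. Cargo invocations that share
one target directory (win #1) wait on cargo's build-directory lock, so
the realized gain may be smaller than total / cores.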
Risk: introduces a new dev-tool dependency. CI would also need to learn
about it. Worth doing after wins #1–#3, not before.
5. Don't re-run cargo work after a prettier abort
The current hook flow is: cargo verify → markdown lint → prettier
fixup → abort → user restages → cargo verify AGAIN (from scratch).
The prettier abort path should either:
- Skip the cargo verify on the retry if nothing under crates/
  changed between the two stagings, or
- Run prettier first (cheap) and only run cargo if markdown formatting
  is already clean.
Expected delta: cuts the abort-cycle cost roughly in half for the
docs-touching commits that hit prettier.
Risk: requires re-ordering the hook script with care; the cargo gate
must still fire on any final pre-commit invocation that includes Rust
changes.
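The second option (prettier first, cargo only on clean formatting) amounts to the ordering below. This is a sketch of the control flow only. The real hook is a script, and HookOutcome and run_hook are hypothetical names introduced for illustration. The two checks are passed in as closures so the ordering can be exercised without running prettier or cargo.

```rust
/// Outcome of one pre-commit run in this sketch.
#[derive(Debug, PartialEq, Eq)]
enum HookOutcome {
    /// Prettier had reformatting to do; the user must restage.
    /// The cargo gate was never started, so no cargo time was spent.
    RestageNeeded,
    /// Formatting was clean and the cargo gate passed or was not required.
    Pass,
    /// Formatting was clean but the cargo gate failed.
    CargoFailed,
}

/// Run the cheap formatting check first and the expensive cargo gate last.
/// The cargo gate still fires on every run that reaches it with Rust
/// changes staged, so the final accepted commit is always verified.
fn run_hook(
    formatting_clean: impl FnOnce() -> bool,
    rust_staged: bool,
    cargo_gate: impl FnOnce() -> bool,
) -> HookOutcome {
    if !formatting_clean() {
        return HookOutcome::RestageNeeded;
    }
    if rust_staged && !cargo_gate() {
        return HookOutcome::CargoFailed;
    }
    HookOutcome::Pass
}
```

With this ordering an abort-and-restage cycle costs one prettier pass plus one cargo pass, rather than two cargo passes.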
Why it matters now
This was identified during the PR #382 review-resolution pass, where
the verification loop dominated the actual work. Every additional
review thread or follow-up commit pays the same cost, so the loop
amortizes badly across iterative review work — which is exactly the
high-frequency case we want to optimize for.
Win #1 alone would be a step-change in interactive iteration speed.
Wins #1 + #2 + #3 are mechanical, low-risk, and individually
shippable; they don't depend on each other.
Suggested cycle shape
One cycle covering wins #1–#3 (target-dir sharing + binary
pre-build + scoped pre-commit). All three are local to scripts and
test harness code, and none touches generated wire formats or the
codec contract, so the determinism gates are not at risk. Win #4
(nextest) and win #5 (hook reorder) can ride as follow-ups once the
big rocks land.