test(setup): experimental end-to-end setup-flow matrix (all ecosystems, workspaces, monorepo)#98

Merged: Mikola Lysenko (mikolalysenko) merged 3 commits into main from test/setup-flow-matrix on Jun 2, 2026

@mikolalysenko Mikola Lysenko (mikolalysenko), Collaborator, commented Jun 2, 2026

What

A non-blocking, experimental end-to-end test matrix that verifies the intended socket-patch setup flow for every supported ecosystem/package manager, including nested workspaces and a polyglot monorepo:

  0. prepare a project with a dependency + a committed patch set → 1. socket-patch setup → 2. native install → 3. check whether the patch was applied.

No CLI/source behavior changes — purely test infrastructure (tests/, scripts/, two Dockerfiles, one CI job, a new setup-e2e feature).

Why aspirational + non-blocking

setup only configures npm-family install hooks today, so the non-npm *_with_setup cases are expected to fail. The suite encodes the ideal end state and records a per-case baseline of what works now, so each gap flips known_gap → pass automatically (no test edits) as setup grows support. The CI job is continue-on-error: true and must be left out of required status checks.
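The baseline classification described above can be sketched as a small shell function. This is a minimal illustration, not the code in `scripts/setup-matrix.sh`: the function names `classify` and `run_status`, the baseline values `pass`/`gap`, and the actual values `ok`/`fail` are all assumptions chosen for the example. Only the four result labels and the "non-zero only on regression" rule come from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: classify one case against its recorded baseline.
#   baseline: "pass" or "gap"  (what worked when the baseline was recorded)
#   actual:   "ok"   or "fail" (whether this run met the ideal outcome)
classify() {
  local baseline="$1" actual="$2"
  case "$baseline:$actual" in
    pass:ok)   echo pass ;;        # worked before, still works
    gap:fail)  echo known_gap ;;   # documented gap, still a gap
    gap:ok)    echo progress ;;    # a gap that setup now covers
    pass:fail) echo regression ;;  # worked before, broken now
    *)         echo error ;;       # unexpected input
  esac
}

# Overall status for a run: non-zero only if some case regressed.
run_status() {
  local status
  for status in "$@"; do
    if [ "$status" = regression ]; then
      return 1
    fi
  done
  return 0
}
```

Under this reading, a non-npm `baseline_with_setup` case reports `known_gap` today and would report `progress` the first time `setup` makes it apply, without any edit to the test itself.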

Coverage (114 cases)

Declarative in tests/setup_matrix/matrix.json (single source of truth for both the runner and the Rust wrappers):

  • Package managers: npm, yarn, pnpm, bun · pip, uv, poetry, pdm, hatch · cargo · bundler · go · mvn · composer · dotnet · deno
  • Layouts (SM_LAYOUT):
    • single — one project, one dependency (the 16-PM grid).
    • workspace — a nested workspace: root + several members (incl. a deeply-nested one and a member that doesn't use the patched package). Exercises setup's workspace handling (npm/yarn hook every member; pnpm only the root) + the cross-workspace apply on a root install. PMs: npm, pnpm, yarn (apply) and pip, uv (Python gaps).
    • monorepo — a polyglot all-ecosystem repo: an npm workspace alongside python/rust/go/php/ruby/nuget/deno manifests. Confirms setup works in a mixed environment — configures the npm hooks, doesn't choke on the foreign manifests; a root npm install patches the npm slice.
  • Scenarios: baseline_with_setup, plus two ablation controls that confirm setup is correct (identical to the passing case except one removed factor, each must run UNPATCHED): no_setup_control (setup ablated) and patch_missing (committed patch ablated). Also empty_patchset, wrong_target_patchset, alt_content_patchset. Workspace/monorepo carry the matching ablation pair.
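To make the ablation pairing concrete, here is a minimal sketch of the verify step: detect a marker on disk and compare it with what the scenario should produce. The helper names, the marker string, and the file path are hypothetical. Only the three scenario names and their expected outcomes are taken from this PR; the other patch-set scenarios are left out because their expected state is not spelled out here.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the verify step for one case.

# Report whether a patched file carries the marker string.
patch_state() {
  local file="$1" marker="$2"
  if [ -f "$file" ] && grep -qF -- "$marker" "$file"; then
    echo applied
  else
    echo unpatched
  fi
}

# Expected state per scenario: the passing case applies,
# and each ablation removes one factor and must stay unpatched.
expected_state() {
  case "$1" in
    baseline_with_setup)            echo applied ;;
    no_setup_control|patch_missing) echo unpatched ;;
    *)                              echo unknown ;;
  esac
}

# A case is correct when the observed state matches the expected one.
check_case() {
  local scenario="$1" file="$2" marker="$3"
  [ "$(patch_state "$file" "$marker")" = "$(expected_state "$scenario")" ]
}
```

Because the two controls differ from the passing case by exactly one factor, an `applied` result in either control would indicate a leak rather than a working hook.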

Results (verified in Docker, all 9 images on socket-patch 3.3.0)

| Ecosystem | PMs | setup+install applies? |
| --- | --- | --- |
| npm | npm/yarn/pnpm/bun (single and workspace) | ✅ yes |
| monorepo | npm slice in a polyglot repo | ✅ yes |
| pypi · cargo · gem · golang · maven · composer · nuget · deno | (the rest) | known_gap |
  • 0 regressions, 0 errors, 0 leaks. Every native install genuinely ran (install_exit=0).
  • All ablation/negative controls pass (run unpatched). Paired proof for single npm: baseline_with_setup → applied; no_setup_control & patch_missing → unpatched.
  • Existing docker_e2e_npm suite still passes against the extended images; clippy clean on all setup-matrix test targets.

How it works

  • tests/setup_matrix/run-case.sh — self-contained, layout-aware bash flow driver (scaffold → setup → install → verify → JSON). Generates npx/pnpm shims inline so the hook resolves to the binary under test (no registry fetch); apply runs offline+force against the committed .socket/ fixture (no Socket API).
  • scripts/setup-matrix.sh — CLI/agent-friendly orchestrator: build / run / list / query / results. Classifies pass/known_gap/progress/regression; exits non-zero only on regression.
  • crates/socket-patch-cli/tests/setup_matrix_*.rs (+ shared module) — thin Rust wrappers gated by a new setup-e2e feature; assert the aspirational ideal (red on gaps); Docker or SOCKET_PATCH_TEST_HOST=1.
  • Dockerfiles: Dockerfile.npm (+pnpm/yarn via corepack) and Dockerfile.pypi (+uv/poetry/pdm/hatch), additively; existing tests unaffected.
  • See tests/setup_matrix/README.md.

Try it

scripts/setup-matrix.sh build --ecosystem npm
scripts/setup-matrix.sh run   --ecosystem npm                 # npm family → pass
scripts/setup-matrix.sh run   --scenario workspace_with_setup # nested workspaces
scripts/setup-matrix.sh run   --scenario monorepo_with_setup  # polyglot monorepo
scripts/setup-matrix.sh run   --scenario patch_missing        # ablation → unpatched
scripts/setup-matrix.sh query --status known_gap              # what setup still can't do

What's still missing in setup (the gaps this maps out)

  • composer — closest win: composer.json has post-install-cmd, directly analogous to npm; setup just needs to write it.
  • pip/uv/poetry/pdm/hatch, cargo, gem, deno — no usable post-install hook (Python needs a .pth/wrapper; deno install ignores the root postinstall).
  • golang/maven/nuget — need a hook and integrity-sidecar repair (go.sum, .jar.sha1, .nupkg.sha512).
  • Cross-cutting: extend PackageManager beyond Npm/Pnpm, detect non-package.json manifests, stop hardcoding --ecosystems npm in the written hook.

Drive-by findings (not fixed here)

  • apply --force is the one SOCKET_* bool flag without BoolishValueParser, so SOCKET_FORCE=1 errors; the harness uses SOCKET_FORCE=true.
  • pnpm runs the root postinstall on pnpm install but not pnpm add (npm/yarn/bun run it on add).
  • In a workspace the install hook's apply must use the package manager's per-script cwd (member dirs no-op, root applies); pinning SOCKET_CWD to the root breaks npm install mid-run.
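On the first finding: clap's plain boolean parser accepts only `true` and `false`, while `BoolishValueParser` also accepts forms such as `1`, `yes`, and `on`. That is why `SOCKET_FORCE=1` errors today. The sketch below shows the looser parsing in shell, as a harness could use to normalize values before exporting them. The function name `boolish` is hypothetical and this is not code from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: normalize a bool-like string to "true" or "false",
# using the same accepted spellings as clap's BoolishValueParser.
boolish() {
  local v
  v=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$v" in
    y|yes|t|true|on|1)  echo true ;;
    n|no|f|false|off|0) echo false ;;
    *) echo "invalid boolish value: $1" >&2; return 1 ;;
  esac
}
```

A harness could then write `SOCKET_FORCE=$(boolish 1)` and stay compatible with the stricter parser until the flag is switched over.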

🤖 Generated with Claude Code

Adds a non-blocking, data-driven test matrix that verifies the intended
`socket-patch setup` flow end to end for every supported
ecosystem/package manager:

  0. prepare a project with a dependency + a committed patch set
  1. run `socket-patch setup` to configure install hooks
  2. run the native install command for the package manager
  3. check whether the patch was applied (marker on disk)

plus negative controls (no setup, empty/wrong-target/alt patch sets).

The suite is ASPIRATIONAL and intentionally non-blocking: `setup` only
configures npm-family hooks today, so non-npm `baseline_with_setup`
cases are expected `known_gap`s — a baseline of what `setup` must
eventually support. Results are classified against a recorded baseline;
the runner exits non-zero only on a regression.

Components:
- tests/setup_matrix/matrix.json — declarative cases (targets x scenarios),
  the single source of truth for both the runner and the Rust wrappers.
- tests/setup_matrix/run-case.sh — self-contained bash flow driver
  (scaffold -> setup -> install -> verify -> JSON); generates the
  npx/pnpm shims inline so the hook resolves to the local binary instead
  of fetching the published wrapper.
- scripts/setup-matrix.sh — orchestrator (build/run/list/query/results),
  classifies pass/known_gap/progress/regression, emits machine-readable
  JSON.
- crates/socket-patch-cli/tests/setup_matrix_<eco>.rs (+ shared module),
  gated by a new `setup-e2e` feature; assert the aspirational ideal.
- tests/docker/Dockerfile.{npm,pypi} extended additively (pnpm/yarn via
  corepack; uv/poetry/pdm/hatch) — existing docker_e2e tests unaffected.
- ci.yml: a `setup-matrix` job, `continue-on-error: true` (must stay out
  of required checks).

No CLI/source behavior changes. Verified in Docker on all 9 images
(socket-patch 3.3.0): 80 cases, 56 pass / 24 known_gap, 0 regression /
0 error; npm/yarn/pnpm/bun apply, everything else is a documented gap;
all negative controls pass (no leaks).
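The 80-case figure is consistent with crossing the 16 package managers with five scenarios. That is an inference from the counts, not something the commit states, and the real case list lives in `tests/setup_matrix/matrix.json`. A minimal sketch of such a targets x scenarios expansion, with the lists hard-coded for illustration and a hypothetical `expand_cases` helper:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: expand targets x scenarios into one case per line.
# The real matrix is declared in matrix.json; these lists are hard-coded
# here only to show the cross product.
pms="npm yarn pnpm bun pip uv poetry pdm hatch cargo bundler go mvn composer dotnet deno"
scenarios="baseline_with_setup no_setup_control empty_patchset wrong_target_patchset alt_content_patchset"

expand_cases() {
  local pm scenario
  for pm in $pms; do
    for scenario in $scenarios; do
      echo "$pm/$scenario"
    done
  done
}

# 16 package managers x 5 scenarios
expand_cases | wc -l | tr -d ' '   # prints 80
```

Keeping the cases declarative means adding a scenario or a target grows the matrix without touching the runner.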

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Extends the setup-flow matrix with an `SM_LAYOUT` dimension modelling
real-world deployments beyond a single project:

- workspace (npm, pnpm, yarn, pip, uv): a root + several members,
  including a deeply-nested member and one with no dependency on the
  patched package. Exercises `setup`'s workspace handling — npm/yarn
  write the hook to every member, pnpm only to the root — and the
  cross-workspace apply on a single root install. npm/pnpm/yarn apply
  (the dependency hoists / lands in the pnpm store and is patched once);
  pip (nested requirements) and uv (uv workspace, one shared .venv) are
  Python gaps.
- monorepo: a polyglot repo with an npm workspace alongside
  python/rust/go/php/ruby/nuget/deno manifests. Confirms `setup` works in
  a mixed environment — it configures the npm hooks and does not choke on
  the foreign manifests; a root `npm install` then patches the npm slice.
  Runs in the npm image; the foreign manifests are present to test
  setup's robustness, not installed.

Wiring: `matrix.json` gains workspace_targets/scenarios and
monorepo_targets/scenarios; `run-case.sh` gains layout-aware scaffold /
install / multi-target verification; `scripts/setup-matrix.sh` threads a
`layout` column (+ `query --layout`); the Rust harness gains
`run_workspace_pm` / `run_monorepo`, with `*_workspace` tests on the
npm/pypi wrappers and a new `setup_matrix_monorepo.rs`.

Real-world finding (and fix in the harness): the install hook's `apply`
must run with the package manager's per-script cwd — root for the
project, the member dir for each member — so member postinstalls find no
manifest and no-op while the root applies. The driver therefore does NOT
pin SOCKET_CWD; pinning it to the root makes every member apply target
the root manifest and fail mid-install with "no packages found on disk",
breaking `npm install` in a workspace.
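The per-script cwd behavior can be illustrated with a toy model. In this sketch `hook_apply` is a hypothetical stand-in for the hook's `apply` step, and the manifest path `.socket/manifest.json` is an assumption. It only demonstrates why resolving the manifest relative to the current directory makes member hooks no-op while the root applies.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the hook's apply step: it looks for a manifest
# relative to the current directory only (the path is an assumption).
hook_apply() {
  if [ -f ".socket/manifest.json" ]; then
    echo "apply"
  else
    echo "no-op"
  fi
}

# Run the hook once per script cwd: each member directory first, then the
# root ("."), mimicking a package manager running postinstall scripts.
run_hooks() {
  local root="$1"; shift
  local dir
  for dir in "$@" "."; do
    printf '%s: ' "$dir"
    ( cd "$root/$dir" && hook_apply )
  done
}
```

For a root that holds `.socket/manifest.json` and has members `packages/a` and `packages/b/nested`, `run_hooks "$root" packages/a packages/b/nested` reports `no-op` for both members and `apply` for `.`. Pinning the working directory to the root would instead make every invocation take the `apply` branch, which is the failure mode described above.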

Verified in Docker (socket-patch 3.3.0): npm/pnpm/yarn workspace and the
monorepo apply (pass); pip/uv workspace are known_gap; single-project
cases unchanged. 92 cases total; 0 regressions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Adds a `patch_missing` ablation (new `patchset: none` — no `.socket/`
fixture committed) to the single, workspace and monorepo scenario sets,
complementing the existing setup-not-run controls. Together they are the
controls that confirm `setup` is correct: each is identical to the
corresponding `*_with_setup` case except for the single removed factor
(the setup step, or the committed patch), and each must run UNPATCHED.

So every "it applies" case is now flanked by both ablations, e.g. for
single npm:
  baseline_with_setup  -> applied   (patch + setup)
  no_setup_control     -> unpatched (setup ablated)
  patch_missing        -> unpatched (patch ablated)

`run-case.sh` skips the fixture entirely for `patchset: none` (so the
hook's apply finds no manifest and no-ops — distinct from `empty`, where
the manifest exists but lists zero patches). No orchestrator/Rust changes
needed; the scenarios are data-driven and picked up automatically.

Matrix grows to 114 cases. Verified in Docker (3.3.0): all 22
patch_missing cases pass (run unpatched) — single 16/16, workspace 5/5,
monorepo 1/1; with_setup cases still apply; 0 regressions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mikolalysenko Mikola Lysenko (mikolalysenko) changed the title from "test(setup): experimental end-to-end setup-flow matrix across all ecosystems" to "test(setup): experimental end-to-end setup-flow matrix (all ecosystems, workspaces, monorepo)" on Jun 2, 2026
@mikolalysenko Mikola Lysenko (mikolalysenko) merged commit 5c30c8c into main Jun 2, 2026
52 checks passed
@mikolalysenko Mikola Lysenko (mikolalysenko) deleted the test/setup-flow-matrix branch June 2, 2026 18:56