test(cli): drain PTY stdout before resolving E2E `exit` (fixes flaky help test) by tobyhede · Pull Request #537 · cipherstash/stack

tobyhede · 2026-07-01T00:16:01Z

What

Fixes an intermittent failure in the CLI node-pty E2E suite where the captured PTY output buffer is only partially populated, so assertions on r.output see a truncated buffer (e.g. runner-aware-help.e2e.test.ts asserting pnpm dlx stash init but seeing only the env banner).

Fixes #536

Root cause

In packages/cli/tests/helpers/pty.ts the exit promise resolved directly on pty.onExit. onExit is not ordered after the final onData — node-pty (and the underlying OS pipe) can deliver the exit event while trailing stdout is still in flight. Tests do await r.exit then immediately read r.output (a getter over the raw buffer accumulated in onData), so on a loaded runner the trailing help text has not landed in raw when the assertion reads it.

This is a nondeterministic timing flake, not a regression: the same commit re-run produced Node 24 fail→pass and Node 22 pass→fail — the failure hops matrix nodes on identical code. The latent helper race dates to a4532a56 (original node-pty suite); the observed failure was introduced by 0560049 (runner-aware help E2E coverage).

Fix

Resolve exit only after the stdout stream has drained: on onExit, capture the exit event, then wait until no further data has arrived for a short settle window (SETTLE_MS = 50, reset by each new chunk) bounded by a hard cap (DRAIN_CAP_MS = 2000) so a chatty process can never hang the suite. A bare setImmediate/microtask is insufficient because pipe data may still be in flight after the exit event. The { exitCode, signal? } return shape is unchanged, so no per-test changes are needed.

Blast radius resolved

Single-file helper fix; every suite that does await r.exit then reads output benefits:

Test file	output reads	waitFor uses
`smoke.e2e.test.ts`	16	0
`runner-aware-help.e2e.test.ts`	5	0
`doctor.e2e.test.ts`	4	0
`database-url.e2e.test.ts`	5	1
`auth-login-cancel.e2e.test.ts`	1	1

Verification

Ran locally (macOS, darwin-arm64, Node 24; node-pty prebuild available):

pnpm --filter "stash..." build — success (JS + DTS).
tests/e2e/runner-aware-help.e2e.test.ts — 5 consecutive runs, 8/8 passing each.
tests/e2e/smoke.e2e.test.ts — 3 consecutive runs, 7/7 passing each.
Full tests/e2e/ suite — 6 files / 23 tests passing.
Biome clean on the changed file.

Changeset

None — test-infra only, does not affect published stash output.

Summary by CodeRabbit

Bug Fixes
- Quieted environment-file loading to prevent dotenv from emitting the “injected env” banner during CLI runs.
- Stabilized --help output so help text and runner-labeled examples remain consistent across package-manager runner scenarios.
Tests
- Updated E2E help and smoke coverage to assert absence of dotenv banner text.
- Switched runner-aware help tests to a new non-interactive CLI run(...) helper that captures complete output reliably in CI.

The E2E helper's `exit` promise resolved directly on `pty.onExit`, which is not ordered after the final `pty.onData`. On a loaded runner the OS pipe can deliver the exit event while trailing stdout is still in flight, so tests that do `await r.exit` then read `r.output` intermittently observe a truncated buffer (the flaky runner-aware-help failure, which hopped across Node 22/24 matrix nodes on identical code). Resolve `exit` only after the stdout stream has drained: capture the exit event, then wait until no further data has arrived for a short settle window (50ms, reset by each new chunk) bounded by a 2s hard cap so a chatty process can never hang the suite. A bare microtask is insufficient because pipe data may still be in flight after the exit event. Fixes the latent race for all E2E suites (smoke, runner-aware-help, doctor, database-url, auth-login-cancel) without per-test changes. Fixes #536

changeset-bot · 2026-07-01T00:16:06Z

⚠️ No Changeset found

Latest commit: aedfd9d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

coderabbitai · 2026-07-01T00:16:25Z

📝 Walkthrough

Walkthrough

Adds a pipe-based run() helper for non-interactive CLI tests, migrates runner-aware help tests to use it, suppresses dotenv banner output during CLI startup, and adds a smoke assertion that the banner text is absent.

Changes

CLI test runner helper and dotenv banner fix

Layer / File(s)	Summary
New non-PTY run() test helper `packages/cli/tests/helpers/run.ts`	Defines `RunOptions` and `RunResult`, documents `run()`, and implements pipe-based CLI execution with captured raw and stripped output.
Runner-aware help tests use run() `packages/cli/tests/e2e/runner-aware-help.e2e.test.ts`	Switches the runner-aware help and auth E2E cases from PTY `render()` to `run()` and asserts `exitCode` directly.
Suppress dotenv banner output `packages/cli/src/bin/main.ts`, `packages/cli/tests/e2e/smoke.e2e.test.ts`	Passes `quiet: true` to each dotenv load and adds a smoke assertion that the help output does not contain `injected env`.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

cipherstash/stack#370: Both PRs touch the CLI E2E harness and smoke output handling around dotenv banner noise.
cipherstash/stack#384: Both PRs modify runner-aware-help.e2e.test.ts and its runner-specific output assertions.

Suggested reviewers

calvinbrewer

A rabbit hops on a pipe-lined trail,
Quiet env banners now fail to hail.
Help text shines, clean and bright,
Tests hop through with exit codes right.
🐇

🚥 Pre-merge checks | ✅ 1 | ❌ 4

❌ Failed checks (4 warnings)

Check name	Status	Explanation	Resolution
Title check	⚠️ Warning	The title matches a test-flakiness fix, but the shown changes mostly alter dotenv logging and test helpers instead of draining PTY stdout.	Rewrite the title to describe the actual change, or update the PR to implement the PTY stdout-drain fix it claims.
Linked Issues check	⚠️ Warning	[`#536`] The PR does not implement the PTY exit-drain fix; it adds a pipe-based run helper and suppresses dotenv banners instead.	Update packages/cli/tests/helpers/pty.ts so exit resolves after stdout drains, preserving the existing API and avoiding per-test rewrites.
Out of Scope Changes check	⚠️ Warning	Several changes are unrelated to `#536`, including dotenv quieting in main.ts and switching tests to the new run helper.	Limit the PR to the PTY drain fix, or document these test-helper and dotenv changes as part of the intended scope.
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/cli-e2e-pty-drain

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/cli/tests/helpers/pty.ts`:
- Around line 139-153: The settle logic in the pty exit handler is anchored only
to lastDataAt, so the Promise in the onExit callback can resolve immediately if
output was already quiet before the exit event. Update the exit-handling flow in
the pty helper to record the onExit timestamp (or compute from max(lastDataAt,
exitAt)) and base the quietFor/grace period on that value so late in-flight
output still gets the full settle window. Keep the fix localized to the
onExit/check logic and the DRAIN_CAP_MS/SETTLE_MS timing path.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 448e418d-5bfa-4e84-84ab-f60676871946

📥 Commits

Reviewing files that changed from the base of the PR and between 93fc5f9 and db98dfc.

📒 Files selected for processing (1)

packages/cli/tests/helpers/pty.ts

The runner-aware `--help` E2E test flaked in CI, always dropping the tail of the help (the Examples section, where `bunx stash init` / `bunx stash db install` live) while the earlier Usage line survived. Root cause is a Linux-only node-pty behaviour, not a race: `--help` emits ~5KB in a single `console.log` and the process exits immediately. When the pty slave closes with data still unread, Linux discards the pty buffer, so under CI load node-pty never delivers the tail — `onData` fires for the head only. macOS ptys retain that buffer, which is why it never reproduced locally. A post-exit "drain" wait (the previous approach on this branch) cannot help: the bytes are gone before `onData` is ever called. Fix: add a pipe-based `run()` helper (`child_process.spawn`, piped stdio) for non-interactive command assertions. Pipes buffer in-process and deliver every byte before `close`, so large bursts are never truncated on any platform. Migrate `runner-aware-help.e2e.test.ts` to it. The PTY `render()` helper stays for interactive tests, and its speculative drain change is reverted. Also pass `quiet: true` to dotenv's `config()` calls: v17 prints an `injected env (N) from …` banner to stdout on every call, so the CLI was emitting four noisy, non-deterministic banner lines ahead of its own output on every invocation. A smoke-test guard keeps the banner from returning.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

packages/cli/tests/helpers/run.ts (1)
39-80: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Consider a hard timeout to prevent hangs, mirroring the pty.ts fix.

If the spawned command hangs, this promise never resolves and the child process is never killed — it leaks until vitest's own test timeout. The PR's own rationale for pty.ts's exit fix explicitly calls for "a hard timeout to prevent hanging"; the same defensive cap (with child.kill() on timeout) would give run() parity for chatty/stuck non-interactive commands.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/cli/tests/helpers/run.ts` around lines 39 - 80, The run() helper can
hang forever if the spawned CLI never exits, since the Promise only resolves on
child close and never force-kills the process. Add a hard timeout inside run()
around the spawn()ed child, and on expiry call child.kill() before rejecting or
resolving with a timeout result so stuck non-interactive commands cannot leak
until the test runner times out. Keep the change scoped to run() and preserve
the existing stdout/stderr capture and close/error handling.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/cli/tests/helpers/run.ts`:
- Around line 66-78: The Run helper in run.ts is treating signal-terminated
children as successful by defaulting `exitCode` to 0 in the `child.on('close')`
handler. Update the `RunResult` construction so `exitCode` reflects only an
actual numeric exit code and does not mask a non-null `signal`; use the `signal`
field from the close event as the indicator of termination. Keep the fix
localized to the `child.on('close', (code, signal) => ...)` logic and the
`RunResult` shape so tests using `r.exitCode` can distinguish success from crash
or kill.

---

Nitpick comments:
In `@packages/cli/tests/helpers/run.ts`:
- Around line 39-80: The run() helper can hang forever if the spawned CLI never
exits, since the Promise only resolves on child close and never force-kills the
process. Add a hard timeout inside run() around the spawn()ed child, and on
expiry call child.kill() before rejecting or resolving with a timeout result so
stuck non-interactive commands cannot leak until the test runner times out. Keep
the change scoped to run() and preserve the existing stdout/stderr capture and
close/error handling.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e339d247-3bd5-44df-a7f7-c84255214828

📥 Commits

Reviewing files that changed from the base of the PR and between db98dfc and c88d9d2.

📒 Files selected for processing (4)

packages/cli/src/bin/main.ts
packages/cli/tests/e2e/runner-aware-help.e2e.test.ts
packages/cli/tests/e2e/smoke.e2e.test.ts
packages/cli/tests/helpers/run.ts

✅ Files skipped from review due to trivial changes (1)

packages/cli/tests/e2e/smoke.e2e.test.ts

`child.on('close')` gives `code === null` when the child was killed by a signal. Defaulting `exitCode` to `code ?? 0` reported a crash/kill as a clean exit, so `expect(r.exitCode).toBe(0)` could pass over a SIGSEGV/SIGKILL. Keep `exitCode` as the real numeric code (or null on signal) and widen the type to `number | null` so tests can tell success from a crash.

The runner-aware `--help` E2E test flaked in CI, always dropping the tail of the help (the Examples section with `bunx stash init` / `bunx stash db install`) while the earlier Usage line survived. Root cause is a Linux-only node-pty behaviour, not a race: `--help` emits ~5KB in a single `console.log` and the process exits immediately. When the pty slave closes with data still unread, Linux discards the pty buffer, so under CI load node-pty never delivers the tail — `onData` fires for the head only. macOS ptys retain that buffer, which is why it never reproduced locally. Fix: add a pipe-based `run()` helper (`child_process.spawn`, piped stdio) for non-interactive command assertions. Pipes buffer in-process and deliver every byte before `close`, so large bursts are never truncated on any platform. `run()` keeps `exitCode` as the real numeric code (null on signal) so a crash can't masquerade as a clean exit. Migrate `runner-aware-help.e2e.test.ts` to it; the PTY `render()` helper stays for interactive tests. Also pass `quiet: true` to dotenv's `config()` calls: v17 prints an `injected env (N) from …` banner to stdout on every call, so the CLI was emitting four noisy, non-deterministic banner lines ahead of its own output on every invocation. A smoke-test guard keeps the banner from returning. Mirrors the fix on fix/cli-e2e-pty-drain (#537) so this branch's CI is green independently.

auxesis

Review

Verdict: COMMENT — no blocking issues. This is a tight, well-scoped PR: it adds quiet: true to the four dotenv config() calls to silence dotenv v17's injected env banner, and introduces a pipe-based run() E2E helper to replace flaky PTY capture for non-interactive commands. Both changes are sensible and the production change (main.ts) is directly asserted by the new smoke check. The findings below are all test-quality gaps — none point at a runtime bug.

After de-duplication, 2 distinct findings survived verification (one per source; they describe different issues, so there is no cross-model corroboration). A few additional verified-but-secondary test gaps are listed below rather than posted inline.

Review stats

Source	Raw	Survived
codex (gpt-5.5) [test-gap]	1	1
claude (claude-opus-4-8) [test-gap]	1	1

Cross-model overlap: 0 kept findings were corroborated by 2+ models — the two findings are distinct (dotenv-banner assertion vacuity vs. run() non-zero exit coverage).
Dropped: 0.

Additional findings not posted inline

These are verified but lower-severity; folded here to keep inline noise down (all single-source: claude, substantiated against the code):

run()'s signal / exitCode: null path is untested. The helper's headline behaviour (commit aedfd9d: keep exitCode null on a signal-killed child instead of masking it as 0) has no test. It's hard to reach through the current API — run() always spawns STASH_BIN, exposes no handle to signal the child, and (unlike its sibling runPiped) has no timeoutMs/SIGKILL escape hatch to force a null code. Consider extracting the code → exitCode mapping for a direct unit test, or driving a self-terminating child.
RunResult.stderr is never asserted independently. Every consumer reads only the combined output; if the stderr pipe wiring regressed (e.g. both on('data') handlers appended to the same buffer) no test would catch it. A failure-path test asserting error text lands in r.stderr would lock the channel split down.
run() substantially duplicates runPiped() in tests/helpers/spawn-piped.ts — both pipe-spawn STASH_BIN with stdio: ['ignore','pipe','pipe'], accumulate stdout/stderr separately, and return exitCode: number | null. run() adds ANSI stripping + combined output/raw + CI: 'true', but drops the timeout/SIGKILL guard runPiped carries (spawn-piped.ts:61-64), so a run()-invoked CLI that ever hangs would stall until vitest's own timeout rather than failing fast. Not a coverage issue — flagging so the author can decide whether to converge the two helpers or port the guard over.

(No crypto/security-relevant changes in this diff.)

auxesis · 2026-07-01T03:29:33Z

+    // `config()` call unless `quiet: true` is passed. Guard against that
+    // noise regressing back into the CLI's stdout (it also destabilises the
+    // PTY output capture these e2e tests rely on).
+    expect(r.output).not.toContain('injected env')


[single-source: codex] This regression assertion passes vacuously — it never exercises the dotenv-injection branch it means to guard. dotenv only prints the injected env (N) from … banner when it actually loads a file. There is no .env / .env.* anywhere in the repo (confirmed), and this test spawns the CLI in the default cwd, so dotenv injects nothing and the banner never appears — with or without quiet: true. not.toContain('injected env') therefore cannot fail even if the quiet: true flags in main.ts were reverted, so it doesn't actually protect against the regression.

To exercise the branch, run the CLI in a temp cwd that contains a .env:

const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'stash-dotenv-quiet-')) try { fs.writeFileSync(path.join(tmpDir, '.env'), 'STASH_DOTENV_QUIET_TEST=1\n') const r = await run(['--version'], { cwd: tmpDir }) expect(r.exitCode).toBe(0) expect(r.output).toContain(pkg.version) expect(r.output).not.toContain('injected env') } finally { fs.rmSync(tmpDir, { recursive: true, force: true }) }

That fails if any config() call regresses to dotenv's default noisy behaviour. (Using the new run() helper here also sidesteps the PTY drain flakiness this PR is otherwise moving away from.)

auxesis · 2026-07-01T03:29:33Z

+      res({
+        // `code` is null when the child was terminated by `signal`; keep it
+        // null rather than masking a kill/crash as a clean 0 exit.
+        exitCode: code,


[single-source: claude] run()'s non-zero exit-code propagation has no test — only exit-0 paths are exercised. Both run() consumers in runner-aware-help.e2e.test.ts assert exitCode === 0 (lines 33, 53), and every exit-1 case in the smoke suite still uses render. So this exitCode: code propagation — the whole point of the helper (commit aedfd9d: don't mask a signal-terminated child as a clean 0) — is uncovered. Classic lopsided coverage: the success exit is tested, the failure exit is not.

A drop-in e2e mirroring the existing unknown top-level command smoke case, but through run():

it('surfaces a non-zero exit code + error output for an unknown command', async () => { const r = await run(['definitely-not-a-command']) expect(r.exitCode).toBe(1) expect(r.output).toContain(messages.cli.unknownCommand) expect(r.output).toContain('definitely-not-a-command') })

Passes today; fails if run() ever regresses to coercing a non-zero/null code to 0 or drops the stream carrying the error text.

auxesis

LGTM, thanks for this @tobyhede.

I have left some automated review comments that should be addressed, but otherwise good to merge.

freshtonic · 2026-07-01T03:32:15Z

Review

What it does: Fixes a flaky CLI E2E test where PTY output capture truncated the --help text on loaded CI runners. The final approach (the title/body describe an earlier "drain" attempt that was superseded by later commits) is a new pipe-based run() helper in tests/helpers/run.ts that captures non-interactive command output over plain pipes instead of a PTY, migrates runner-aware-help.e2e.test.ts to it, silences the dotenv v17 injected env banner via quiet: true, and adds a regression assertion. This is a clean, well-documented test-infra change.

The code itself is solid — run() correctly preserves null exit codes for signal-terminated children, orders env overrides properly, and combines stdout+stderr sensibly for .toContain assertions. The help renderer doesn't wrap on terminal width, so the pipe/non-TTY switch doesn't change the asserted output.

Findings

packages/cli/tests/e2e/smoke.e2e.test.ts:16 — Fix is incomplete: the smoke --help test still reads the same ~5KB help burst over a PTY, so it remains exposed to the exact flake this PR diagnoses.

The PR's root-cause analysis is that a PTY drops the tail of a large --help burst on a loaded Linux runner. But only runner-aware-help.e2e.test.ts was migrated to run(). smoke.e2e.test.ts's --help test still uses render() (PTY) and asserts expect(r.output).toContain('db install') — content that sits deep in the same help output. On the same loaded runner, db install can be truncated and this test flakes identically. (The newly-added not.toContain('injected env') assertion is safe since the banner is emitted first, but the toContain assertions on trailing help text are not.) Consider migrating the smoke --help/--version/usage assertions to run() too, or noting explicitly why smoke is intentionally left on the PTY path.

Nothing else surfaced — the run() helper is correct and the dotenv quiet change is appropriate.

🤖 Generated with Claude Code

freshtonic

This feedback from @auxesis should be addressed, otherwise LGTM.

tobyhede requested a review from a team as a code owner July 1, 2026 00:16

coderabbitai Bot reviewed Jul 1, 2026

View reviewed changes

Comment thread packages/cli/tests/helpers/pty.ts Outdated

coderabbitai Bot reviewed Jul 1, 2026

View reviewed changes

Comment thread packages/cli/tests/helpers/run.ts

tobyhede requested review from auxesis and freshtonic July 1, 2026 02:41

auxesis reviewed Jul 1, 2026

View reviewed changes

auxesis approved these changes Jul 1, 2026

View reviewed changes

freshtonic requested changes Jul 1, 2026

View reviewed changes

Uh oh!

Conversation

tobyhede commented Jul 1, 2026 • edited by freshtonic Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Root cause

Fix

Blast radius resolved

Verification

Changeset

Summary by CodeRabbit

Uh oh!

changeset-bot Bot commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

coderabbitai Bot commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (4 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

auxesis left a comment

Choose a reason for hiding this comment

Review

Review stats

Additional findings not posted inline

Uh oh!

auxesis Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

auxesis Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

auxesis left a comment

Choose a reason for hiding this comment

Uh oh!

freshtonic commented Jul 1, 2026

Review

Findings

Uh oh!

freshtonic left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

tobyhede commented Jul 1, 2026 •

edited by freshtonic

Loading

changeset-bot Bot commented Jul 1, 2026 •

edited

Loading

coderabbitai Bot commented Jul 1, 2026 •

edited

Loading