feat: implement advisor plans 001-006 (security, deps, tests, refactors)#158
Change-Id: I3a53be8ace00ec7df33a6e995ff82d373001c20f Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Thomas Kosiewski <tk@coder.com>
…ones Add a [tasks.audit] mise task (aube audit --audit-level high) and wire it into both the [tasks.ci] chain and the linux-static CI job (after Lint). This fails the build on any future high/critical dependency advisory. Clear the 7 current advisories (3 high, 3 moderate, 1 low) by adding package.json overrides forcing patched transitive versions: brace-expansion 5.0.6, esbuild 0.28.1, vite 8.0.16, ws 8.21.0. aube audit now reports 0 vulnerabilities at all severities. Change-Id: Iff1b8f8043963786d348dd87939dab7b3df865bc Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Thomas Kosiewski <tk@coder.com>
Plan 001: restrict the per-Home socket directory to 0o700, the bound socket file to 0o600, and persisted state files (manifests, homes.json) to 0o600, regardless of umask. Adds an integration test asserting the permission bits and that the owner can still drive the session. Plan 004: export hostMain's pure decision helpers (normalizeExitSignal, isSessionCommandable, assertSessionCommandable, resolveHostRendererName) and add characterization unit tests plus an idle-timeout auto-exit integration test. No logic changes to hostMain. Change-Id: I4d8f8425e631752cf00ce653bd5654f8e86e230e Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Thomas Kosiewski <tk@coder.com>
Plan 005: extract the replay-event seq/ordering invariants into a single tested helper (src/renderer/replayEvents.ts) and use it from both renderer backends' replayTo so they iterate replay events through one shared path. Plan 006: move the embedded harness HTML and the harness-decoding layer out of the ghostty-web backend god file into sibling modules (embeddedHarnessHtml.ts, harnessDecoding.ts), cutting backend.ts from 2798 to ~1620 lines. Pure move-and-reimport; the three externally-consumed decode symbols are re-exported from backend.ts to keep importers resolving. Change-Id: Ib38d070fcb1289bc78556b8d1100e5cc9b4578a6 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Thomas Kosiewski <tk@coder.com>
001 harden local-state permissions (socket 0o700, state files 0o600) 002 CI dependency-audit gate + clear 7 advisories 003 RELEASE.md support contract version-agnostic 004 characterize hostMain decision helpers + idle-timeout 005 share replay-event iteration across renderer backends 006 extract harness HTML + decoding from ghostty-web backend Change-Id: Ife225ce75822896f8c7826639dc6d9ed2e00b64f Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Iacec0ce02995b6b64254c68d3c9301a52f7c7f89 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Thomas Kosiewski <tk@coder.com>
The remaining two plan documents, continued from the pull request description:
005-dedupe-replay-event-iteration.md
Plan 005: Both renderer backends iterate replay events through one shared, tested helper
Status
Why this mattersThe two renderer backends — Current stateBoth
for (const event of input.events) {
  assertNonNegativeInteger(
    event.seq,
    'replay event seq must be a non-negative integer',
  );
  invariant(
    event.seq > previousEventSeq,
    'replay events must be ordered by strictly increasing seq values',
  );
  previousEventSeq = event.seq;
  if (event.seq <= this.lastAppliedSeq) {
    continue;
  }
  if (event.seq > input.targetSeq) {
    await flushOutputBatch();
    break;
  }
  switch (event.type) {
    case 'output': {
      pendingOutputChunks.push(event.payload.data);
      break;
    }
    case 'resize': {
      await flushOutputBatch();
      /* assert + resizeBridge + set cols/rows */ break;
    }
    case 'marker': {
      await flushOutputBatch();
      break;
    }
    case 'input_text':
    case 'input_paste':
    case 'input_keys':
    case 'input_run':
    case 'run_complete':
    case 'signal':
    case 'exit': {
      await flushOutputBatch();
      break;
    }
    default: {
      unreachable(event, 'unsupported replay event type');
    }
  }
  highestProcessedSeq = event.seq;
}
await flushOutputBatch(); // line 1664 — flushes any pending output after the loop
for (const event of input.events) {
  assertNonNegativeInteger(event.seq, 'replay event seq must be non-negative');
  invariant(
    event.seq > previousEventSeq,
    'replay events must be ordered by strictly increasing seq values',
  );
  previousEventSeq = event.seq;
  if (event.seq <= this.lastAppliedSeq) {
    continue;
  }
  if (event.seq > input.targetSeq) {
    break;
  }
  switch (event.type) {
    case 'output':
      terminal.feed(event.payload.data);
      break;
    case 'resize':
      /* assert + terminal.resize + set cols/rows */ break;
    case 'marker':
    case 'input_text':
    case 'input_paste':
    case 'input_keys':
    case 'input_run':
    case 'run_complete':
    case 'signal':
    case 'exit':
      break;
    default:
      unreachable(event, 'unsupported replay event type');
  }
  highestProcessedSeq = event.seq;
}
Both surround the loop with:
Critical detail:
Types and conventions
Design constraint (honor this)
From
Commands you will need
Scope
In scope:
Out of scope (do NOT change behavior here):
Git workflow
Steps
Step 1: Create the shared iterator
Create
import type { ReplayInput } from './types.js';
import { invariant } from '../util/assert.js';

type ReplayEvent = ReplayInput['events'][number];

/**
 * Yield the replay events that fall in the half-open range
 * (lastAppliedSeq, targetSeq], in order. Enforces the seq invariants shared by
 * every renderer backend: each event seq is a non-negative integer, seqs are
 * strictly increasing across ALL events (including skipped ones), events at or
 * below lastAppliedSeq are skipped, and iteration stops at the first event
 * beyond targetSeq. Callers dispatch on event.type and own how output is fed.
 */
export function* iterateInRangeReplayEvents(
  input: ReplayInput,
  lastAppliedSeq: number,
): Generator<ReplayEvent> {
  let previousEventSeq = -1;
  for (const event of input.events) {
    invariant(
      Number.isInteger(event.seq) && event.seq >= 0,
      'replay event seq must be a non-negative integer',
    );
    invariant(
      event.seq > previousEventSeq,
      'replay events must be ordered by strictly increasing seq values',
    );
    previousEventSeq = event.seq;
    if (event.seq <= lastAppliedSeq) {
      continue;
    }
    if (event.seq > input.targetSeq) {
      return;
    }
    yield event;
  }
}
Verify:
Step 2: Use the helper in
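As a rough illustration of how a backend's replay loop could delegate to the shared iterator, the sketch below consumes it and dispatches on event.type. The ReplayEventSketch and ReplayInputSketch types and the applyReplay function are simplified stand-ins invented for this example, not the project's real types. The iterator is re-declared locally, with a plain throw in place of invariant, only so the snippet runs on its own.

```typescript
// Simplified stand-ins for the real ReplayInput types (assumption).
type ReplayEventSketch =
  | { seq: number; type: 'output'; payload: { data: string } }
  | { seq: number; type: 'marker' };

interface ReplayInputSketch {
  events: ReplayEventSketch[];
  targetSeq: number;
}

// Local copy of the shared iterator so this sketch is self-contained.
function* iterateInRangeReplayEvents(
  input: ReplayInputSketch,
  lastAppliedSeq: number,
): Generator<ReplayEventSketch> {
  let previousEventSeq = -1;
  for (const event of input.events) {
    if (!Number.isInteger(event.seq) || event.seq < 0) {
      throw new Error('replay event seq must be a non-negative integer');
    }
    if (event.seq <= previousEventSeq) {
      throw new Error('replay events must be ordered by strictly increasing seq values');
    }
    previousEventSeq = event.seq;
    if (event.seq <= lastAppliedSeq) continue;
    if (event.seq > input.targetSeq) return;
    yield event;
  }
}

// Hypothetical caller: owns how output is fed and tracks the highest seq applied.
function applyReplay(
  input: ReplayInputSketch,
  lastAppliedSeq: number,
): { fed: string[]; highestProcessedSeq: number } {
  const fed: string[] = [];
  let highestProcessedSeq = lastAppliedSeq;
  for (const event of iterateInRangeReplayEvents(input, lastAppliedSeq)) {
    switch (event.type) {
      case 'output':
        fed.push(event.payload.data);
        break;
      case 'marker':
        break;
    }
    highestProcessedSeq = event.seq;
  }
  return { fed, highestProcessedSeq };
}

// Events 1..5 with lastAppliedSeq = 1 and targetSeq = 4: only seqs 2..4 are yielded.
const replayResult = applyReplay(
  {
    targetSeq: 4,
    events: [
      { seq: 1, type: 'output', payload: { data: 'a' } },
      { seq: 2, type: 'output', payload: { data: 'b' } },
      { seq: 3, type: 'marker' },
      { seq: 4, type: 'output', payload: { data: 'c' } },
      { seq: 5, type: 'output', payload: { data: 'd' } },
    ],
  },
  1,
);
```

A backend that batches output, as the first loop shown earlier does, would still need to flush its pending chunks after the loop ends, because the generator simply returns at the first event beyond targetSeq.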
| Purpose | Command | Expected |
|---|---|---|
| Typecheck | npm run typecheck | exit 0 |
| Lint | npm run lint | exit 0 |
| Format (fix) | npm run format | exit 0 |
| Decode/backend unit tests | npx vitest run test/unit/renderer | all pass |
| e2e (visual) | npm run test:e2e | all pass |
| Line count | wc -l src/renderer/ghosttyWeb/backend.ts | ~1500 after |
Scope
In scope (create + modify):
- src/renderer/ghosttyWeb/embeddedHarnessHtml.ts (create): the HTML constant.
- src/renderer/ghosttyWeb/harnessDecoding.ts (create): the snapshot interfaces, decode helpers, generic assertion helpers, and validators listed above.
- src/renderer/ghosttyWeb/backend.ts (modify): remove the moved symbols, add imports, re-export the three externally-consumed names.
Out of scope (do NOT change in this plan):
- The GhosttyWebBackend class body and all its methods, especially the HTTP server / bridge methods (startServer, respondToRequest, buildHarnessUrl, isAllowedBrowserRequest, writeBridge, writeBatchBridge, resizeBridge, readHarnessSnapshot, etc.). Extracting those is a deliberate follow-up.
- The server/bridge interfaces and the 111–138 constants (they belong with the server methods that stay).
- Any behavior change. This is a move; logic must be byte-identical.
- src/renderer/libghosttyVt/backend.ts, src/export/webm.ts (only consume the class, which doesn't move).
- CHANGELOG.md (automation-owned).
Git workflow
- Branch: advisor/006-split-ghostty-web-backend
- Conventional Commits. Example: refactor: split harness HTML and decoding out of the ghostty-web backend. One commit per step is fine.
- Do NOT push or open a PR unless instructed.
Steps
Step 1: Extract the embedded harness HTML
- Create src/renderer/ghosttyWeb/embeddedHarnessHtml.ts containing the moved constant: export const EMBEDDED_HARNESS_HTML = `<!doctype html> …(the entire current value, verbatim)…`;
- Delete the const EMBEDDED_HARNESS_HTML = …; block (lines ~140–929) from backend.ts.
- In backend.ts, add import { EMBEDDED_HARNESS_HTML } from './embeddedHarnessHtml.js'; (with the other relative imports). loadHarnessHtml keeps using the name unchanged.
Verify: npm run typecheck → exit 0. wc -l src/renderer/ghosttyWeb/backend.ts → roughly 2020 lines (down ~790). npx vitest run test/unit/renderer → all pass.
Step 2: Extract the harness-decoding layer
- Create src/renderer/ghosttyWeb/harnessDecoding.ts. Move into it, verbatim, these symbols from backend.ts:
  - Interfaces: GhosttyHarnessVisibleLine, GhosttyHarnessSnapshotCell, GhosttyHarnessRichLine, GhosttyHarnessSnapshot.
  - GhosttyDecodedColumn, stripTrailingAsciiSpaces, assembleCanonicalLine.
  - assertNonNegativeInteger, assertPositiveInteger, assertPositiveNumber, assertHexColor, normalizeError.
  - loadHarnessHtml, validateHarnessLines, validateHarnessSnapshotCells, validateHarnessSnapshot.
  - Keep the existing export keyword on whatever was already exported; export everything backend.ts will need to import back.
- Add the new module's imports at its top: EMBEDDED_HARNESS_HTML from ./embeddedHarnessHtml.js, and invariant / unreachable (whichever are used) from ../../util/assert.js. Let npm run typecheck tell you exactly which util symbols and types are needed.
- In backend.ts, delete the moved blocks and add a single import from ./harnessDecoding.js for every moved symbol the class still references (the validators, the assertion helpers used in replayTo / screenshot, etc.). Run npm run typecheck and add/remove imports until it is clean.
Verify: npm run typecheck → exit 0; npx vitest run test/unit/renderer → all pass.
Step 3: Re-export the externally-consumed names and tidy
- In backend.ts, add a re-export so the existing decode test keeps resolving:
  export { assembleCanonicalLine, stripTrailingAsciiSpaces } from './harnessDecoding.js';
  export type { GhosttyDecodedColumn } from './harnessDecoding.js';
  (Do not edit test/unit/renderer/ghosttyWebDecode.test.ts; the re-export is what keeps its from '…/backend.js' import valid.)
- Run npm run format, then npm run lint → both exit 0.
Verify: npx vitest run test/unit/renderer/ghosttyWebDecode.test.ts → all pass (proves the re-export works).
Step 4: Full behavior gate
- npm run typecheck → exit 0.
- npm run lint → exit 0.
- npm run test:unit → all pass.
- npm run test:e2e → all pass (this exercises the real ghostty-web rendering / screenshot path end-to-end; it is the proof the move changed no behavior). If e2e cannot run in this environment, say so explicitly.
Test plan
This is a refactor with no new behavior, so the test plan is regression:
- test/unit/renderer/ghosttyWebDecode.test.ts (decode helpers): must pass unchanged via the re-export.
- test/unit/renderer/ghosttyWebBackend.test.ts, canonicalScreen.test.ts, and the rest of test/unit/renderer/: must pass unchanged.
- npm run test:e2e: must pass unchanged (visual/screenshot parity).
- No new test files are required. If you find a moved helper had zero coverage and you want to add a focused unit test for it in test/unit/renderer/, that's welcome but optional.
Done criteria
ALL must hold:
- src/renderer/ghosttyWeb/embeddedHarnessHtml.ts and src/renderer/ghosttyWeb/harnessDecoding.ts exist.
- grep -n "EMBEDDED_HARNESS_HTML = " src/renderer/ghosttyWeb/backend.ts → no match (constant moved out).
- wc -l src/renderer/ghosttyWeb/backend.ts → roughly 1500 lines (down from 2814).
- npm run typecheck, npm run lint, npm run format:check all exit 0.
- npm run test:unit exits 0, including test/unit/renderer/ghosttyWebDecode.test.ts (proves the re-export).
- npm run test:e2e passes (or its inability to run here is reported).
- git diff shows only moves/imports/re-exports: no logic edits inside any moved function, no change to the GhosttyWebBackend class methods.
- No CHANGELOG.md change; no out-of-scope files modified (git status).
- plans/README.md status row updated.
STOP conditions
Stop and report back (do not improvise) if:
- Moving a symbol creates a circular import that typecheck flags (harnessDecoding.ts must never import from backend.ts). If a moved function genuinely depends on something only the class has, leave that function in backend.ts and report it.
- Any test/unit/renderer/* or e2e test fails after a move: that means a move was not behavior-preserving (likely a missed import or an accidental edit). Do not change the test; find the move error or report it.
- The GhosttyWebBackend class needs edits beyond import lines to compile: that signals you've moved something that should have stayed; report it.
- backend.ts does not end up substantially smaller (e.g. still > 1800 lines): re-check that both Step 1 and Step 2 actually removed their blocks.
Maintenance notes
- Deferred to a follow-up plan: extracting the HTTP server + browser-bridge methods (startServer, respondToRequest, asset serving, buildHarnessUrl, isAllowedBrowserRequest, the *Bridge methods) into a server.ts / a bridge helper. Those touch this (the server, serverOrigin, page fields), so they need dependency extraction or a small server class, which is a higher-risk, separate change. This plan intentionally stops before that.
- A reviewer should confirm the diff is move-only: no function body changed, and the re-exports preserve the public import surface (ghosttyWebDecode.test.ts and any other importer of the three decode symbols still resolve).
- The two in-code comments that mention EMBEDDED_HARNESS_HTML (the canonical-line helpers note they must stay in sync with the harness copy) remain correct and should be left as-is.
Summary
The six improvement plans produced by an earlier read-only audit have been implemented on this branch. Each plan was executed in an isolated git worktree, verified against its own gates, and reviewed against its scope and done-criteria before being integrated here. The plan documents are committed under plans/ and reproduced in collapsible sections below (and in a follow-up comment) so the exact steps can be reviewed.
What changed, by plan
- 0o700, the socket file 0o600, and persisted state files (session manifests and the Home Registry) 0o600, so another local user can no longer connect to a session's control socket or read its state under a permissive umask. A Unix-guarded integration test asserts the modes and that the owning user is unaffected.
- aube audit --audit-level high step is wired into the mise ci task and the linux-static CI job, so high/critical advisories now fail the build instead of going unnoticed. The seven advisories present at audit time (all in transitive build/tooling dependencies) were cleared via package.json overrides; aube audit now reports zero vulnerabilities.
- RELEASE.md support contract no longer claims a stale "0.2.x" release line; the framing was made version-agnostic so release automation will not re-stale it.
- hostMain.ts are now exported (with no logic change) and covered by unit tests, and an idle-timeout auto-exit integration test was added. Coverage of the per-session orchestration core went from a single assertion to thirteen cases plus the new integration test.
- iterateInRangeReplayEvents); both backends now delegate to it. The change is behavior-preserving.
- src/renderer/ghosttyWeb/backend.ts from 2814 to ~1620 lines. The move is byte-identical, and the externally-consumed decode helpers are re-exported so existing imports keep resolving.
Verification
The integrated branch was run through the full gate: formatting (tracked files), lint, typecheck, workflow-lint, the new aube audit gate (0 vulnerabilities), build, unit tests (1329 passing), e2e (33 passing), and the packaging smoke test. All of these pass. Integration tests are 187/188.
The single integration failure (screen-hash.test.ts asserting that structured and text snapshots agree on screenHash) is pre-existing and unrelated to this change. It reproduces on the base commit c11e2e2 with byte-identical hashes, and the renderer refactor (005/006) produces those same hashes, which confirms rendering behaviour was preserved. It looks like a latent divergence between how the structured and text snapshot hashes are derived and is worth tracking separately.
Notes for review
Implementation plans 001–004
001-harden-local-state-permissions.md
Plan 001: agent-tty restricts its local socket and state files to the owning user
Status
c11e2e2, 2026-06-16
Why this matters
agent-tty gives a caller full control of a real PTY: the RPC server accepts type, paste, send-keys, run, resize, and signal, i.e. arbitrary input into your shell. That server listens on a Unix domain socket at a deterministic, world-traversable path under /tmp/agent-tty/, and the socket and the session state files are created with no explicit permissions, so they inherit the process umask. On a shared machine (multi-user dev box, a shared CI runner) with a permissive umask, another local user can connect to the socket and drive your session (effectively arbitrary command execution as you) or read your session manifests and Home Registry. The default umask (022) happens to block connecting (connect needs write on the socket file), but relying on ambient umask for an authorization boundary is fragile. This plan makes the boundary explicit: the per-Home socket directory becomes owner-only (0o700), and the socket file and persisted state files become owner-only (0o600), regardless of umask.
Current state
Files involved:
- src/host/hostMain.ts: per-session host entrypoint (runHost); creates the socket directory just before the RPC server listens.
- src/host/rpcServer.ts: RpcServer.listen() binds the Unix domain socket.
- src/storage/manifests.ts: writeTextFileAtomic, the single writer used for the session manifest and the Home Registry (homes.json).
- src/storage/sessionPaths.ts: builds the socket path /tmp/agent-tty/<sha256(home)[:8]>/<sha256(sessionId)[:12]> (read-only here; do not change path construction, it is already traversal-guarded).
Socket directory creation: src/host/hostMain.ts around line 1077 (inside runHost; mkdir is already imported from node:fs/promises on line 1):
sPath is the socket path (const sPath = socketPath(sessDir); earlier in runHost, ~line 143). dirname(sPath) is the per-Home socket directory. The mkdir passes no mode, so the directory inherits the umask.
Socket bind: src/host/rpcServer.ts:190-229 (server.listen sets no permissions on the created socket file):
this.socketPath is a private field set in the constructor. net is imported at the top of the file; node:fs/promises is not yet imported there.
State-file writer: src/storage/manifests.ts:100-120 (no mode on writeFile, so manifests and homes.json inherit the umask, typically 0o644 = world-readable):
mkdir, rename, rm, writeFile are imported from node:fs/promises on line 2 of this file.
Conventions to follow
- .js extensions. Prefer import type for type-only imports.
- invariant helper (src/util/assert.ts) for preconditions; match the surrounding small-helper, explicit-control-flow style.
- chmod is preferred over a mkdir mode option because mkdir's mode is masked by the umask, but chmod is not: chmod guarantees the final mode. Octal literals like 0o700 are the standard Node idiom.
Design constraints (from CONTEXT.md / AGENTS.md; honor these)
- per-Home socket directory holds one socket file per Session. Locking the directory to 0o700 is per-Home and is correct for all sessions in that Home.
- fs permission logic in command code; change it in writeTextFileAtomic (the single manifest/registry writer) and in the host socket setup only.
Commands you will need
- aube install
- npm run typecheck
- npm run lint
- npm run format
- npx vitest run test/integration/<file>.test.ts
- npm run test:integration
Scope
In scope (the only files you should modify):
- src/host/hostMain.ts: chmod the socket directory after creating it.
- src/host/rpcServer.ts: chmod the socket file after listen() resolves.
- src/storage/manifests.ts: write state files with mode: 0o600.
- test/integration/ (see Test plan).
Out of scope (do NOT touch):
- src/storage/sessionPaths.ts: path construction is already traversal-guarded with dirname(x) === root invariants. Do not change it.
- CHANGELOG.md: automation-owned (Communique/release-please). Never edit it in a feature change; a manual edit conflicts with main and breaks CI.
- do not apply; guard the new test to Unix only (see Test plan).
Git workflow
- advisor/001-harden-local-state-permissions
- Example from history: fix: drop the component suffix from the release branch name. Use e.g. fix: restrict agent-tty socket and state files to the owning user.
Steps
Step 1: Lock the per-Home socket directory to 0o700
In src/host/hostMain.ts:
- Add chmod to the existing node:fs/promises import on line 1 (import { chmod, mkdir } from 'node:fs/promises';).
- After the mkdir, add a chmod of that directory to 0o700:
(If sPath / dirname(sPath) is already bound to a local variable nearby, reuse it instead of recomputing; keep one dirname(sPath) expression.)
Verify: npm run typecheck → exit 0, no errors.
Step 2: Lock the socket file to 0o600 after bind
In src/host/rpcServer.ts:
- Add import { chmod } from 'node:fs/promises'; (place it with the other node: imports at the top).
- In listen(), immediately after the await new Promise<void>(...) that resolves when server.listen(...) succeeds (i.e. after the try/catch that binds the socket, before the method returns), chmod the socket file:
Place this after the bind succeeds (the socket file does not exist until listen resolves). Do not place it inside the catch.
Verify: npm run typecheck → exit 0. Then npm run test:integration → all pass (existing RPC/lifecycle integration tests still connect, because the owner retains read/write).
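Steps 1 and 2 can be sketched together as below. This is a minimal illustration, not the repository's code: ensureOwnerOnlyDir and listenOwnerOnly are hypothetical helper names, and the directory helper uses the synchronous fs API only to keep the demo short, whereas the plan itself uses node:fs/promises.

```typescript
import { chmodSync, mkdirSync, mkdtempSync, statSync } from 'node:fs';
import { chmod } from 'node:fs/promises';
import type { Server } from 'node:net';
import { tmpdir } from 'node:os';
import { dirname, join } from 'node:path';

// Hypothetical helper: create the directory, then force owner-only access.
// mkdir's mode option is filtered by the umask; chmod sets the bits exactly.
function ensureOwnerOnlyDir(dir: string): void {
  mkdirSync(dir, { recursive: true });
  chmodSync(dir, 0o700);
}

// Hypothetical helper: bind a Unix domain socket, then restrict the socket file.
// The chmod runs only after listen succeeds, because the file does not exist before.
async function listenOwnerOnly(server: Server, socketPath: string): Promise<void> {
  ensureOwnerOnlyDir(dirname(socketPath));
  await new Promise<void>((resolve, reject) => {
    server.once('error', reject);
    server.listen(socketPath, () => {
      server.off('error', reject);
      resolve();
    });
  });
  await chmod(socketPath, 0o600);
}

// Demo of the directory half in a throwaway temp location (Unix only).
const demoSocketDir = join(mkdtempSync(join(tmpdir(), 'perm-sketch-')), 'home');
ensureOwnerOnlyDir(demoSocketDir);
const demoSocketDirMode = statSync(demoSocketDir).mode & 0o777;
```

Calling chmod after creation, rather than passing a mode to mkdir, is what makes the result independent of the ambient umask.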
Step 3: Write persisted state files as 0o600
In src/storage/manifests.ts, change the writeFile call in writeTextFileAtomic to set an explicit mode:
The mode survives the subsequent rename to the final path (rename preserves the inode and its mode). No other change in this function.
Verify: npm run typecheck → exit 0.
Step 4: Format and full static check
Run npm run format then npm run lint → both exit 0.
Test plan
Add a focused integration test that creates a session and asserts the permission bits. Model it on an existing integration test that already spins up a session against an isolated AGENT_TTY_HOME: inspect test/integration/ for one that calls create then destroy (e.g. test/integration/gc.test.ts or a lifecycle test) and copy its setup/teardown shape (isolated temp home, absolute AGENT_TTY_HOME, never the real ~/.agent-tty).
New test file: test/integration/socket-permissions.test.ts (or add a case to the closest existing lifecycle integration test if the maintainer prefers).
Cover:
- 0o700: after create, locate the per-Home socket directory under /tmp/agent-tty/ for the test's Home and assert (statSync(dir).mode & 0o777) === 0o700.
- 0o600: assert the bound socket file's (mode & 0o777) === 0o600.
- 0o600: after create, assert the session manifest file's (mode & 0o777) === 0o600.
- run or inspect against the session still succeeds (proves the tightened perms didn't lock out the owner).
Guard the whole suite to Unix: at the top, if (process.platform === 'win32') skip (use vitest's describe.skipIf(process.platform === 'win32') or an early it.skip). Mode bits are not meaningful on Windows.
Verification: npx vitest run test/integration/socket-permissions.test.ts → all new cases pass.
Done criteria
ALL must hold:
- npm run typecheck exits 0.
- npm run lint exits 0.
- npm run format:check exits 0.
- npm run test:integration exits 0; the new socket-permissions test exists and passes on this (Unix) machine.
- grep -n "chmod" src/host/hostMain.ts src/host/rpcServer.ts shows the two new chmod calls.
- grep -n "mode: 0o600" src/storage/manifests.ts shows the manifest mode.
- git status).
- plans/README.md status row updated.
STOP conditions
Stop and report back (do not improvise) if:
- Step 2; that would mean the owner is being locked out or a non-owner path exists you weren't told about. Do not loosen the mode to make it pass.
- hostMain.ts:1077, or sPath is constructed differently than described.
- mode: 0o600 on the temp file changes behavior on rename (e.g. a test reads the manifest as a different user); report rather than reverting to default mode.
Maintenance notes
- listen resolves (the file doesn't exist before then) and that the directory chmod uses chmod, not the mkdir mode option (which umask would mask).
- /tmp/agent-tty/ or makes sockets per-session-directory instead of per-Home, revisit the directory chmod.
- /tmp/agent-tty/ root mode (left at default; per-Home 0o700 already prevents traversal into a Home's sockets) and any audit-logging of rejected connections. Not needed for the boundary this plan establishes.
002-dependency-audit-gate.md
Plan 002: CI fails on high-severity dependency advisories, and the current ones are cleared
Status
c11e2e2, 2026-06-16
Why this matters
This repo has no dependency-advisory gate: mise.toml has no audit task and .github/workflows/ci.yml never audits. As of this writing, aube audit reports 7 advisories (3 high, 3 moderate, 1 low) that went unnoticed for exactly that reason. None is a realistic exploit of the shipped CLI (they sit in transitive build/tooling and non-attacker-facing runtime paths, e.g. Playwright's own CDP WebSocket and build tools), but they are trivial to clear and should not be invisible. The durable win here is the gate: once CI runs aube audit --audit-level high, any future high/critical advisory in the dependency tree fails the build instead of silently shipping.
Current state
The advisories, from running aube audit at the repo root (you will re-run this in Step 1 to get the live list):
server.fs.deny bypass, Windows)
Installed versions (from aube-lock.yaml): esbuild@0.27.7, vite@8.0.11, ws@8.20.0, brace-expansion@5.0.5 (the brace-expansion moderate ReDoS, GHSA-jxxr-4gwj-5jf2, also appears in the full audit). All are transitive; none is listed directly in package.json dependencies/devDependencies.
mise.toml defines tasks as [tasks.<name>] with a run = "...". The aggregate CI task is:
There is no [tasks.audit].
.github/workflows/ci.yml: the linux-static job runs a sequence of mise run … steps (format-check, workflow-lint, lint, typecheck, validate-bundles, build, install-smoke). It must stay hand-curated (per AGENTS.md: "Keep .github/workflows/ci.yml hand-curated").
The audit tooling (verified)
aube audit supports:
- --audit-level <low|moderate|high|critical>: only fail/print at or above a severity (default low).
- --fix=update: refresh the lockfile to patched versions allowed by existing version ranges (no package.json changes).
- --fix=override: write package.json overrides forcing patched versions.
- --dev: audit only devDependencies.
aube audit mutates aube-lock.yaml / package.json only when --fix is passed; a bare aube audit is read-only.
Commands you will need
- aube audit
- aube audit --audit-level high
- aube audit --fix=update
- aube audit --fix=override
- aube install
- npm run typecheck
- npm run build
- npm run test:unit
- mise run workflow-lint
- mise run audit
Scope
In scope:
- package.json: only if --fix=override adds an overrides / pnpm.overrides block to clear advisories.
- aube-lock.yaml: regenerated by aube audit --fix / aube install.
- mise.toml: add [tasks.audit] and reference it from [tasks.ci].
- .github/workflows/ci.yml: add one audit step to the linux-static job.
Out of scope:
- playwright, ink, vitest, ghostty-web) to chase a transitive; overrides are the surgical fix. If only a direct-major bump can clear a high advisory, that is a STOP condition.
- CHANGELOG.md: automation-owned (Communique/release-please); never edit it.
- src/ code change. This plan is dependency + CI config only.
- quality-gates-macos): it intentionally omits release-only tooling; do not add the audit step there.
Git workflow
- advisor/002-dependency-audit-gate
- ci: gate CI on high-severity dependency advisories. If overrides are written, a second commit like chore(deps): override ws/vite/esbuild to patched versions is fine.
Steps
Step 1: Capture the current advisory baseline
Run aube audit and save the output. Confirm it roughly matches the table above (versions/advisories may have shifted slightly since planning; that's fine, work from the live list). Then run aube audit --audit-level high and note exactly which high advisories are reported: those are the ones the gate (Step 3) will require to be clear.
Verify: aube audit prints a non-empty advisory list including at least one high.
Step 2: Clear the advisories
- Run aube audit --fix=update (patches reachable within existing ranges).
- Re-run aube audit --audit-level high. If high advisories remain, run aube audit --fix=override to force the patched versions (this writes an overrides block to package.json).
- Run aube install to ensure the lockfile is consistent.
- Re-run aube audit --audit-level high.
Verify: aube audit --audit-level high reports 0 high (and 0 critical) vulnerabilities. (Moderate/low may remain; see Maintenance notes.)
Step 3: Confirm nothing broke
The overrides force newer transitive versions; confirm the toolchain still works:
- npm run typecheck → exit 0.
- npm run build → exit 0.
- npm run test:unit → all pass.
If feasible in this environment, also run npm run test:e2e (it exercises the ghostty-web/Playwright path that pulls vite/esbuild/ws). If e2e can't run here, note that in your report.
Verify: typecheck, build, and unit tests all green.
Step 4: Add the audit mise task
In mise.toml, add a task (place it near [tasks.lint]):
Then add mise run audit to the [tasks.ci] chain; put it right after mise run lint:
Verify: mise run audit → exit 0 (matches Step 2's clean high-level audit).
Step 5: Wire the audit into CI
In .github/workflows/ci.yml, in the linux-static job, add a step after the existing "Lint" step (run: mise run lint):
Keep the file hand-curated (don't regenerate it). Do not touch any other job.
Verify: mise run workflow-lint → exit 0 (actionlint + zizmor accept the new step).
Test plan
This change is config/dependency only; the "tests" are the audit and build gates themselves:
- aube audit --audit-level high → 0 high/critical.
- mise run audit → exit 0.
- npm run typecheck && npm run build && npm run test:unit → all green (proves the forced transitive versions are compatible).
- mise run workflow-lint → exit 0 (proves the CI edit is valid).
No new unit test file is required.
Done criteria
ALL must hold:
- aube audit --audit-level high reports 0 high and 0 critical advisories.
- grep -n "tasks.audit" mise.toml and grep -n "mise run audit" mise.toml both match (task defined and in the ci chain).
- grep -n "Audit dependencies" .github/workflows/ci.yml matches, under the linux-static job.
- mise run workflow-lint exits 0.
- npm run typecheck, npm run build, npm run test:unit all exit 0.
- No src/ files modified; no CHANGELOG.md change (git status).
- plans/README.md status row updated.
STOP conditions
Stop and report back (do not improvise) if:
- --fix=update / --fix=override and would require a major bump of a direct dependency (playwright, ink, vitest, ghostty-web): report the residual advisory and its reachability; the maintainer decides whether to gate at critical instead or accept the risk.
- npm run build or npm run test:unit fails and a quick, in-range version adjustment doesn't fix it (a forced version is incompatible).
- aube audit is unavailable in your environment (e.g. aube not installed): do not substitute npm audit (the repo has no package-lock.json; npm audit errors with ENOLOCK here). Report instead.
- critical in a direct dependency): surface it rather than silently fixing.
Maintenance notes
high deliberately: it blocks the genuinely actionable advisories without making CI hostage to every low-signal transitive moderate. If the team wants moderates gated too, change --audit-level high to moderate once the current moderates (brace-expansion ReDoS, ws uninitialized memory) are also cleared.
--fix=override pins transitive versions in package.json. When the upstream direct deps catch up to patched transitives, those overrides can be removed; a reviewer should periodically check whether the overrides block is still needed (aube audit after deleting it).
linux-static, not in quality-gates-macos (which intentionally installs a reduced toolset).
build/tooling and non-attacker-facing runtime paths; the value is the gate and hygiene, not an active-exploit fix. State that honestly.
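The trade-off between gating at high and gating at moderate can be illustrated with a small severity-threshold model. This is a sketch only: it is not how aube audit is implemented, and the type and function names are hypothetical. The sample data is the advisory mix this PR reports before the overrides were added (3 high, 3 moderate, 1 low).

```typescript
// Hypothetical model of a severity gate; NOT aube's implementation.
type Severity = "low" | "moderate" | "high" | "critical";

// Severities in ascending order; the array index doubles as a rank.
const SEVERITY_ORDER: readonly Severity[] = ["low", "moderate", "high", "critical"];

/** Returns the advisories at or above the gate level, i.e. the ones that would fail CI. */
function blockingAdvisories(found: readonly Severity[], gate: Severity): Severity[] {
  const threshold = SEVERITY_ORDER.indexOf(gate);
  return found.filter((severity) => SEVERITY_ORDER.indexOf(severity) >= threshold);
}

// Advisory mix before the overrides: 3 high, 3 moderate, 1 low.
const beforeOverrides: Severity[] = [
  "high", "high", "high",
  "moderate", "moderate", "moderate",
  "low",
];

console.log(blockingAdvisories(beforeOverrides, "high").length); // 3
console.log(blockingAdvisories(beforeOverrides, "moderate").length); // 6
```

Under this model, lowering the gate to moderate doubles the number of blocking advisories for the same dependency tree, which is why the notes suggest clearing the moderates first.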
003-fix-release-contract-version.md
Plan 003: RELEASE.md no longer claims a stale "0.2.x" release line
Status
c11e2e2, 2026-06-16
Why this matters
RELEASE.md is the user-facing support contract; README.md links to it ("The supported contract is in RELEASE.md"). Its opening still says the document covers the "current 0.2.x release line" and calls 0.2.0 "the first stable cut", but the product is at 0.4.3 (package.json) and the project now releases via release-please, which bumps the version automatically. A reader checking what's supported sees a version line that is two minor releases stale.
The body of the contract is capability-based and still accurate; only the
version framing in the first two lines is wrong. Making that framing
version-agnostic fixes the drift and prevents it from recurring on the next
release-please bump.
Current state
RELEASE.md:1-7 (the only stale part; the rest of the file is capability-based and correct):
package.json:3 is "version": "0.4.3". The linked files docs/RELEASE-PROCESS.md, CHANGELOG.md, and dogfood/CATALOG.md all exist (verified); do not change those links.
The rest of RELEASE.md (the "Supported capabilities", "Explicitly out of scope", "Known limitations", "Validation" sections, lines 9-39) is accurate and must not change. Note that line 20 already correctly references the shipped libghostty-vt semantic renderer.
Conventions to follow
*.md (see mise.toml format-check sources), so run the formatter after editing.
version bumps don't re-stale it. Do not hardcode 0.4.x (it would drift again); describe the contract without pinning a release-line number.
Commands you will need
npm run format
npm run format:check
Scope
In scope:
RELEASE.md: only lines 3-4 (the version framing).
Out of scope:
RELEASE.md (lines 9-39).
README.md, CHANGELOG.md (automation-owned), package.json, and any release workflow.
Git workflow
advisor/003-fix-release-contract-version
docs: make the RELEASE.md support contract version-agnostic.
Steps
Step 1: Make the opening version-agnostic
Replace lines 3-4 of RELEASE.md with version-agnostic phrasing. Target text (keep line 5, "If a workflow depends…", and everything below unchanged):
(The exact wording can vary, but it must not name a specific 0.2.x/0.x "current" release line. The first stable baseline reference to 0.1.x is historically accurate and fine to keep.)
Verify:
grep -n "0.2" RELEASE.md → returns nothing (no remaining 0.2.x/0.2.0 references).
Step 2: Format
Run npm run format, then npm run format:check → exit 0.
Verify:
npm run format:check → exit 0.
Test plan
No code; the checks are:
grep -n "0\.2\.[0-9x]" RELEASE.md → no matches.
npm run format:check → exit 0.
Done criteria
ALL must hold:
grep -nE "0\.2\.[0-9x]" RELEASE.md returns no matches.
RELEASE.md no longer contains the phrase "first stable cut" tied to a version (or any "current 0.x.y release line" claim).
npm run format:check exits 0.
RELEASE.md is modified (git status); no CHANGELOG.md change.
plans/README.md status row updated.
STOP conditions
Stop and report back if:
RELEASE.md's opening no longer matches the excerpt above (it was already edited).
(ls docs/RELEASE-PROCESS.md CHANGELOG.md dogfood/CATALOG.md): that's a separate doc-rot finding; report it, don't fix it here.
Maintenance notes
re-stale this file. If the team later wants an explicit version stamp, the durable way is a release-please-managed marker (like the <!-- x-release-please-version --> comment used in README.md) rather than hand-edited prose; that's a deliberate follow-up, not part of this plan.
004-hostmain-characterization-tests.md
Plan 004: hostMain's pure decision helpers and the idle-timeout path are covered by tests
Status
c11e2e2, 2026-06-16
Why this matters
src/host/hostMain.ts (1094 lines) is the per-session orchestration core: it owns the PTY, event log, renderer polling, RPC dispatch, idle timeout, and shutdown. Its entire unit test today is 9 lines asserting one exported constant (test/unit/host/hostMain.test.ts). The happy path is exercised indirectly by integration/e2e tests (which run the real CLI), but the file's
decision helpers — exit-signal normalization, the commandability predicate
that gates every input/control RPC, and renderer-name resolution with its
env/default fallback — have no targeted tests, and one observable orchestration
branch (idle-timeout auto-exit) has no dedicated coverage. These are exactly the
small, branch-y functions where a regression slips through "the integration
test still passed". Characterizing them now pins the current behavior and makes
later refactors safe.
Current state
src/host/hostMain.ts is one large runHost(sessionId) function with inner closures, plus a handful of module-level pure helpers near the top. Only MAX_CONSECUTIVE_POLL_FAILURES is currently exported:
Relevant imports already in hostMain.ts:
SessionState from ./sessionState.js (line 15)
isCommandableSessionStatus from ../protocol/sessionStatusPolicy.js (line 20), a pure predicate: isCommandableSessionStatus(status: SessionStatus): boolean (src/protocol/sessionStatusPolicy.ts:111). Commandable statuses are the running-family per CONTEXT.md ("A running Session is Commandable"; an exiting/destroying/terminal Session is not).
resolveRendererName, DEFAULT_RENDERER_NAME, RendererName (lines 42-44), HOST_RENDERER_ENV_KEY (line 40), ERROR_CODES/makeCliError (line 19).
Conventions to follow
describe/it/expect). See the existing host tests in test/unit/host/ for structure; runCompletionCoordinator.test.ts and eventLog.test.ts are substantial, idiomatic examples.
SessionState test double for the commandability tests, model on test/unit/commands/gc.test.ts, which already constructs SessionState instances; reuse that exact construction shape rather than inventing one.
CliError: check the .code against ERROR_CODES (e.g. ERROR_CODES.SESSION_NOT_RUNNING, ERROR_CODES.INVALID_INPUT). Look at an existing test that asserts on a thrown CliError for the pattern.
process.env, save and restore it (beforeEach/afterEach) so it doesn't leak into other tests.
.js import extensions, import type for types.
MAX_CONSECUTIVE_POLL_FAILURES is already exported for exactly that reason.
Commands you will need
npm run typecheck
npm run lint
npx vitest run test/unit/host/hostMain.test.ts
npm run test:unit
npm run test:integration
Scope
In scope:
src/host/hostMain.ts: add export to the four pure helpers only (normalizeExitSignal, isSessionCommandable, assertSessionCommandable, resolveHostRendererName). No logic changes.
test/unit/host/hostMain.test.ts: expand with the new unit tests.
test/integration/: one new test for the idle-timeout path (Step 3), or a case added to test/integration/lifecycle.test.ts.
Out of scope:
hostMain.ts. This plan only adds export and adds tests. If you find yourself changing logic, stop.
runHost or extracting the inner closures (that is plan 006's territory, and not required here).
CHANGELOG.md (automation-owned).
Git workflow
advisor/004-hostmain-characterization-tests
test: characterize hostMain decision helpers and idle-timeout exit.
Steps
Step 1: Export the four pure helpers
In src/host/hostMain.ts, add the export keyword to normalizeExitSignal (line 87), isSessionCommandable (96), assertSessionCommandable (100), and resolveHostRendererName (116). Change nothing else.
Verify:
npm run typecheck → exit 0.
npm run lint → exit 0.
Step 2: Unit-test the helpers
Rewrite test/unit/host/hostMain.test.ts to keep the existing MAX_CONSECUTIVE_POLL_FAILURES assertion and add describe blocks:
normalizeExitSignal:
null → null
0 → null
9 → '9', 15 → '15'
isSessionCommandable / assertSessionCommandable (build SessionState per test/unit/commands/gc.test.ts):
running SessionState → isSessionCommandable is true; assertSessionCommandable does not throw.
exited (and an exiting) SessionState → isSessionCommandable is false; assertSessionCommandable throws a CliError with code ERROR_CODES.SESSION_NOT_RUNNING and message 'Session is not running.'.
resolveHostRendererName (save/restore process.env around each case):
'libghostty-vt' → resolves to that name.
undefined with HOST_RENDERER_ENV_KEY set → resolves from the env var.
undefined, no env → resolves to DEFAULT_RENDERER_NAME.
'nope') → throws a CliError with code ERROR_CODES.INVALID_INPUT.
Verify:
npx vitest run test/unit/host/hostMain.test.ts → all pass (the original constant test plus the new ones).
Step 3: Integration-test the idle-timeout exit branch
create exposes --idle-timeout-ms <ms> (src/cli/main.ts:326). Add a test (model on test/integration/lifecycle.test.ts, which already drives create/inspect/destroy against an isolated absolute AGENT_TTY_HOME):
internal idle-check cadence; note IDLE_CHECK_CAP_MS = 5_000 in hostMain.ts, so the poll cadence is bounded at 5s; choose a timeout and a wait that are robust to that, e.g. a timeout of a few hundred ms and then poll inspect until the status is terminal, with a generous overall deadline).
exited) via inspect --json without any further input.
touch the real ~/.agent-tty.
If the idle-timeout behavior is not cleanly observable via inspect within a reasonable, non-flaky wait, stop and report (see STOP conditions) rather than adding a sleep-and-hope test. The unit tests in Step 2 are the required core; this integration test is the bonus branch.
Verify:
npx vitest run test/integration/<your-file>.test.ts → passes.
Run it a second time to confirm it is not flaky.
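The "poll inspect until the status is terminal, with a generous overall deadline" approach from this step can be sketched as a generic helper. This is a minimal illustration, not code from the repository: pollUntil and its option names are hypothetical, and the status source below is a fake standing in for a call to inspect --json.

```typescript
/** Options for pollUntil; the names are illustrative. */
interface PollOptions {
  intervalMs: number; // delay between attempts
  deadlineMs: number; // overall budget before giving up
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Calls read() until done(value) is true or the deadline passes.
 * Rejects on timeout so a stuck session fails the test with a clear message.
 */
async function pollUntil<T>(
  read: () => Promise<T>,
  done: (value: T) => boolean,
  { intervalMs, deadlineMs }: PollOptions,
): Promise<T> {
  const deadline = Date.now() + deadlineMs;
  let last = await read();
  while (!done(last)) {
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${deadlineMs}ms; last value: ${JSON.stringify(last)}`);
    }
    await sleep(intervalMs);
    last = await read();
  }
  return last;
}

// Fake status source standing in for `inspect --json`: reports "exited" on the third read.
let reads = 0;
const fakeInspect = async (): Promise<string> => (++reads >= 3 ? "exited" : "running");

pollUntil(fakeInspect, (status) => status === "exited", { intervalMs: 10, deadlineMs: 1_000 })
  .then((status) => console.log(status, reads)); // exited 3
```

Because the idle check cadence is bounded at 5s, a real test would presumably pick a deadlineMs comfortably above that bound, so the assertion depends on the observed status rather than on a fixed sleep.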
Step 4: Full static + suites
npm run lint, npm run typecheck, npm run test:unit, then npm run test:integration → all green.
Test plan
normalizeExitSignal (4+ cases incl. throw), commandability predicate + assertion (running / exiting / terminal), renderer-name resolution (explicit / env / default / invalid-throws).
inspect.
test/unit/host/runCompletionCoordinator.test.ts and test/unit/commands/gc.test.ts (for SessionState); integration → model on test/integration/lifecycle.test.ts.
npm run test:unit and npm run test:integration both pass, including the new cases.
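The commandability cases in the test plan (running / exiting / terminal) follow a predicate-plus-assertion pattern, sketched below. All names are stand-ins: the real predicate is isCommandableSessionStatus, the real helpers take a SessionState rather than a bare status, the real status union may be larger, and CliErrorSketch only imitates the .code shape the plan describes. The error code and message strings are the ones named in Step 2.

```typescript
// Hypothetical stand-ins; the real types live in src/protocol and src/host.
type SessionStatusSketch = "running" | "exiting" | "destroying" | "exited";

class CliErrorSketch extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.code = code;
  }
}

// Assumption: only a running session is commandable.
const isCommandable = (status: SessionStatusSketch): boolean => status === "running";

function assertCommandable(status: SessionStatusSketch): void {
  if (!isCommandable(status)) {
    throw new CliErrorSketch("SESSION_NOT_RUNNING", "Session is not running.");
  }
}

/** Runs fn and returns the thrown error's code, or undefined if nothing was thrown. */
function thrownCode(fn: () => void): string | undefined {
  try {
    fn();
    return undefined;
  } catch (error) {
    const code = (error as { code?: unknown }).code;
    return typeof code === "string" ? code : "UNKNOWN";
  }
}

console.log(thrownCode(() => assertCommandable("running"))); // undefined
console.log(thrownCode(() => assertCommandable("exited"))); // SESSION_NOT_RUNNING
```

Asserting on the code rather than only on "it throws" is what lets a test distinguish the intended rejection from an unrelated failure.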
Done criteria
ALL must hold:
grep -nE "^export function (normalizeExitSignal|isSessionCommandable|assertSessionCommandable|resolveHostRendererName)" src/host/hostMain.ts → 4 matches.
npx vitest run test/unit/host/hostMain.test.ts passes with the new cases (and still asserts MAX_CONSECUTIVE_POLL_FAILURES === 10).
npm run test:unit and npm run test:integration exit 0.
npm run typecheck and npm run lint exit 0.
git diff src/host/hostMain.ts shows only added export keywords (no logic change).
No CHANGELOG.md change; no files outside scope modified (git status).
plans/README.md status row updated.
STOP conditions
Stop and report back if:
excerpts (drift).
SessionState for the commandability tests requires more than the shape used in test/unit/commands/gc.test.ts (e.g. a live PTY): report and scope those two cases out rather than constructing a heavy fake.
fixed sleep and is flaky on a second run: drop Step 3, keep Steps 1–2, and report that Step 3 needs a deterministic hook.
CliError's .code doesn't work as described (the error shape differs): report the actual shape.
Maintenance notes
change intentionally alters, say, commandability rules, the test should be updated deliberately in the same change; a failure here on an unrelated PR is a real regression signal.
runHost (renderer-poll-failure recovery, shutdown reconciliation, concurrent-wait handling) remain unit-untestable without extracting them from the closure. That extraction is deliberately not in this plan; it's a candidate follow-up that would pair well with plan 006's refactoring approach.
Plans 005 and 006 follow in a comment below (GitHub PR body size limit).
🤖 Generated with Claude Code