Skip to content

Harden history lock windows and add in-process CLI runner foundation#67

Merged
unbraind merged 3 commits into
mainfrom
chore/reliability-lock-window-inprocess-runner-foundation
May 25, 2026
Merged

Harden history lock windows and add in-process CLI runner foundation#67
unbraind merged 3 commits into
mainfrom
chore/reliability-lock-window-inprocess-runner-foundation

Conversation

@unbraind
Copy link
Copy Markdown
Owner

Summary

  • harden history-redact, history-repair, and restore against read-then-lock TOCTOU windows by verifying history/item snapshots under lock before writing
  • add an explicit runPmCli(argv) invocation seam and in-process test runner helper path for pm-7rlp, including a dedicated subprocess-vs-in-process parity integration test
  • update pm tracker evidence in detail for pm-7rlp/pm-uer0, close stale completed items (pm-yf8t, pm-mf4j, pm-5k2w, pm-ash0, pm-0c9q), and backfill required closure metadata

Test plan

  • node scripts/run-tests.mjs test -- tests/unit/history-activity-command.spec.ts tests/unit/history-repair-command.spec.ts tests/unit/restore-command.spec.ts --reporter=dot
  • node scripts/run-tests.mjs test -- tests/integration/cli-inprocess-runner.integration.spec.ts tests/integration/release-readiness-runtime.spec.ts -t 'keeps subprocess and in-process runner behavior aligned for core flows|keeps Sentry startup lazy for fast CLI commands' --reporter=dot
  • node scripts/release/run-gates.mjs --telemetry-mode required --max-sentry-critical 10 --max-sentry-high 20 --max-telemetry-error-rate 10 --json
  • manual temp-directory reliability walkthrough covering init/create/update/history-redact/history-repair/restore/install/doctor/close/release
  • node scripts/smoke-codex-plugin-mcp.mjs
  • pm validate --check-resolution --check-history-drift --json
  • pm health --check-only --brief --json
  • node scripts/check-secrets.mjs
  • node scripts/release/sentry-telemetry-gate.mjs --json --telemetry-mode required --max-critical 10 --max-high 20 --max-telemetry-error-rate 10 --max-telemetry-missing-error-rows 0

Prevent stale pre-lock history/item snapshots from being written in history-redact, history-repair, and restore, add an exported runPmCli invocation seam with in-process CLI parity coverage, and record/normalize pm tracker closures and verification evidence for completed stale items.
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @unbraind, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

Review Change Stack

Warning

Review limit reached

@unbraind, we couldn't start this review because you've used your available PR reviews for now.

Your plan includes 1 review of capacity. Refill in 34 minutes and 51 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: c2d314f2-9345-4390-8cb5-1f944e1057a0

📥 Commits

Reviewing files that changed from the base of the PR and between 505c1d0 and 09b7e3d.

📒 Files selected for processing (9)
  • .agents/pm/chores/pm-7rlp.toon
  • .agents/pm/chores/pm-uer0.toon
  • .agents/pm/features/pm-rnpb.toon
  • .agents/pm/history/pm-7rlp.jsonl
  • .agents/pm/history/pm-rnpb.jsonl
  • .agents/pm/history/pm-uer0.jsonl
  • CHANGELOG.md
  • src/cli/commands/history-repair.ts
  • tests/helpers/cliRunner.ts
📝 Walkthrough

Walkthrough

This PR introduces in-process CLI test execution infrastructure by exporting a parameterized runPmCli() function and building test helpers to dynamically load and invoke it with captured output. Concurrently, it hardens read-then-lock TOCTOU patterns across history commands (history-redact, history-repair, restore) by snapshot-and-revalidate conflict detection.

Changes

In-process CLI runner foundation

Layer / File(s) Summary
CLI entrypoint refactoring
src/cli/main.ts, src/cli.ts
Export runPmCli(rawArgv) from main.ts; update cli.ts to dynamically import and await runPmCli(process.argv.slice(2)) for parameterized argv handling.
In-process test runner helper
tests/helpers/cliRunner.ts
Add runInProcessDistCli() that dynamically imports the built CLI, captures stdout/stderr by patching write methods, invokes runPmCli(args), and restores process state; include output normalization and environment override helpers.
Test context integration
tests/helpers/withTempPmPath.ts
Extend TempPmContext interface with runCliInProcess method; wire test context with runNodeCliInProcess wrapper for in-process CLI invocation with argument normalization and environment passthrough.
Integration parity test
tests/integration/cli-inprocess-runner.integration.spec.ts
New test suite verifying CLI behavior parity between subprocess and in-process runners for create/list-open flows and error handling on invalid flags.

Lock-window TOCTOU conflict detection

Layer / File(s) Summary
history-redact command hardening
src/cli/commands/history-redact.ts
Snapshot history stream before lock; re-read and compare under lock to detect concurrent mutations; throw conflict error if history or item changed while waiting; adjust rollback to restore pre-lock state.
history-repair command hardening
src/cli/commands/history-repair.ts
Capture pre-lock snapshots of history and item raw content; re-locate, re-load, and re-verify both under lock; throw conflict error on mismatch; adjust error recovery rollback logic.
restore command hardening
src/cli/commands/restore.ts
Add pre-lock snapshots of history and item state; re-read and compare under lock; throw conflict error if changed; re-resolve item under lock and initialize resolved state from locked read (with replay fallback).
Conflict regression test coverage
tests/unit/history-activity-command.spec.ts, tests/unit/history-repair-command.spec.ts, tests/unit/restore-command.spec.ts
New unit tests spy on lock acquisition, mutate state before lock proceeds, and verify commands reject with conflict error; update Vitest imports to include vi for mocking/spying.

Sequence Diagram(s)

sequenceDiagram
  participant srcCli as src/cli.ts
  participant srcCliMain as src/cli/main.ts
  participant distCli as dist/cli/main.js
  participant subprocess as subprocess runner
  participant inprocess as in-process runner
  
  srcCli->>srcCliMain: dynamic import & await runPmCli(argv)
  activate srcCliMain
  srcCliMain->>srcCliMain: bootstrap, register extensions, parse command
  srcCliMain->>srcCliMain: execute command (history-redact, restore, etc.)
  deactivate srcCliMain
  
  subprocess->>distCli: spawn child process, import normally
  inprocess->>distCli: dynamic import (cache-busted), call runPmCli(args)
  subprocess-->>subprocess: capture stdout/stderr via subprocess streams
  inprocess-->>inprocess: capture stdout/stderr via write() patching
  
  Note over subprocess,inprocess: Both runners verify command exit code & output match
Loading
sequenceDiagram
  participant command as history-redact/repair/restore
  participant store as item store
  participant lock as lock acquisition
  participant disk as disk file
  
  command->>disk: snapshot history & item state (before lock)
  command->>lock: acquire lock
  activate lock
  command->>disk: re-read history & item state (under lock)
  command->>command: compare pre-lock vs. under-lock snapshots
  alt state changed while waiting
    command->>command: throw CONFLICT error
  else state unchanged
    command->>disk: perform write operation
    command->>command: commit
  end
  deactivate lock
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • unbraind/pm-cli#44: Directly related history-repair audit and re-anchor work touching the same command flow.
  • unbraind/pm-cli#54: Related CLI execution and module-loading refactoring in src/cli/main.ts.
  • unbraind/pm-cli#48: Related CLI bootstrap changes in src/cli.ts around dynamic import and compile-cache enablement.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main changes: hardening history lock windows and adding an in-process CLI runner foundation.
Description check ✅ Passed The description provides meaningful detail about changes to history command handling, CLI runner infrastructure, and tracker evidence updates, all related to the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (1)
tests/integration/cli-inprocess-runner.integration.spec.ts (1)

26-30: ⚡ Quick win

Tighten invalid-flag parity check by asserting subprocess stderr too.

Right now this branch only checks the in-process output contains --bogus-flag. Add the same assertion for subprocess output so parity failures are caught symmetrically.

✅ Minimal test hardening
       const subprocessUsage = context.runCli(["list-open", "--bogus-flag"]);
       const inProcessUsage = await context.runCliInProcess(["list-open", "--bogus-flag"]);
       expect(inProcessUsage.code).toBe(subprocessUsage.code);
+      expect(subprocessUsage.stderr).toContain("--bogus-flag");
       expect(inProcessUsage.stderr).toContain("--bogus-flag");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/cli-inprocess-runner.integration.spec.ts` around lines 26 -
30, The test checks that inProcessUsage.stderr contains "--bogus-flag" but
doesn't assert the same for subprocessUsage, allowing asymmetric parity; update
the test that uses context.runCli([...]) and context.runCliInProcess([...]) to
also assert that subprocessUsage.stderr contains "--bogus-flag" (mirror the
inProcess assertion) so both subprocessUsage and inProcessUsage stderr outputs
are compared for the invalid-flag case.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.agents/pm/chores/pm-ash0.toon:
- Around line 5-24: The PR edited pm tracker artifacts directly (the pm system
of record) instead of using the pm workflow; revert the manual edits to the
tracker artifacts and reapply the naming migration through the pm system
workflow so generated artifacts are produced and committed by that flow; run the
pm workflow/CLI to rename plugin IDs to pm-claude/pm-codex, MCP identity to
pm-mcp and packages to `@unbrained/pm-`*, regenerate plugin metadata (plugin.json)
and MCP server artifacts, then rerun the smoke and test suite to confirm
acceptance criteria and commit only the generated outputs.

In @.agents/pm/history/pm-0c9q.jsonl:
- Around line 90-95: The patch illegally mutates PM tracker JSON fields
(adds/replaces keys like metadata.assignee, metadata.status,
metadata.close_reason and other metadata entries) directly in the repository;
revert those direct edits and instead perform the closure/update via the
authoritative pm workflow/tooling (open or update the PM record through the pm
workflow UI or API) so that metadata changes (assignee, status, resolution,
expected_result, actual_result) are applied through the system-of-record rather
than committing edits to the JSON history file.

In @.agents/pm/history/pm-rnpb.jsonl:
- Around line 621-622: Revert the manual additions to the history JSON (the
inserted metadata.comments entries 273 and 274 with timestamps
2026-05-25T10:08:53.523Z and 2026-05-25T10:19:49.187Z) so the file no longer
contains hand-edited records, and regenerate or add those records using the
official pm workflow/tool (run the pm workflow that appends to the history
stream) instead of editing metadata/comments directly; ensure future changes to
the history stream are produced by the pm system-of-record process, not by
committing edits to the history JSON.

In @.agents/pm/history/pm-uer0.jsonl:
- Around line 2-13: Revert the direct edits made to the PM system-of-record file
.agents/pm/history/pm-uer0.jsonl (the added/modified metadata entries like
metadata.assignee, metadata.status, metadata/files, metadata/comments, etc.) and
restore the file to its canonical state (undo the commit or fetch the version
from main/master); then re-apply any required changes through the sanctioned PM
workflow/tooling (create the claim/update via the PM system so metadata updates
appear through the proper API rather than direct file edits).

In `@tests/helpers/cliRunner.ts`:
- Around line 123-143: The global stream monkey-patch in the cli runner replaces
process.stdout.write and process.stderr.write (see the anonymous write wrappers
that use chunkToString and append to stdout/stderr) and must be serialized to
avoid races; implement a module-level mutex/queue so only one in-process capture
runs at a time: before replacing the writers acquire the lock (enqueue), perform
the patching and test invocation, then restore the original writers and release
the lock (dequeue); ensure the lock protects both stdout and stderr patches and
that the original writer references are stored and restored even on exceptions
so stdout/stderr and chunkToString behavior remain consistent.
- Around line 117-177: The finally block must not perform control flow or
construct/return the result; move the global state mutations (setting
process.argv, process.exitCode, process.chdir(options.cwd), overriding
process.stdout.write/process.stderr.write using
previousStdoutWrite/previousStderrWrite) into the protected try block before
calling loaded.runPmCli(args), ensure capturedError is set in the catch, and
restrict finally to only restore globals (reset writes, process.argv,
process.exitCode, cwd, restoreEnv). After the try/catch/finally sequence, build
and return the DirectCliRunResult (including JSON parse for options.expectJson)
so no return occurs inside finally and setup failures still get restored.

---

Nitpick comments:
In `@tests/integration/cli-inprocess-runner.integration.spec.ts`:
- Around line 26-30: The test checks that inProcessUsage.stderr contains
"--bogus-flag" but doesn't assert the same for subprocessUsage, allowing
asymmetric parity; update the test that uses context.runCli([...]) and
context.runCliInProcess([...]) to also assert that subprocessUsage.stderr
contains "--bogus-flag" (mirror the inProcess assertion) so both subprocessUsage
and inProcessUsage stderr outputs are compared for the invalid-flag case.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 30422769-d7e6-4ee4-a9c9-ddc96b59e159

📥 Commits

Reviewing files that changed from the base of the PR and between ae3d884 and 505c1d0.

📒 Files selected for processing (27)
  • .agents/pm/chores/pm-5k2w.toon
  • .agents/pm/chores/pm-7rlp.toon
  • .agents/pm/chores/pm-ash0.toon
  • .agents/pm/chores/pm-mf4j.toon
  • .agents/pm/chores/pm-uer0.toon
  • .agents/pm/features/pm-rnpb.toon
  • .agents/pm/history/pm-0c9q.jsonl
  • .agents/pm/history/pm-5k2w.jsonl
  • .agents/pm/history/pm-7rlp.jsonl
  • .agents/pm/history/pm-ash0.jsonl
  • .agents/pm/history/pm-mf4j.jsonl
  • .agents/pm/history/pm-rnpb.jsonl
  • .agents/pm/history/pm-uer0.jsonl
  • .agents/pm/history/pm-yf8t.jsonl
  • .agents/pm/issues/pm-yf8t.toon
  • .agents/pm/tasks/pm-0c9q.toon
  • src/cli.ts
  • src/cli/commands/history-redact.ts
  • src/cli/commands/history-repair.ts
  • src/cli/commands/restore.ts
  • src/cli/main.ts
  • tests/helpers/cliRunner.ts
  • tests/helpers/withTempPmPath.ts
  • tests/integration/cli-inprocess-runner.integration.spec.ts
  • tests/unit/history-activity-command.spec.ts
  • tests/unit/history-repair-command.spec.ts
  • tests/unit/restore-command.spec.ts

Comment on lines +5 to +24
status: closed
priority: 0
tags[6]: branding,mcp,naming,packages,plugins,pm-cli
created_at: "2026-05-16T22:18:14.976Z"
updated_at: "2026-05-16T22:18:14.976Z"
updated_at: "2026-05-25T10:21:14.220Z"
author: maintainer-agent
acceptance_criteria: All MCP server registrations use 'pm-mcp'; all plugin dirs and plugin names use pm-claude/pm-codex; all packages use @unbrained/pm-X; marketplace name normalised to pm-local where applicable; full test suite green; smoke tests for both plugins pass; plugin version bumped so existing users pick up rename on next plugin update.
parent: pm-rnpb
resolution: Completed naming migration to pm-claude/pm-codex/pm-mcp and @unbrained/pm-* in tracked source surfaces.
expected_result: No active source surface should depend on legacy pm-cli-* or @unbrained/pm-package-* naming.
actual_result: Legacy names remain only in historical tracker artifacts; active runtime/docs/plugin surfaces use unified naming.
comments[1]{created_at,author,text}:
"2026-05-25T10:10:42.863Z",codex-reliability-agent,"Closure evidence 2026-05-25: naming migration is complete in tracked source. Search for legacy names (pm-cli-claude/pm-cli-codex/pm-cli-native/@unbrained/pm-package-) now returns only historical .agents/pm artifacts; runtime files use pm-claude, pm-codex, pm-mcp and @unbrained/pm-* naming. Added file links plus codex plugin MCP smoke evidence."
files[3]{path,scope,note}:
plugins/pm-claude/.claude-plugin/plugin.json,project,renamed plugin id uses pm-claude
plugins/pm-codex/.codex-plugin/plugin.json,project,renamed plugin id uses pm-codex
src/mcp/server.ts,project,MCP server identity uses pm-mcp
tests[1]{command,path,scope,timeout_seconds,pm_context_mode,env_set,env_clear,shared_host_safe,assert_stdout_contains,assert_stdout_regex,assert_stderr_contains,assert_stderr_regex,assert_stdout_min_lines,assert_json_field_equals,assert_json_field_gte,note}:
node scripts/smoke-codex-plugin-mcp.mjs,null,project,180,null,null,null,null,null,null,null,null,null,null,null,codex plugin mcp smoke
close_reason: Naming unification is complete in current source tree; only historical tracker history retains legacy names.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not directly modify .agents/pm tracker artifacts.

These are direct edits under .agents/pm/**, which violates repository workflow policy. Please apply updates through the pm system workflow and commit generated artifacts from that flow instead of hand-editing these files.

As per coding guidelines, .agents/pm/**: pm is the system of record. Do not edit .agents/pm files directly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/pm/chores/pm-ash0.toon around lines 5 - 24, The PR edited pm tracker
artifacts directly (the pm system of record) instead of using the pm workflow;
revert the manual edits to the tracker artifacts and reapply the naming
migration through the pm system workflow so generated artifacts are produced and
committed by that flow; run the pm workflow/CLI to rename plugin IDs to
pm-claude/pm-codex, MCP identity to pm-mcp and packages to `@unbrained/pm-`*,
regenerate plugin metadata (plugin.json) and MCP server artifacts, then rerun
the smoke and test suite to confirm acceptance criteria and commit only the
generated outputs.

Comment on lines +90 to +95
{"ts":"2026-05-25T10:09:46.767Z","author":"codex-reliability-agent","op":"claim","patch":[{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:09:46.767Z"},{"op":"add","path":"/metadata/assignee","value":"codex-reliability-agent"}],"before_hash":"16873d11644a22dda525ec2e6146104336563442a1d99fa42ffe85c89ffd93ac","after_hash":"b8e07653d0459b294ff3f5f69473e81c4cb9793eb1db6a13589aafacd413d7d9"}
{"ts":"2026-05-25T10:10:13.991Z","author":"codex-reliability-agent","op":"tests_add","patch":[{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:10:13.991Z"}],"before_hash":"b8e07653d0459b294ff3f5f69473e81c4cb9793eb1db6a13589aafacd413d7d9","after_hash":"6964e452e4a056c8c6b753ef4bce06469ffed0ef00b3b1085f0906a36c80e92d"}
{"ts":"2026-05-25T10:10:45.822Z","author":"codex-reliability-agent","op":"comment_add","patch":[{"op":"add","path":"/metadata/comments/15","value":{"created_at":"2026-05-25T10:10:45.822Z","author":"codex-reliability-agent","text":"Closure evidence 2026-05-25: native Codex plugin/MCP functionality remains healthy on main. smoke-codex-plugin-mcp passed in this run, docs/routes and plugin assets are present, and existing linked tests for plugin packaging/contracts remain recorded."}},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:10:45.822Z"}],"before_hash":"6964e452e4a056c8c6b753ef4bce06469ffed0ef00b3b1085f0906a36c80e92d","after_hash":"d2b0cc3390b98f7210ae4e20842e758706bb4c71a1dddeb6b92c36e90f9dad7d"}
{"ts":"2026-05-25T10:10:47.855Z","author":"codex-reliability-agent","op":"close","patch":[{"op":"remove","path":"/metadata/assignee"},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:10:47.855Z"},{"op":"replace","path":"/metadata/status","value":"closed"},{"op":"add","path":"/metadata/close_reason","value":"Native Codex plugin scope is implemented and currently verified by smoke/test evidence; open status was stale."}],"before_hash":"d2b0cc3390b98f7210ae4e20842e758706bb4c71a1dddeb6b92c36e90f9dad7d","after_hash":"42899c651327e67693e498d9bc816b35fc4f81199b62bd478d31b05c30c1bd97"}
{"ts":"2026-05-25T10:11:02.227Z","author":"codex-reliability-agent","op":"release","patch":[{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:11:02.227Z"}],"before_hash":"42899c651327e67693e498d9bc816b35fc4f81199b62bd478d31b05c30c1bd97","after_hash":"046e65527c377419c89551bb3be9dfeaacbf3be70ff433973119c724d09cb16d"}
{"ts":"2026-05-25T10:21:16.135Z","author":"codex-reliability-agent","op":"update","patch":[{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:21:16.135Z"},{"op":"add","path":"/metadata/resolution","value":"Shipped native Codex plugin + pm-mcp integration and verified runtime smoke behavior."},{"op":"add","path":"/metadata/expected_result","value":"Codex plugin resources and MCP execution should be installable and operational from the repository package."},{"op":"add","path":"/metadata/actual_result","value":"Smoke and recorded tests confirm plugin/MCP behavior; open status was stale and item is now closed with metadata."}],"before_hash":"046e65527c377419c89551bb3be9dfeaacbf3be70ff433973119c724d09cb16d","after_hash":"234fd077a1a936c861098deb4a87fb95d599e75e2571fb948091c208d2ec2766","message":"Backfill closure metadata"}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not directly modify tracker system-of-record files under .agents/pm/**.

These updates directly edit .agents/pm/history/pm-0c9q.jsonl, which violates the repository process rule for PM records. Please route these mutations through the pm workflow instead of file edits.

As per coding guidelines, .agents/pm/**: pm is the system of record. Do not edit .agents/pm files directly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/pm/history/pm-0c9q.jsonl around lines 90 - 95, The patch illegally
mutates PM tracker JSON fields (adds/replaces keys like metadata.assignee,
metadata.status, metadata.close_reason and other metadata entries) directly in
the repository; revert those direct edits and instead perform the closure/update
via the authoritative pm workflow/tooling (open or update the PM record through
the pm workflow UI or API) so that metadata changes (assignee, status,
resolution, expected_result, actual_result) are applied through the
system-of-record rather than committing edits to the JSON history file.

Comment on lines +621 to +622
{"ts":"2026-05-25T10:08:53.523Z","author":"codex-reliability-agent","op":"comment_add","patch":[{"op":"add","path":"/metadata/comments/273","value":{"created_at":"2026-05-25T10:08:53.523Z","author":"codex-reliability-agent","text":"Reliability continuation progress 2026-05-25: completed pm-uer0 lock-window hardening and next pm-7rlp in-process runner seam foundation in one branch. Added stale-snapshot conflict guards for history-redact/history-repair/restore, introduced exported runPmCli invocation seam + in-process test helper path, and landed parity regression tests. Focused regression runs are passing; proceeding with full verification/manual temp-dir dogfood and stale-item hygiene closures in this PR stream."}},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:08:53.523Z"}],"before_hash":"d26856a1dcd77ac5a7745595d8acb658bb4146c882fde1048281b44841919280","after_hash":"dce72122e1ae0e77a6e83e70f053353cb72197f6d7886bf3e16abe5fd01b4114"}
{"ts":"2026-05-25T10:19:49.187Z","author":"codex-reliability-agent","op":"comment_add","patch":[{"op":"add","path":"/metadata/comments/274","value":{"created_at":"2026-05-25T10:19:49.187Z","author":"codex-reliability-agent","text":"Verification summary 2026-05-25 reliability continuation branch: full release gate passed (1766 tests, 100% coverage) after static-gate compliant split of new in-process parity test file; manual temporary-directory reliability scenario passed end-to-end including history-redact/history-repair/restore and package doctor. GitHub snapshot before PR: open issues=0, open PRs=0, dependabot alerts open=0; remaining failed scheduled historical runs are pre-fix Nightly/Auto Release entries already captured in lineage."}},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:19:49.187Z"}],"before_hash":"dce72122e1ae0e77a6e83e70f053353cb72197f6d7886bf3e16abe5fd01b4114","after_hash":"c6b819a8b5844785bd73743c9b994d775777c9cce54ff853693f6fe30e85648e"}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid direct edits under .agents/pm history files.

Line 621 and Line 622 add records directly in .agents/pm/history/pm-rnpb.jsonl. These updates should be produced via pm workflows, not hand-edited in PR diffs.

As per coding guidelines, .agents/pm/**: pm is the system of record. Do not edit .agents/pm files directly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/pm/history/pm-rnpb.jsonl around lines 621 - 622, Revert the manual
additions to the history JSON (the inserted metadata.comments entries 273 and
274 with timestamps 2026-05-25T10:08:53.523Z and 2026-05-25T10:19:49.187Z) so
the file no longer contains hand-edited records, and regenerate or add those
records using the official pm workflow/tool (run the pm workflow that appends to
the history stream) instead of editing metadata/comments directly; ensure future
changes to the history stream are produced by the pm system-of-record process,
not by committing edits to the history JSON.

Comment on lines +2 to +13
{"ts":"2026-05-25T10:00:06.271Z","author":"codex-reliability-agent","op":"claim","patch":[{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:00:06.271Z"},{"op":"add","path":"/metadata/assignee","value":"codex-reliability-agent"}],"before_hash":"faacde033604676006f422597a3639d142d61b00f12df859ec69939d60fbf4c7","after_hash":"1bcb00b2394726258a4ea69f25831bae3c4c4504aa7191372c73afe1cd0f5b64"}
{"ts":"2026-05-25T10:00:08.864Z","author":"codex-reliability-agent","op":"update","patch":[{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:00:08.864Z"},{"op":"replace","path":"/metadata/status","value":"in_progress"}],"before_hash":"1bcb00b2394726258a4ea69f25831bae3c4c4504aa7191372c73afe1cd0f5b64","after_hash":"d141f755a15dd951cf5539d86875a25d4f4acfbb70491fbe783af58ad9a43511","message":"Start lock-window reliability implementation"}
{"ts":"2026-05-25T10:00:16.815Z","author":"codex-reliability-agent","op":"comment_add","patch":[{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:00:16.815Z"},{"op":"add","path":"/metadata/comments","value":[{"created_at":"2026-05-25T10:00:16.815Z","author":"codex-reliability-agent","text":"Execution start 2026-05-25 under combined reliability PR scope with pm-7rlp. Duplicate-check evidence recorded (context/search/list-open/list-in-progress) and canonical item reuse confirmed; no new item created. Claim taken and status moved to in_progress before code edits."}]}],"before_hash":"d141f755a15dd951cf5539d86875a25d4f4acfbb70491fbe783af58ad9a43511","after_hash":"22f004ededd6603021a6dfe6b7723d93a9669cf009d1a8435a52672c86ae454c"}
{"ts":"2026-05-25T10:08:08.491Z","author":"codex-reliability-agent","op":"files_add","patch":[{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:08:08.491Z"},{"op":"add","path":"/metadata/files","value":[{"path":"src/cli/commands/history-redact.ts","scope":"project","note":"lock-window snapshot verification before write"}]}],"before_hash":"22f004ededd6603021a6dfe6b7723d93a9669cf009d1a8435a52672c86ae454c","after_hash":"cb8cc11cc12a7301c94ea056c6bfddde97dbedece469eaf901f677f3b1cbcdce"}
{"ts":"2026-05-25T10:08:11.606Z","author":"codex-reliability-agent","op":"files_add","patch":[{"op":"add","path":"/metadata/files/1","value":{"path":"src/cli/commands/history-repair.ts","scope":"project","note":"lock-window snapshot verification before write"}},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:08:11.606Z"}],"before_hash":"cb8cc11cc12a7301c94ea056c6bfddde97dbedece469eaf901f677f3b1cbcdce","after_hash":"3b5f3bebf72d8b1fedce1ce0d17a9410ec23f91054260ef6a218e9fb465b835f"}
{"ts":"2026-05-25T10:08:13.558Z","author":"codex-reliability-agent","op":"files_add","patch":[{"op":"add","path":"/metadata/files/2","value":{"path":"src/cli/commands/restore.ts","scope":"project","note":"lock-window snapshot verification before write"}},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:08:13.558Z"}],"before_hash":"3b5f3bebf72d8b1fedce1ce0d17a9410ec23f91054260ef6a218e9fb465b835f","after_hash":"619450b17e8abb1f4800232445983e66b0cd8cb0bac6c8fc0ac4e1a6c9ddbc8d"}
{"ts":"2026-05-25T10:08:15.381Z","author":"codex-reliability-agent","op":"files_add","patch":[{"op":"add","path":"/metadata/files/3","value":{"path":"tests/unit/history-activity-command.spec.ts","scope":"project","note":"history-redact stale snapshot conflict regression"}},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:08:15.381Z"}],"before_hash":"619450b17e8abb1f4800232445983e66b0cd8cb0bac6c8fc0ac4e1a6c9ddbc8d","after_hash":"a3cf13a531e219cafa502e82e1f576ff009e2af4663b10a85e349d1d85e5bf19"}
{"ts":"2026-05-25T10:08:17.208Z","author":"codex-reliability-agent","op":"files_add","patch":[{"op":"add","path":"/metadata/files/4","value":{"path":"tests/unit/history-repair-command.spec.ts","scope":"project","note":"history-repair stale snapshot conflict regression"}},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:08:17.208Z"}],"before_hash":"a3cf13a531e219cafa502e82e1f576ff009e2af4663b10a85e349d1d85e5bf19","after_hash":"0f1baa1795d2ccee46c94908c5d63d00b3295698a29fb1cc15efde1b4d924876"}
{"ts":"2026-05-25T10:08:19.438Z","author":"codex-reliability-agent","op":"files_add","patch":[{"op":"add","path":"/metadata/files/5","value":{"path":"tests/unit/restore-command.spec.ts","scope":"project","note":"restore stale snapshot conflict regression"}},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:08:19.438Z"}],"before_hash":"0f1baa1795d2ccee46c94908c5d63d00b3295698a29fb1cc15efde1b4d924876","after_hash":"00f7cdfb281ce4f8e037a3b7a629f144cd5861c7f47769f686ebb531bebdbce7"}
{"ts":"2026-05-25T10:08:36.403Z","author":"codex-reliability-agent","op":"tests_add","patch":[{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:08:36.403Z"},{"op":"add","path":"/metadata/tests","value":[{"command":"node scripts/run-tests.mjs test -- tests/unit/history-activity-command.spec.ts tests/unit/history-repair-command.spec.ts tests/unit/restore-command.spec.ts --reporter=dot","scope":"project","timeout_seconds":600,"note":"lock-window stale snapshot regressions"}]}],"before_hash":"00f7cdfb281ce4f8e037a3b7a629f144cd5861c7f47769f686ebb531bebdbce7","after_hash":"acc526beaa186c1729abf1e77a6de5e6ef54814e58e8e088d8a4a6c0b772a42f"}
{"ts":"2026-05-25T10:08:51.624Z","author":"codex-reliability-agent","op":"comment_add","patch":[{"op":"add","path":"/metadata/comments/1","value":{"created_at":"2026-05-25T10:08:51.624Z","author":"codex-reliability-agent","text":"Implementation complete: hardened TOCTOU read-then-lock windows uniformly across history-redact/history-repair/restore by snapshotting pre-lock history/item raw state and re-verifying under lock before any write. Commands now fail fast with EXIT_CODE.CONFLICT and retry guidance when the target history stream or item changed while waiting for lock, preventing stale rewrite commits. Regression coverage added in tests/unit/history-activity-command.spec.ts, tests/unit/history-repair-command.spec.ts, and tests/unit/restore-command.spec.ts via acquireLock spy mutation hooks. Verification: node scripts/run-tests.mjs test -- tests/unit/history-activity-command.spec.ts tests/unit/history-repair-command.spec.ts tests/unit/restore-command.spec.ts --reporter=dot (3 files, 43 tests passed)."}},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:08:51.624Z"}],"before_hash":"acc526beaa186c1729abf1e77a6de5e6ef54814e58e8e088d8a4a6c0b772a42f","after_hash":"97f6d7efd46dd6e549baa9cd3ebbd60abc24a6c2cff1166779590b7a2a605492"}
{"ts":"2026-05-25T10:19:48.120Z","author":"codex-reliability-agent","op":"comment_add","patch":[{"op":"add","path":"/metadata/comments/2","value":{"created_at":"2026-05-25T10:19:48.120Z","author":"codex-reliability-agent","text":"Post-implementation verification 2026-05-25: lock-window stale-snapshot conflict regressions pass and full release gate passes after integration split. Manual temp-dir reliability walkthrough also exercised history-redact apply, history verify, forced drift + history-repair, restore to version 1, package install doctor, close/release, all successful with verification ok after redact and after repair."}},{"op":"replace","path":"/metadata/updated_at","value":"2026-05-25T10:19:48.120Z"}],"before_hash":"97f6d7efd46dd6e549baa9cd3ebbd60abc24a6c2cff1166779590b7a2a605492","after_hash":"2f099407719a93ba6ef7b5201c1bd3906b1606e65839988cfb8cb24df96e4161"}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not modify .agents/pm history files directly.

This PR directly edits .agents/pm/history/pm-uer0.jsonl (Line 2–Line 13), which violates repository policy for PM system-of-record artifacts. Please apply these updates through the sanctioned PM workflow/tooling instead of direct file edits.

As per coding guidelines, .agents/pm/**: "pm is the system of record. Do not edit .agents/pm files directly."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/pm/history/pm-uer0.jsonl around lines 2 - 13, Revert the direct
edits made to the PM system-of-record file .agents/pm/history/pm-uer0.jsonl (the
added/modified metadata entries like metadata.assignee, metadata.status,
metadata/files, metadata/comments, etc.) and restore the file to its canonical
state (undo the commit or fetch the version from main/master); then re-apply any
required changes through the sanctioned PM workflow/tooling (create the
claim/update via the PM system so metadata updates appear through the proper API
rather than direct file edits).

Comment thread tests/helpers/cliRunner.ts Outdated
Comment thread tests/helpers/cliRunner.ts Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

await registerCoreCommandFamilies(program, registrationSelection);

P2 Badge Reinitialize command registration per runPmCli call

Because runPmCli is now exported and callable, invoking it more than once in the same process reuses the module-level program and calls registerCoreCommandFamilies again without clearing prior command registrations. This can trigger duplicate-command registration failures on subsequent calls, which breaks the new in-process invocation seam for long-lived hosts that import once and execute multiple CLI commands.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/cli/commands/history-repair.ts Outdated
Comment on lines +83 to +86
if (historyEntries.length === 0) {
throw new PmCliError(`No history entries exist for ${subject.id}; nothing to repair.`, EXIT_CODE.USAGE);
}
const historyRawBeforeLock = await readFileIfExists(subject.historyPath);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Capture pre-lock history snapshot before parsing entries

Move the historyRawBeforeLock read so it happens before readHistoryEntries. In the current order, if another writer updates subject.historyPath after entries are parsed but before historyRawBeforeLock is captured, rewrittenEntries are computed from stale content while the later under-lock equality check still passes against the newer raw snapshot, allowing history-repair to overwrite concurrent changes without raising the intended conflict.

Useful? React with 👍 / 👎.

Comment thread tests/helpers/cliRunner.ts Outdated
Comment on lines +117 to +120
process.argv = [process.argv[0], distCliPath(), ...args];
process.exitCode = undefined;
if (options.cwd) {
process.chdir(options.cwd);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Wrap cwd switch in runner restoration guard

In runInProcessDistCli, process.argv, process.exitCode, and environment overrides are applied before process.chdir(options.cwd), but the try/finally restoration block starts later. If options.cwd is invalid (or chdir otherwise throws), the function exits before restoration, leaking modified global process state into later tests and causing cascading failures.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request bundles several reliability and UX improvements, including hardening history commands against TOCTOU race conditions, fixing build ordering in the auto-release workflow, and introducing an in-process CLI runner for testing. Feedback identified critical issues in the new test runner's global state management, specifically that the environment override logic is destructive and that global mutations of the working directory and standard output streams prevent safe parallel execution.

Comment on lines +64 to +89
function applyEnvOverride(env: NodeJS.ProcessEnv | undefined): () => void {
if (!env) {
return () => {};
}
const keys = new Set<string>([...Object.keys(process.env), ...Object.keys(env)]);
const previous = new Map<string, string | undefined>();
for (const key of keys) {
previous.set(key, process.env[key]);
const next = env[key];
if (next === undefined) {
delete process.env[key];
} else {
process.env[key] = next;
}
}
return () => {
for (const key of keys) {
const prior = previous.get(key);
if (prior === undefined) {
delete process.env[key];
} else {
process.env[key] = prior;
}
}
};
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The applyEnvOverride implementation is destructive to process.env. By iterating over a union of all existing keys and the override keys, and deleting any key from process.env that is not present in the override object (next === undefined), it effectively wipes out essential environment variables like PATH, HOME, USER, etc., unless they are explicitly included in the override. This will likely cause failures in CLI commands that rely on these standard variables.

References
  1. Defensive programming: ensure that invalid inputs or states are safely handled in all cases. Wiping out the global environment is an unsafe state for a CLI tool.

Comment thread tests/helpers/cliRunner.ts Outdated
Comment on lines +123 to +132
(process.stdout as unknown as { write: typeof process.stdout.write }).write = ((chunk, encoding, callback) => {
const normalizedEncoding = typeof encoding === "string" ? (encoding as BufferEncoding) : undefined;
stdout += chunkToString(chunk, normalizedEncoding);
if (typeof encoding === "function") {
encoding();
} else if (typeof callback === "function") {
callback();
}
return true;
}) as typeof process.stdout.write;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Overriding process.stdout.write globally is a significant side effect that prevents tests using this helper from running in parallel. While the PR description acknowledges this risk, the helper itself does not enforce serial execution. If multiple tests invoke runInProcessDistCli concurrently, their output will be interleaved and captured incorrectly by all active runners.

References
  1. Identify performance bottlenecks and avoid unnecessary race conditions. Global side effects in a test runner lead to flaky tests when parallelized.

Comment thread tests/helpers/cliRunner.ts Outdated
Comment on lines +119 to +121
if (options.cwd) {
process.chdir(options.cwd);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using process.chdir is a global side effect that affects the entire process. Similar to the environment and output overrides, this makes the helper unsafe for parallel test execution. Any other test running in the same process will have its working directory changed unexpectedly.

References
  1. Ensure that invalid inputs or states are safely handled. Global state mutation without isolation is an unsafe pattern for unit/integration tests.

unbraind added 2 commits May 25, 2026 12:39
Bring CHANGELOG.md back in sync with the closed-item metadata updates captured in this PR so the pm-changelog CI --check gate passes deterministically.
Serialize in-process CLI test invocations, make global state restoration exception-safe, scope env overrides to explicit keys, and capture history-repair raw snapshots before parsing so concurrent updates cannot slip through the lock-window guard.
@unbraind unbraind merged commit bdbba9f into main May 25, 2026
13 checks passed
@unbraind unbraind deleted the chore/reliability-lock-window-inprocess-runner-foundation branch May 25, 2026 10:54
unbraind added a commit that referenced this pull request May 25, 2026
Persist final pm evidence after PR #67 merge by closing pm-uer0, releasing active claims, and documenting the remaining pm-7rlp continuation scope on main.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant