From fb8edb6ea4106ccc5c0c0a8005d765367d3f5e20 Mon Sep 17 00:00:00 2001
From: Claude
Date: Wed, 29 Apr 2026 06:49:18 +0000
Subject: [PATCH] =?UTF-8?q?feat:=20two-screen=20collision=20check=20?=
 =?UTF-8?q?=E2=80=94=20conceptual=20+=20mechanical=20=E2=80=94=20for=20in-?=
 =?UTF-8?q?flight=20PRs?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replaces the earlier 'max 2 in-progress' cap and the file-only screen,
both of which missed real failure modes.

Two distinct kinds of collision must be screened separately, because
neither alone catches the other:

- Conceptual collision: two PRs implement the same feature with
  different approaches (e.g. billing via framework X in pkg/x vs
  framework Y in pkg/y). No file overlap, but only one approach should
  win. Routes to needs-decision so the human picks.
- Mechanical collision: two PRs touch the same logical area in the same
  file. Real merge conflict or invalidated assumptions. Defers until
  the other PR lands.

Sequence:

1. Conceptual screen always runs first — cheap (~400 chars of title +
   body excerpt per open PR). If it triggers, the issue gets
   needs-decision and the agent picks the next one.
2. Mechanical screen runs only if conceptual passes. File-path overlap
   gates a scoped diff read (gh pr diff -- ) to distinguish 'same
   logical area' from 'different concerns in the same file'. Different
   concerns proceed (small rebase later); same area defers.

Hot-file bail-out: >3 PRs overlap on the same file → default defer
without reading diffs (reading 4+ diffs costs more than one cycle's
wait).

No numeric cap. If every ready issue trips a screen, fall through to
(C) self-audit and cycle-exit — launcher restarts the agent.

Updates STATUS.md Current focus block to surface both deferred
categories separately.
---
 STATUS.md                | 12 ++++++---
 docs/unattended-rules.md | 57 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/STATUS.md b/STATUS.md
index fb5f049..6a7267a 100644
--- a/STATUS.md
+++ b/STATUS.md
@@ -7,12 +7,18 @@ High-level, non-technical view of what's shipped, what's in flight, and what's n
 
 ## Current focus
 
-> Max **2** work packages in flight at once. Finish before starting new.
+> Independent work packages run in parallel. Two collision checks gate new work: **conceptual** (same feature, different approach — needs human pick) and **mechanical** (same code area, real rebase). Files don't have to overlap to collide. The agent never stops the unattended loop; it defers or hands off when collisions show up.
 
-**Active work packages:** 0 / 2
+**Active work packages:** 0
 - _none_
 
-**Blocked / needs-decision:** 0
+**Deferred — mechanical overlap:** 0
+- _none_
+
+**Deferred — conceptual overlap (`needs-decision`):** 0
+- _none_
+
+**Blocked — other `needs-decision`:** 0
 - _none_
 
 **Agent-proposed backlog:** 0 ideas filed, 0 started
diff --git a/docs/unattended-rules.md b/docs/unattended-rules.md
index 7de5e4a..28d4f88 100644
--- a/docs/unattended-rules.md
+++ b/docs/unattended-rules.md
@@ -107,6 +107,63 @@ You are a founding engineer with product authority. Ship working tested code. Ev
 
 Even if (A) finishes quickly and (B) has work waiting, do not start (B) in the same cycle. Exit. The launcher will start a fresh cycle for the next work package within seconds when the queue is non-empty (burst mode). Chaining bloats context and muddles cost attribution.
 
+### Hard rule: no collisions between in-flight PRs
+
+Independent work packages run in parallel — that's how 24/7 mode keeps moving. There is **no numeric cap**. But two kinds of collision must be screened first, and they need different checks:
+
+- **Conceptual collision** — two PRs implement the same feature with different approaches (e.g. billing via framework X in `pkg/x` vs framework Y in `pkg/y`). Files don't overlap; only one approach should win. The human needs to pick.
+- **Mechanical collision** — two PRs touch the same logical area in the same file. The second PR will hit a real merge conflict or assume code state the first PR changed.
+
+After writing your plan in (B) step 4, but **before any implementation**, run both screens.
+
+#### 1. Conceptual screen (always)
+
+```bash
+gh pr list --state open --json number,title,body \
+  --jq '.[] | "#\(.number) — \(.title)\n\(.body | tostring | .[0:400])\n---"'
+```
+
+Compare each open PR's title + body excerpt against your plan's `Problem` and `Approach`. You're looking for: same feature being built two ways, competing solutions to the same problem, duplicated effort.
+
+If conceptually overlapping with PR #X:
+
+- Remove `in-progress` from the issue.
+- Add `needs-decision`.
+- Comment: `"Conceptual overlap with PR #X — both implement ; this issue uses , PR #X uses . Human review needed to pick."`
+- Pick the next `ready-for-agent` issue and re-run from step 1.
+
+Cost note: ~400 chars × open PRs. For 20 open PRs, ~2k tokens. Cheap.
+
+#### 2. Mechanical screen (only if step 1 passes)
+
+```bash
+PLANNED_FILES=$(grep -E '^- ' plans/-.md | sed 's/^- //')
+for pr in $(gh pr list --state open --json number --jq '.[].number'); do
+  CHANGED=$(gh pr diff "$pr" --name-only)
+  OVERLAP=$(comm -12 <(echo "$PLANNED_FILES" | sort -u) <(echo "$CHANGED" | sort -u))
+  if [ -n "$OVERLAP" ]; then
+    echo "File overlap with PR #$pr: $OVERLAP"
+  fi
+done
+```
+
+No file overlap → proceed.
+
+For each PR #X with file overlap, read the overlapping file's hunks only (`gh pr diff -- `), not the whole diff. Compare against the plan:
+
+- **Same logical area** (same function, same section, same change): defer with `"Defers to PR #X (mechanical overlap on in )"`. Remove `in-progress`, pick the next ready issue, re-run from step 1.
+- **Different concerns in the same file** (one touches function A, the other function B; one edits a different section): proceed. Note the file-level overlap under `Risks` and expect a small rebase when PR #X lands.
+
+**Hot-file bail-out:** if step 2 reports file overlap with >3 open PRs on the same file, default to defer without reading any diffs. Reading 4+ diffs costs more than waiting one cycle.
+
+#### Empty queue
+
+If every ready issue trips at least one screen with some open PR, fall through to (C) self-audit and cycle-exit. The launcher restarts the agent automatically; the agent never stops the unattended loop on its own.
+
+#### Why two checks
+
+File overlap alone misses conceptual conflicts (different paths, same feature). Title/body alone misses mechanical conflicts (unrelated features that happen to touch the same module). Both signals together catch the failure modes that actually cost re-work.
+
 ## Creative autonomy
 
 When queue is empty or between issues: