From dd6984850255bad9de16d7f20984091ea9d40653 Mon Sep 17 00:00:00 2001
From: Noah Horton
Date: Wed, 8 Apr 2026 17:33:01 -0600
Subject: [PATCH 01/10] Add CLAUDE.md and initial .deepreview rule suite
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds project guidance for future Claude Code instances and configures
four DeepWork review rules so this repo dogfoods the action it ships:

- prompt_best_practices — reviews CLAUDE.md and prompts/review.txt
  against Anthropic prompt-engineering best practices.
- update_action_surface_docs — keeps README.md and CLAUDE.md in sync
  with action.yml, prompts/review.txt, scripts/post-review-comments.py,
  and the example workflow. Instructions enumerate 12 high-risk drift
  points (inputs table, end-to-end flow steps, embedded example
  workflow, security claims, caching path, comment grouping, no-commit
  guarantees, state-files JSON schema, bot-identity literal, repo
  layout, etc).
- python_code_review — reviews scripts/post-review-comments.py against
  conventions extracted from the existing code
  (.deepwork/review/python_conventions.md), with required DRY and
  comment-accuracy checks.
- suggest_new_reviews — meta-rule that proposes new review rules per
  change.

Also gitignores .deepwork/tmp/ since it's restored from GitHub Actions
cache at runtime and regenerated by every /review run.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 .deepreview                            | 223 +++++++++++++++++++++++++
 .deepwork/review/python_conventions.md |  79 +++++++++
 .gitignore                             |   4 +
 CLAUDE.md                              |  55 ++++++
 4 files changed, 361 insertions(+)
 create mode 100644 .deepreview
 create mode 100644 .deepwork/review/python_conventions.md
 create mode 100644 CLAUDE.md

diff --git a/.deepreview b/.deepreview
new file mode 100644
index 0000000..8d1d1d5
--- /dev/null
+++ b/.deepreview
@@ -0,0 +1,223 @@
+prompt_best_practices:
+  description: "Review prompt/instruction files for Anthropic prompt engineering best practices."
+  match:
+    include:
+      - "**/CLAUDE.md"
+      - "**/AGENTS.md"
+      - ".claude/**/*.md"
+      - "prompts/*.txt"
+      - "prompts/*.md"
+      - ".deepwork/review/*.md"
+      - ".deepwork/jobs/**/*.md"
+  review:
+    strategy: individual
+    instructions: |
+      Review this file as a prompt or instruction file, evaluating it
+      against Anthropic's prompt engineering best practices.
+
+      For each issue found, report:
+      1. Location (section or line)
+      2. Severity (Critical / High / Medium / Low)
+      3. Best practice violated
+      4. Description of the issue
+      5. Suggested improvement
+
+      Check for:
+      - Clarity and specificity (concrete criteria vs vague language like
+        "do a good job", "be thorough")
+      - Structure and formatting (XML tags, headers, numbered lists for
+        distinct sections; logical separation of context, instructions,
+        constraints, output format)
+      - Role and context (enough context for the AI, explicit assumptions)
+      - Examples for complex/nuanced tasks (few-shot, edge cases)
+      - Output format specification (JSON, markdown, length constraints)
+      - Prompt anti-patterns (contradictions, instruction overload without
+        ranking, critical instructions buried in walls of text, relying on
+        the AI to infer important constraints)
+      - Variable/placeholder clarity (only when the prompt parameterizes
+        dynamic inputs)
+
+      Use judgment proportional to the file's complexity. A short, focused
+      instruction for a simple task does not need few-shot examples or XML
+      tags. Do not flag issues for best practices that are irrelevant to the
+      file's purpose.
+
+      Note: prompts/review.txt is the production prompt that this action
+      ships to Claude Code in CI. It is a CRITICAL file for this repo —
+      review it strictly, especially the "Automation Rules" section, the
+      change-tracking JSON contract, and the no-commit/no-push instructions.
+
+update_action_surface_docs:
+  description: "Keep README.md and CLAUDE.md in sync with action.yml, the production prompt, the comment-posting script, and the example workflow."
+  match:
+    include:
+      - "action.yml"
+      - "prompts/review.txt"
+      - "scripts/post-review-comments.py"
+      - ".github/workflows/example.yml"
+      - "README.md"
+      - "CLAUDE.md"
+  review:
+    strategy: matches_together
+    instructions: |
+      When source files change, check whether the following documentation
+      files need updating:
+      - README.md
+      - CLAUDE.md
+
+      Read each documentation file and compare its content against the
+      changed source files. Flag any sections that are now outdated or
+      inaccurate. If a documentation file itself was changed, verify the
+      updates are correct and consistent with the source files.
+
+      ## High-risk drift points in README.md
+
+      1. The "Inputs" table — every input listed must exist in action.yml
+         with the documented default. Defaults to verify: model, max_turns,
+         commit_message. If action.yml adds, removes, or renames an input,
+         the table is wrong.
+
+      2. The "How It Works" 5-step list — must mirror action.yml's composite
+         steps in order (cache restore, fetch base, run
+         claude-code-base-action, commit & push, post inline review comments).
+
+      3. The "Usage" section's example workflow — must stay aligned with
+         .github/workflows/example.yml. The concurrency group, the
+         `if: github.actor != 'deepwork-action[bot]'` self-trigger guard,
+         the permissions block, the checkout config, and the action
+         invocation must all match. If example.yml is updated, README's
+         snippet must be updated to match (or vice versa).
+
+      4. The "Security" section — claims about which underlying action is
+         used (`anthropics/claude-code-base-action`), the
+         `--dangerously-skip-permissions` flag, and the `deepwork-action[bot]`
+         identity. These must match the current action.yml.
+
+      5. The "Caching" section — claims caching of `.deepwork/tmp` keyed on
+         PR number. If action.yml's cache step changes its path or key
+         strategy, this section is wrong.
+
+      6. The "Review Comments" section — describes one inline comment per
+         changed file in the Files Changed tab. If
+         scripts/post-review-comments.py changes its comment grouping
+         strategy, this is wrong.
+
+      7. The auto-commit guarantees — README claims Claude does NOT
+         commit/push and that the workflow handles it. This is enforced by
+         prompts/review.txt's "Important" section. If review.txt's contract
+         changes, the README description may need to change too.
+
+      ## High-risk drift points in CLAUDE.md
+
+      8. The "End-to-end flow" numbered list (steps 1-6) must mirror
+         action.yml's composite step list in order. CLAUDE.md names specific
+         arguments like `--dangerously-skip-permissions`,
+         `plugin_marketplaces`, and `prompts/review.txt`; verify each still
+         appears in action.yml.
+
+      9. The "State files crossing process boundaries" table must match the
+         actual JSON schema in prompts/review.txt
+         (`{"file", "line", "description", "reason"}`) AND the function names
+         `load_changes_by_file` and `build_comment_body` in
+         scripts/post-review-comments.py. If review.txt's schema changes or
+         either function is renamed, the table is wrong.
+
+      10. The "Self-trigger guard" section's literal `deepwork-action[bot]`
+          must match action.yml's commit step user.name AND the
+          `if: github.actor != 'deepwork-action[bot]'` guard in
+          .github/workflows/example.yml. All three must stay synchronized.
+
+      11. The "Repository layout" bullet list must reflect the actual
+          top-level directories. Adding a new top-level directory (e.g.,
+          `lib/`, `tests/`) means CLAUDE.md's layout section is incomplete.
+
+      12. CLAUDE.md asserts the entire action is defined in "action.yml, one
+          prompt file, and one Python script". Adding a second prompt file
+          or script makes that claim wrong.
+    additional_context:
+      unchanged_matching_files: true
+
+python_code_review:
+  description: "Review Python files against project conventions, plus DRY and comment-accuracy checks."
+  match:
+    include:
+      - "**/*.py"
+    exclude:
+      - "**/__pycache__/**"
+      - "**/.venv/**"
+  review:
+    strategy: individual
+    instructions: |
+      Review this Python file against the project's conventions documented in
+      `.deepwork/review/python_conventions.md`. Read that file first.
+
+      Check for:
+      - Module structure: shebang, module docstring, `from __future__ import
+        annotations`, import ordering (stdlib only when possible), section
+        banner comments for logical separation in single-file scripts.
+      - Naming and types: snake_case, built-in generic type hints
+        (`list[str]`, not `typing.List`), explicit return types on every
+        function, sparing use of `Any`.
+      - Functions and structure: small named top-level functions, `main()`
+        orchestration entry point, no gratuitous classes or dataclasses.
+      - I/O and shell commands: `subprocess.run(..., capture_output=True,
+        text=True)` via a `run()` wrapper, `pathlib.Path` for file I/O,
+        `os.environ.get("NAME", default)` not raw indexing for inputs that
+        might be missing.
+      - Error handling: narrow exception types (never bare `except:` or
+        `except Exception:`), warnings printed to `sys.stderr`. Non-fatal
+        failures should `sys.exit(0)` to avoid failing the calling CI step;
+        use `sys.exit(1)` only for actually broken state.
+      - Strings: f-strings only, parenthesized multi-line concatenation.
+      - Docstrings: triple-quoted, one-line summaries unless the function
+        does something subtle that justifies more.
+
+      Additionally, ALWAYS check:
+
+      - **DRY violations**: Is there duplicated logic or repeated patterns
+        that should be extracted into a shared function or helper? In a
+        single-file script the bar for extraction is higher than in a
+        library, but three near-identical blocks is still too many.
+
+      - **Comment accuracy**: Are all comments, docstrings, and inline
+        documentation still accurate after the changes? Flag any comment
+        that describes behavior that no longer matches the code. This is
+        especially important for module-level docstrings that describe
+        what files the script reads/writes — those drift silently when
+        the file paths change.
+
+      Output: PASS if the file is consistent with these conventions and no
+      DRY or comment-accuracy issues are found. Otherwise FAIL with a
+      bulleted list of specific issues, each tied to a line number.
+
+suggest_new_reviews:
+  description: "Analyze all changes and suggest new review rules that would catch issues going forward."
+  match:
+    include:
+      - "**/*"
+    exclude:
+      - ".github/**"
+      - ".deepwork/tmp/**"
+  review:
+    strategy: matches_together
+    instructions: |
+      Analyze the changeset to determine whether any new DeepWork review rules
+      should be added.
+
+      1. Call get_configured_reviews to see all currently configured review
+         rules. Also call get_named_schemas to see existing DeepSchemas.
+         Understand what's already covered.
+      2. For each change, consider:
+         - Did this change introduce a type of issue a review rule could catch?
+         - Is there a pattern likely to recur?
+         - Would an existing rule benefit from a small scope expansion (e.g.
+           adding a glob to an existing include list)?
+      3. Be extremely conservative. Only suggest rules that are:
+         - Extremely narrow (targets 1 specific file or small bounded set), OR
+         - A slight addition to an existing rule (no new agent spawned), OR
+         - Catches an issue likely to recur and worth the ongoing cost
+      4. Write new rules directly to the appropriate `.deepreview` file. If a
+         rule needs a dedicated instruction file, create it in
+         `.deepwork/review/`.
+      5. If no rules are warranted, say so. An empty suggestion list is valid.
+         Do not invent rules just to have output.
diff --git a/.deepwork/review/python_conventions.md b/.deepwork/review/python_conventions.md
new file mode 100644
index 0000000..43c326c
--- /dev/null
+++ b/.deepwork/review/python_conventions.md
@@ -0,0 +1,79 @@
+# Python Conventions
+
+Conventions for Python code in this repository, observed from
+`scripts/post-review-comments.py`. Keep this short and actionable — it's
+a reference for reviewers, not an exhaustive style guide.
+
+## Module structure
+
+- Start with `#!/usr/bin/env python3` shebang for executable scripts.
+- Module-level docstring (triple-quoted) immediately after the shebang,
+  describing what the script does, what it reads, what it writes, and how
+  it's invoked. Multi-line is fine.
+- `from __future__ import annotations` near the top so type hints don't
+  evaluate at runtime.
+- Imports: stdlib only when possible (this repo runs in CI with no extra
+  pip installs). Order: `__future__` → stdlib → third-party → local.
+- Use `# ---------------------------------------------------------------------------`
+  banners with a `# Section Title` line to separate logical sections inside
+  a single-file script. This makes a flat script readable without splitting
+  it into modules.
+
+## Naming and types
+
+- `snake_case` for functions and variables; `PascalCase` for classes (none
+  in current code).
+- Type-hint every function signature, including return types. Use the
+  built-in generic syntax (`list[str]`, `dict[str, Any]`) — not the
+  `typing.List` / `typing.Dict` legacy aliases. `from __future__ import
+  annotations` makes this work on older Pythons.
+- Use `Any` (`from typing import Any`) sparingly — only when the structure
+  is genuinely dynamic (e.g., a JSON-decoded payload).
+
+## Functions and structure
+
+- Prefer small, named top-level functions over inline blocks. The current
+  script puts each step (`run`, `get_head_sha`, `get_diff`, `first_changed_line`,
+  `count_added_lines`, `load_changes_by_file`, `build_comment_body`, `main`)
+  in its own function.
+- Keep `main()` as the orchestration entry point. Wire it via
+  `if __name__ == "__main__": main()`.
+- Don't introduce dataclasses or classes unless there's actual state to
+  hold. The script-style this repo uses is dict-based.
+
+## I/O and external commands
+
+- For shell commands, use `subprocess.run(cmd, capture_output=True, text=True)`
+  via a small `run()` wrapper rather than `os.system` or `subprocess.call`.
+- Read paths via `pathlib.Path`, not `open(string)`.
+- Read environment variables with `os.environ.get("NAME", default)` — never
+  raw `os.environ["NAME"]` for inputs that might be missing.
+- For non-fatal failures (e.g., the GitHub API call in
+  `post-review-comments.py`), print a warning to `sys.stderr` and
+  `sys.exit(0)` so the calling step doesn't fail. Hard-fail with `sys.exit(1)`
+  only for actual broken state.
+
+## Strings and formatting
+
+- f-strings for interpolation. No `%` formatting, no `.format()`.
+- Multi-line string concatenation with parentheses, not `+`.
+
+## Error handling
+
+- Catch specific exception classes, not bare `except:` or `except Exception:`.
+  Existing code uses `except (json.JSONDecodeError, OSError) as exc:` — keep
+  exception lists narrow.
+- Print warnings with the exception value: `print(f"Warning: ... {exc}",
+  file=sys.stderr)`.
+
+## Comments and docstrings
+
+- Function docstrings are triple-quoted, one-line summaries unless the
+  function does something subtle (e.g., the diff hunk parser in
+  `first_changed_line` has a multi-line docstring explaining the format
+  it parses).
+- Inline comments explain *why*, not *what*. The current script has very
+  few inline comments — that's correct; the function names carry the
+  meaning.
+- Section banners (`# ----` blocks) are the exception: they're structural,
+  not explanatory.
diff --git a/.gitignore b/.gitignore
index e30f246..ca698a9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1 +1,5 @@
 scripts/__pycache__/
+
+# DeepWork review state — restored from GitHub Actions cache at runtime,
+# regenerated by every /review run. Not source.
+.deepwork/tmp/
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..f10fdf6
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,55 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## What this repo is
+
+A **composite GitHub Action** (not a JS/TS/Docker action) that runs Claude Code with the DeepWork plugin against a Pull Request, applies every review suggestion as code changes, auto-commits them to the PR branch, and posts inline PR review comments. There is no build system, no package manager, no test suite — the entire action is defined in `action.yml`, one prompt file, and one Python script.
+
+## Repository layout
+
+- `action.yml` — the composite action definition. All orchestration lives here.
+- `prompts/review.txt` — the prompt fed to Claude Code via `claude-code-base-action`. Starts with `/review` to trigger the DeepWork plugin's review skill, then enforces CI-mode rules (no `AskUserQuestion`, apply every finding, write change log to `/tmp/deepwork_changes.json`).
+- `scripts/post-review-comments.py` — runs **after** Claude finishes. Reads `/tmp/deepwork_changes.json`, diffs against the base branch, and POSTs a single PR review with one inline comment per changed file via `gh api`.
+- `.github/workflows/example.yml` — reference workflow showing how downstream repos consume this action. Not a CI workflow for *this* repo.
+- `.deepwork/` — DeepWork plugin's local state (`job.schema.json`, `tmp/status`). The `tmp/` subdirectory is what the action caches between runs.
+
+## End-to-end flow (read this before changing anything)
+
+The action's steps in `action.yml` form a pipeline that hands state between three different processes via well-known files. Breaking any link silently degrades the action — most failures here are silent because the Python step is non-fatal.
+
+1. **Checkout + cache restore** — the consuming workflow checks out the PR head branch with `fetch-depth: 0`. The action restores `.deepwork/tmp` from the GitHub Actions cache, keyed on PR number. This is how already-passed reviews skip re-running on subsequent commits (the major token saver called out in the README).
+2. **Fetch base branch** — `git fetch origin --depth=1` so the diff in step 5 has something to compare against. Failure here is logged but non-fatal.
+3. **Cleanup** — deletes any stale `/tmp/deepwork_changes.json` from a previous run on the same runner.
+4. **Run Claude Code** — invokes `anthropics/claude-code-base-action@beta` with:
+   - `prompt_file: prompts/review.txt`
+   - `plugin_marketplaces: https://github.com/Unsupervisedcom/deepwork.git`
+   - `plugins: deepwork@deepwork-plugins`
+   - `claude_args: --dangerously-skip-permissions --model --max-turns `
+   Claude is expected to (a) modify files in the working tree and (b) append entries to `/tmp/deepwork_changes.json` describing each change. Claude must **not** commit or push — that's step 5's job.
+5. **Commit & push** — runs as identity `deepwork-action[bot] `. Detects "no changes" by checking `git diff`, `git diff --cached`, AND untracked files; sets `changes_made` output accordingly. Pushes via a token-rewritten remote URL.
+6. **Post inline review comments** — only runs if `changes_made == 'true'`. Executes `scripts/post-review-comments.py`, which reads `/tmp/deepwork_changes.json`, generates per-file comment bodies (with a diff-stats fallback if a file isn't in the JSON), and POSTs a single review with `event: COMMENT` and one comment per file.
+
+## Self-trigger guard
+
+The example workflow uses `if: github.actor != 'deepwork-action[bot]'` at the **job level** to prevent the auto-fix commit from re-triggering the workflow. This guard is the only thing keeping the action from looping. Any change to the bot identity in step 5 of `action.yml` must be matched in the example workflow's `if` condition and in any documentation that references the actor name.
+
+## State files crossing process boundaries
+
+Three pieces of state flow between independent processes — keep them in sync when modifying any one of them:
+
+| File | Written by | Read by | Purpose |
+|---|---|---|---|
+| `/tmp/deepwork_changes.json` | Claude (per `prompts/review.txt`) | `scripts/post-review-comments.py` | Per-change descriptions for inline comments. Schema: `{"changes": [{"file", "line", "description", "reason"}]}`. The Python script is tolerant of missing/malformed entries and falls back to diff stats. |
+| `.deepwork/tmp/` | DeepWork plugin (inside Claude Code) | GitHub Actions cache (next run) | Review pass/fail state per PR — enables incremental review across commits. |
+| `/tmp/deepwork_review_payload.json` | `post-review-comments.py` | `gh api ... --input` | Transient; just the request body for the GitHub PR review API. |
+
+If you change the JSON schema in `prompts/review.txt`, you must update `load_changes_by_file` and `build_comment_body` in `scripts/post-review-comments.py` to match. The prompt is the contract.
+
+## Versioning the action
+
+This action is consumed via `Unsupervisedcom/deepwork-action@v1`. When making a release, the `v1` tag must be moved to the new commit (standard GitHub Actions major-version-tag convention). The README and example workflow both pin `@v1`.
+
+## Testing changes
+
+There is no local test harness. To validate changes end-to-end you must push a branch and open a PR in a repo that consumes this action (pinning the action to your branch via `Unsupervisedcom/deepwork-action@`). The Python script can be smoke-tested locally by setting `PR_NUMBER`, `GITHUB_REPOSITORY`, `GITHUB_BASE_REF` and running it inside a real git checkout, but it will only succeed in posting comments if `gh` is authenticated against a real PR.
From d7cbe0ab38ec1ec2260da4060fb82c728278add6 Mon Sep 17 00:00:00 2001
From: Noah Horton
Date: Wed, 8 Apr 2026 18:23:17 -0600
Subject: [PATCH 02/10] Clarify v1 floating-tag convention in CLAUDE.md

Expand the "Versioning the action" section to explicitly document that
v1 is a floating major-version tag that always points at the latest
commit on main (standard GitHub Actions convention, e.g.
actions/checkout@v4). Flags that release automation is planned and
until then v1 must be force-moved manually on every merge to main.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 CLAUDE.md | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index f10fdf6..13eabf6 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -48,7 +48,18 @@ If you change the JSON schema in `prompts/review.txt`, you must update `load_cha
 
 ## Versioning the action
 
-This action is consumed via `Unsupervisedcom/deepwork-action@v1`. When making a release, the `v1` tag must be moved to the new commit (standard GitHub Actions major-version-tag convention). The README and example workflow both pin `@v1`.
+This action is consumed via `Unsupervisedcom/deepwork-action@v1`. The `v1` tag is a **floating major-version tag** that always points at the latest commit on `main` — standard GitHub Actions convention (see `actions/checkout@v4`, etc.). Consumers pin `@v1` and expect it to track the freshest `1.x.y` automatically.
+
+**This means every release (and currently every merge to main) requires force-moving `v1`**:
+
+```bash
+git tag -f v1 origin/main
+git push origin v1 --force
+```
+
+The README and example workflow both pin `@v1`, so this is the contract consumers rely on — don't change them to pin a specific `1.x.y` without also updating this section.
+
+Release automation is planned (see the release-automation work that should land after the initial `.deepreview` suite). Until that lands, the `v1` tag is moved manually on every merge to main. If you see `v1` lagging behind `main`, that's a bug — move it.
 
 ## Testing changes
 
From a10cd5f09f3d1785974da4851af7eb1f06aa86f2 Mon Sep 17 00:00:00 2001
From: Noah Horton
Date: Wed, 8 Apr 2026 18:40:48 -0600
Subject: [PATCH 03/10] Rewrite to use anthropics/claude-code-action@v1
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The previous action.yml was wired to
anthropics/claude-code-base-action@beta with
plugin_marketplaces/plugins/claude_args inputs that never existed on
that action — in any published release. At runtime those inputs were
silently dropped, so the DeepWork plugin was never installed, --model
and --dangerously-skip-permissions were never applied, and Claude was
running the built-in /review slash command on default Sonnet with
default permissions. The "review" was plain Claude freelancing with no
.deepreview rule awareness.

Switch to anthropics/claude-code-action@v1, which:

- has real plugins: and plugin_marketplaces: inputs (newline-separated
  lists) documented in its action.yml
- has claude_args: for passing --model / --max-turns / etc.
- commits and pushes file edits back to the PR branch natively via
  use_commit_signing: false
- provides mcp__github_inline_comment__create_inline_comment as a
  native MCP tool, replacing the custom post-review-comments.py
- supports track_progress: true for a live progress comment on the PR
- has a floating v1 tag maintained by upstream

As a result this PR deletes a lot:

- scripts/post-review-comments.py (202 lines of diff-and-post logic)
- the Install uv composite step (dead code — nothing used uv)
- the Fetch base branch for git diff step (claude-code-action handles
  diffing itself)
- the Prepare review run step (no more /tmp/deepwork_changes.json
  contract)
- the Commit and push changes step (claude-code-action commits
  natively)
- the Post inline PR review comments step (native MCP tool)

prompts/review.txt loses the /tmp/deepwork_changes.json tracking schema
and gains instructions to use the native inline-comment MCP tool with
confirmed: true.

.github/workflows/example.yml simplifies: shallow fetch-depth: 1,
id-token: write permission for OIDC, drops the misconfigured
`if: github.actor != 'deepwork-action[bot]'` self-trigger guard
(GITHUB_TOKEN-pushed commits don't retrigger workflows per GitHub's
built-in rule — the guard was both wrong and unnecessary).

CLAUDE.md and README.md are rewritten to describe the new thin
architecture. The .deepreview rule update_action_surface_docs has its
drift-check list updated to pin the new contract (plugins,
plugin_marketplaces, bot_name, claude_args, etc.) instead of the
deleted contract (load_changes_by_file, /tmp/deepwork_changes.json
schema, etc).

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 .deepreview                     | 102 +++++++++-------
 .github/workflows/example.yml   |  11 +-
 CLAUDE.md                       |  64 +++++-----
 README.md                       |  30 ++---
 action.yml                      |  83 ++++---------
 prompts/review.txt              |  76 ++++++------
 scripts/post-review-comments.py | 204 --------------------------------
 7 files changed, 176 insertions(+), 394 deletions(-)
 delete mode 100644 scripts/post-review-comments.py

diff --git a/.deepreview b/.deepreview
index 8d1d1d5..3d7de06 100644
--- a/.deepreview
+++ b/.deepreview
@@ -48,12 +48,11 @@ prompt_best_practices:
       change-tracking JSON contract, and the no-commit/no-push instructions.
 
 update_action_surface_docs:
-  description: "Keep README.md and CLAUDE.md in sync with action.yml, the production prompt, the comment-posting script, and the example workflow."
+  description: "Keep README.md and CLAUDE.md in sync with action.yml, the production prompt, and the example workflow."
   match:
     include:
       - "action.yml"
       - "prompts/review.txt"
-      - "scripts/post-review-comments.py"
       - ".github/workflows/example.yml"
       - "README.md"
       - "CLAUDE.md"
@@ -77,63 +76,80 @@ update_action_surface_docs:
          commit_message. If action.yml adds, removes, or renames an input,
          the table is wrong.
 
-      2. The "How It Works" 5-step list — must mirror action.yml's composite
-         steps in order (cache restore, fetch base, run
-         claude-code-base-action, commit & push, post inline review comments).
+      2. The "How It Works" numbered list — must mirror action.yml's
+         actual behavior. action.yml currently delegates the review,
+         auto-commit, and inline-comment posting to
+         `anthropics/claude-code-action@v1`; if that underlying action is
+         swapped out or its inputs change (`plugins`,
+         `plugin_marketplaces`, `track_progress`, `use_commit_signing`,
+         `bot_name`, `claude_args`), this list is wrong.
 
       3. The "Usage" section's example workflow — must stay aligned with
          .github/workflows/example.yml. The concurrency group, the
-         `if: github.actor != 'deepwork-action[bot]'` self-trigger guard,
-         the permissions block, the checkout config, and the action
-         invocation must all match. If example.yml is updated, README's
-         snippet must be updated to match (or vice versa).
+         permissions block (including `id-token: write`), the checkout
+         config, and the action invocation must all match. If example.yml
+         is updated, README's snippet must be updated to match (or vice
+         versa).
 
       4. The "Security" section — claims about which underlying action is
-         used (`anthropics/claude-code-base-action`), the
-         `--dangerously-skip-permissions` flag, and the `deepwork-action[bot]`
-         identity. These must match the current action.yml.
+         used (`anthropics/claude-code-action@v1`) and the
+         `deepwork-action[bot]` identity. These must match the current
+         action.yml's `uses:` line and `bot_name:` input.
 
       5. The "Caching" section — claims caching of `.deepwork/tmp` keyed on
          PR number. If action.yml's cache step changes its path or key
          strategy, this section is wrong.
 
-      6. The "Review Comments" section — describes one inline comment per
-         changed file in the Files Changed tab. If
-         scripts/post-review-comments.py changes its comment grouping
-         strategy, this is wrong.
+      6. The "Review Comments" section — describes inline comments posted
+         via the native `mcp__github_inline_comment__create_inline_comment`
+         MCP tool provided by `anthropics/claude-code-action`. If the
+         upstream action changes the tool name or if we switch how comments
+         are posted, this is wrong.
 
-      7. The auto-commit guarantees — README claims Claude does NOT
-         commit/push and that the workflow handles it. This is enforced by
-         prompts/review.txt's "Important" section. If review.txt's contract
-         changes, the README description may need to change too.
+      7. The "no self-trigger guard needed" claim — relies on GitHub's
+         built-in rule that GITHUB_TOKEN-pushed commits do not retrigger
+         workflows. If action.yml starts pushing via a PAT or GitHub App
+         token instead, the README claim becomes false and the example
+         workflow will need a guard added back.
 
       ## High-risk drift points in CLAUDE.md
 
-      8. The "End-to-end flow" numbered list (steps 1-6) must mirror
-         action.yml's composite step list in order. CLAUDE.md names specific
-         arguments like `--dangerously-skip-permissions`,
-         `plugin_marketplaces`, and `prompts/review.txt`; verify each still
-         appears in action.yml.
-
-      9. The "State files crossing process boundaries" table must match the
-         actual JSON schema in prompts/review.txt
-         (`{"file", "line", "description", "reason"}`) AND the function names
-         `load_changes_by_file` and `build_comment_body` in
-         scripts/post-review-comments.py. If review.txt's schema changes or
-         either function is renamed, the table is wrong.
-
-      10. The "Self-trigger guard" section's literal `deepwork-action[bot]`
-          must match action.yml's commit step user.name AND the
-          `if: github.actor != 'deepwork-action[bot]'` guard in
-          .github/workflows/example.yml. All three must stay synchronized.
+      8. The "End-to-end flow" numbered list must mirror action.yml's
+         actual composite steps in order. CLAUDE.md names specific inputs
+         passed to `anthropics/claude-code-action@v1`
+         (`plugin_marketplaces`, `plugins`, `track_progress`,
+         `use_commit_signing`, `bot_name`, `claude_args`); verify each is
+         still set in action.yml.
+
+      9. The "What used to be here and isn't anymore" section — describes
+         what was deleted in the rewrite (the 7-step composite,
+         `scripts/post-review-comments.py`, `/tmp/deepwork_changes.json`,
+         `claude-code-base-action@beta`). This section should stay stable
+         unless someone adds those things back. If any of those deleted
+         artifacts reappear in the repo, the section is a LIE and must be
+         updated.
+
+      10. The "Self-trigger guard" section asserts that GITHUB_TOKEN pushes
+          don't retrigger. If action.yml's push path ever switches to a
+          non-GITHUB_TOKEN credential (PAT, GitHub App token), this
+          assertion becomes false.
 
       11. The "Repository layout" bullet list must reflect the actual
-          top-level directories. Adding a new top-level directory (e.g.,
-          `lib/`, `tests/`) means CLAUDE.md's layout section is incomplete.
-
-      12. CLAUDE.md asserts the entire action is defined in "action.yml, one
-          prompt file, and one Python script". Adding a second prompt file
-          or script makes that claim wrong.
+          top-level files and directories. Verify that action.yml,
+          prompts/review.txt, .github/workflows/example.yml, .deepwork/
+          (and in particular .deepwork/review/ but NOT .deepwork/tmp/),
+          and .deepreview are all still present and still described
+          correctly. If `scripts/` reappears, or a new top-level directory
+          is added, this list is incomplete.
+
+      12. The "prompt contract" section enumerates 4 essential guarantees
+          about prompts/review.txt. Verify that review.txt still
+          (a) starts with `/review`, (b) enforces CI-mode rules including
+          never using AskUserQuestion, (c) instructs use of
+          `mcp__github_inline_comment__create_inline_comment` with
+          `confirmed: true` for per-change comments, (d) tells Claude not
+          to run `git commit`/`git push` directly. If review.txt diverges
+          from any of these, the CLAUDE.md section is wrong.
additional_context: unchanged_matching_files: true diff --git a/.github/workflows/example.yml b/.github/workflows/example.yml index 5124f58..c72c7ea 100644 --- a/.github/workflows/example.yml +++ b/.github/workflows/example.yml @@ -12,21 +12,18 @@ concurrency: jobs: deepwork-review: runs-on: ubuntu-latest - # Don't re-run on commits pushed by the action itself - if: github.actor != 'deepwork-action[bot]' # Required permissions permissions: contents: write # push auto-fix commits to the PR branch - pull-requests: write # post inline PR review comments + pull-requests: write # post inline PR review comments and progress tracker + id-token: write # OIDC for anthropics/claude-code-action steps: - name: Checkout PR branch uses: actions/checkout@v4 with: - # Fetch full history so DeepWork can diff against the base branch - # Set too `100` or something high but safe if you have a huge git history - fetch-depth: 0 - # Use the merge ref so we operate on the PR's head commit + # claude-code-action handles PR diffing itself; shallow checkout is fine. + fetch-depth: 1 ref: ${{ github.event.pull_request.head.ref }} token: ${{ secrets.GITHUB_TOKEN }} diff --git a/CLAUDE.md b/CLAUDE.md index 13eabf6..3b8b9e4 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -4,47 +4,57 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co ## What this repo is -A **composite GitHub Action** (not a JS/TS/Docker action) that runs Claude Code with the DeepWork plugin against a Pull Request, applies every review suggestion as code changes, auto-commits them to the PR branch, and posts inline PR review comments. There is no build system, no package manager, no test suite — the entire action is defined in `action.yml`, one prompt file, and one Python script. +A **composite GitHub Action** (not a JS/TS/Docker action) that delegates most of the heavy lifting to [`anthropics/claude-code-action@v1`](https://github.com/anthropics/claude-code-action). 
It installs the DeepWork plugin, runs `/review` against the PR, applies every finding as real file edits, and uses the upstream action's native machinery to auto-commit changes and post inline PR comments. There is no build system, no package manager, no test suite. The action is defined by `action.yml` and a single prompt file. ## Repository layout -- `action.yml` — the composite action definition. All orchestration lives here. -- `prompts/review.txt` — the prompt fed to Claude Code via `claude-code-base-action`. Starts with `/review` to trigger the DeepWork plugin's review skill, then enforces CI-mode rules (no `AskUserQuestion`, apply every finding, write change log to `/tmp/deepwork_changes.json`). -- `scripts/post-review-comments.py` — runs **after** Claude finishes. Reads `/tmp/deepwork_changes.json`, diffs against the base branch, and POSTs a single PR review with one inline comment per changed file via `gh api`. +- `action.yml` — the composite action definition. Three composite steps: cache restore, load the prompt file into a step output, invoke `anthropics/claude-code-action@v1`. +- `prompts/review.txt` — the prompt fed to Claude. Starts with `/review` to trigger the DeepWork plugin's review skill, then enforces CI-mode rules (no `AskUserQuestion`, apply every finding, iterate until clean, post inline comments via `mcp__github_inline_comment__create_inline_comment`). - `.github/workflows/example.yml` — reference workflow showing how downstream repos consume this action. Not a CI workflow for *this* repo. -- `.deepwork/` — DeepWork plugin's local state (`job.schema.json`, `tmp/status`). The `tmp/` subdirectory is what the action caches between runs. +- `.deepwork/` — DeepWork plugin's local state. Only `.deepwork/review/` is source; `.deepwork/tmp/` is the cache directory (gitignored) restored from GitHub Actions cache at runtime. +- `.deepreview` — this repo's own review rules, so the action dogfoods itself. 
-## End-to-end flow (read this before changing anything) +## End-to-end flow -The action's steps in `action.yml` form a pipeline that hands state between three different processes via well-known files. Breaking any link silently degrades the action — most failures here are silent because the Python step is non-fatal. +The action is now thin. In order, `action.yml` runs: -1. **Checkout + cache restore** — the consuming workflow checks out the PR head branch with `fetch-depth: 0`. The action restores `.deepwork/tmp` from the GitHub Actions cache, keyed on PR number. This is how already-passed reviews skip re-running on subsequent commits (the major token saver called out in the README). -2. **Fetch base branch** — `git fetch origin --depth=1` so the diff in step 5 has something to compare against. Failure here is logged but non-fatal. -3. **Cleanup** — deletes any stale `/tmp/deepwork_changes.json` from a previous run on the same runner. -4. **Run Claude Code** — invokes `anthropics/claude-code-base-action@beta` with: - - `prompt_file: prompts/review.txt` +1. **Restore DeepWork review cache** via `actions/cache@v4`. Path is `.deepwork/tmp`, keyed on PR number + run id. The DeepWork plugin uses this directory to remember which reviews have already passed on a PR, so subsequent commits skip re-running already-passed checks — the main token-cost saver. +2. **Load review prompt** into a step output. Reads `prompts/review.txt` via bash heredoc into `${{ steps.load_prompt.outputs.content }}`. This exists because `anthropics/claude-code-action` only has a `prompt` input (no `prompt_file`), so we have to inline the text via an output. +3. 
**Run DeepWork Review** — invokes `anthropics/claude-code-action@v1` with: - `plugin_marketplaces: https://github.com/Unsupervisedcom/deepwork.git` - `plugins: deepwork@deepwork-plugins` - - `claude_args: --dangerously-skip-permissions --model --max-turns ` - Claude is expected to (a) modify files in the working tree and (b) append entries to `/tmp/deepwork_changes.json` describing each change. Claude must **not** commit or push — that's step 5's job. -5. **Commit & push** — runs as identity `deepwork-action[bot] `. Detects "no changes" by checking `git diff`, `git diff --cached`, AND untracked files; sets `changes_made` output accordingly. Pushes via a token-rewritten remote URL. -6. **Post inline review comments** — only runs if `changes_made == 'true'`. Executes `scripts/post-review-comments.py`, which reads `/tmp/deepwork_changes.json`, generates per-file comment bodies (with a diff-stats fallback if a file isn't in the JSON), and POSTs a single review with `event: COMMENT` and one comment per file. + - `prompt:` the review.txt content plus a header with the repo and PR number + - `claude_args: --model --max-turns ` + - `track_progress: true` → live "Claude Code is reviewing..." comment on the PR + - `use_commit_signing: false` → Claude uses plain `git commit` / `git push` for auto-fixes + - `bot_name: 'deepwork-action[bot]'` + +The upstream `claude-code-action` then: +- Installs the DeepWork plugin from the marketplace URL. +- Spawns Claude Code, which runs `/review`, reads `.deepreview` rules, dispatches reviewers in parallel, applies findings as real file edits. +- Commits and pushes those edits to the PR branch automatically (no custom commit step here). +- Posts inline PR comments for each change via the native `mcp__github_inline_comment__create_inline_comment` MCP tool. 
+ +## What used to be here and isn't anymore + +Before the rewrite to `claude-code-action@v1`, `action.yml` had seven composite steps: install uv, restore cache, fetch base branch, prepare review run (`rm -f /tmp/deepwork_changes.json`), run `claude-code-base-action@beta`, commit & push, and a custom `scripts/post-review-comments.py` that diffed the PR and posted inline comments. The old commit `bc66f07 "proper plugin install"` configured the base action with `plugin_marketplaces`/`plugins`/`claude_args` inputs that never existed on `claude-code-base-action` in any published release — the plugin never actually installed in CI, and Claude was running the built-in `/review` slash command on default Sonnet with default permissions. Switching to `anthropics/claude-code-action@v1` (which has real `plugins`/`plugin_marketplaces`/`claude_args` inputs) let us delete all of that. If you see any stale references to `/tmp/deepwork_changes.json`, the custom commit step, `scripts/post-review-comments.py`, or `claude-code-base-action` in docs or code, they are leftovers — delete them. ## Self-trigger guard -The example workflow uses `if: github.actor != 'deepwork-action[bot]'` at the **job level** to prevent the auto-fix commit from re-triggering the workflow. This guard is the only thing keeping the action from looping. Any change to the bot identity in step 5 of `action.yml` must be matched in the example workflow's `if` condition and in any documentation that references the actor name. +There isn't one, and there doesn't need to be. GitHub Actions' built-in rule: **events triggered by the default `GITHUB_TOKEN` do not create new workflow runs.** Since `claude-code-action` pushes auto-fix commits using the `GITHUB_TOKEN` we pass in, those pushes do not re-trigger the `pull_request` workflow. No `if: github.actor != '...'` guard required. 
The example workflow previously had one, but it was misconfigured (checked for `deepwork-action[bot]` when the actual actor for GITHUB_TOKEN pushes is `github-actions[bot]`) and unnecessary to begin with. + +If you ever switch the push path to use a Personal Access Token or a GitHub App token instead of `GITHUB_TOKEN`, the re-trigger protection disappears and you will need an explicit guard matching whichever bot name those credentials resolve to. -## State files crossing process boundaries +## The prompt contract -Three pieces of state flow between independent processes — keep them in sync when modifying any one of them: +`prompts/review.txt` is the production prompt that ships to Claude in CI. Treat it as a critical file — review it strictly whenever it changes. Its essential guarantees: -| File | Written by | Read by | Purpose | -|---|---|---|---| -| `/tmp/deepwork_changes.json` | Claude (per `prompts/review.txt`) | `scripts/post-review-comments.py` | Per-change descriptions for inline comments. Schema: `{"changes": [{"file", "line", "description", "reason"}]}`. The Python script is tolerant of missing/malformed entries and falls back to diff stats. | -| `.deepwork/tmp/` | DeepWork plugin (inside Claude Code) | GitHub Actions cache (next run) | Review pass/fail state per PR — enables incremental review across commits. | -| `/tmp/deepwork_review_payload.json` | `post-review-comments.py` | `gh api ... --input` | Transient; just the request body for the GitHub PR review API. | +1. Claude runs `/review` (the DeepWork plugin's skill, not Claude Code's built-in). +2. CI mode rules: never `AskUserQuestion`, apply every finding autonomously, iterate until clean, emit "No review rules configured." and stop if no `.deepreview` rules exist. +3. For each substantive change, post an inline PR comment via `mcp__github_inline_comment__create_inline_comment` with `confirmed: true`, anchored to the changed line, describing what and why. +4. 
Do not run `git commit` / `git push` — the upstream action handles it. -If you change the JSON schema in `prompts/review.txt`, you must update `load_changes_by_file` and `build_comment_body` in `scripts/post-review-comments.py` to match. The prompt is the contract. +If you change the prompt, update the drift checks in `.deepreview`'s `update_action_surface_docs` rule and this CLAUDE.md section to match. ## Versioning the action @@ -59,8 +69,8 @@ git push origin v1 --force The README and example workflow both pin `@v1`, so this is the contract consumers rely on — don't change them to pin a specific `1.x.y` without also updating this section. -Release automation is planned (see the release-automation work that should land after the initial `.deepreview` suite). Until that lands, the `v1` tag is moved manually on every merge to main. If you see `v1` lagging behind `main`, that's a bug — move it. +Release automation is planned. Until it lands, the `v1` tag is moved manually on every merge to main. If you see `v1` lagging behind `main`, that's a bug — move it. ## Testing changes -There is no local test harness. To validate changes end-to-end you must push a branch and open a PR in a repo that consumes this action (pinning the action to your branch via `Unsupervisedcom/deepwork-action@`). The Python script can be smoke-tested locally by setting `PR_NUMBER`, `GITHUB_REPOSITORY`, `GITHUB_BASE_REF` and running it inside a real git checkout, but it will only succeed in posting comments if `gh` is authenticated against a real PR. +There is no local test harness. To validate changes end-to-end you must push a branch and open a PR in a repo that consumes this action (pinning to your branch via `Unsupervisedcom/deepwork-action@`). This repo dogfoods itself via `.github/workflows/example.yml`, so any PR opened against this repo also exercises the action on its own changes. 
diff --git a/README.md b/README.md index e1e27a2..6c9f180 100644 --- a/README.md +++ b/README.md @@ -1,14 +1,13 @@ # deepwork-action -A prebuilt GitHub Action that runs [Claude Code](https://docs.anthropic.com/en/docs/claude-code) on a Pull Request with the [DeepWork](https://github.com/Unsupervisedcom/deepwork) plugin installed, triggers the `/review` skill, auto-commits all review-driven improvements back to the PR branch, and posts inline PR review comments explaining each change. +A prebuilt GitHub Action that runs [Claude Code](https://docs.anthropic.com/en/docs/claude-code) on a Pull Request with the [DeepWork](https://github.com/Unsupervisedcom/deepwork) plugin installed, triggers the `/review` skill, auto-commits every review-driven improvement back to the PR branch, and posts inline PR review comments explaining each change. ## How It Works -1. **DeepWork plugin install** — The action installs the DeepWork plugin from the marketplace using Claude Code's native plugin system, loading all review skills, hooks, and MCP server configuration automatically. -2. **DeepWork review** — Claude Code runs the `/review` skill, which reads your `.deepreview` config files to discover review rules, diffs the PR branch, and dispatches parallel review agents scoped to exactly the right files. -3. **Apply changes** — Claude applies every suggested improvement (bugs, style, performance, security, docs, refactoring) without asking for confirmation. -4. **Auto-commit** — All file changes are committed back to the PR branch under the `deepwork-action[bot]` identity. -5. **Inline PR comments** — A GitHub PR review is posted with one inline comment per changed file, describing what was changed and why, so your team can review each improvement. +1. **Cache restore** — Restores the DeepWork plugin's per-PR review state from GitHub Actions cache so already-passed reviews are not re-run on subsequent commits. +2. 
**DeepWork review via Claude Code Action** — Invokes [`anthropics/claude-code-action@v1`](https://github.com/anthropics/claude-code-action) with `plugins: deepwork@deepwork-plugins` and `plugin_marketplaces: https://github.com/Unsupervisedcom/deepwork.git`, then runs the `/review` skill against the PR. The skill reads your `.deepreview` config files, dispatches parallel review agents scoped to exactly the right files, and applies every finding. +3. **Auto-commit** — `claude-code-action` commits Claude's file changes back to the PR branch automatically under the `deepwork-action[bot]` identity. +4. **Inline PR comments** — Claude posts one inline PR comment per substantive change via the native `mcp__github_inline_comment__create_inline_comment` tool. A live progress comment (`track_progress: true`) tracks the review as it runs. ## Prerequisites @@ -33,17 +32,16 @@ concurrency: jobs: deepwork-review: runs-on: ubuntu-latest - # Don't re-run on commits pushed by the action itself - if: github.actor != 'deepwork-action[bot]' permissions: contents: write # push auto-fix commits to the PR branch - pull-requests: write # post inline PR review comments + pull-requests: write # post inline PR review comments and progress tracker + id-token: write # OIDC for anthropics/claude-code-action steps: - name: Checkout PR branch uses: actions/checkout@v4 with: - fetch-depth: 0 + fetch-depth: 1 ref: ${{ github.event.pull_request.head.ref }} token: ${{ secrets.GITHUB_TOKEN }} @@ -54,6 +52,8 @@ jobs: github_token: ${{ secrets.GITHUB_TOKEN }} ``` +No self-trigger guard is needed: commits pushed by the action via `GITHUB_TOKEN` do not re-trigger `pull_request` workflow runs (GitHub's built-in rule). + ## Inputs | Input | Required | Default | Description | @@ -79,17 +79,17 @@ If no `.deepreview` rules are configured in the repository, the action exits cle ## Review Comments -After pushing the auto-fix commit, the action posts a GitHub PR review with inline comments on each changed file. 
The comments appear in the **Files Changed** tab and describe what was changed and why, so your team can accept, request modifications, or revert individual changes as needed. +Each substantive change Claude makes is explained by an inline PR comment anchored to the changed line, posted via the native GitHub inline-comment MCP tool provided by `anthropics/claude-code-action`. Comments appear in the **Files Changed** tab so your team can accept, request modifications, or revert individual changes as needed. ## Caching -Review state is cached per PR using GitHub Actions cache, keyed on the PR number. This means already-passed reviews are not re-run when you push new commits to the same PR — only code that has changed since the last review is re-evaluated. THIS IS A MAJOR TOKEN COST SAVER!!! +Review state is cached per PR in `.deepwork/tmp` using GitHub Actions cache, keyed on the PR number. Already-passed reviews are not re-run when you push new commits to the same PR — only code that has changed since the last review is re-evaluated. **This is a major token cost saver.** ## Security -- Claude Code is installed and run via the official [`anthropics/claude-code-base-action`](https://github.com/anthropics/claude-code-base-action). -- The action runs with `--dangerously-skip-permissions` in a sandboxed GitHub Actions runner. It has no access to secrets beyond what you explicitly provide. -- Auto-fix commits are pushed under the `deepwork-action[bot]` identity. The example workflow includes `if: github.actor != 'deepwork-action[bot]'` at the job level so the action never triggers itself recursively. +- Claude Code runs via the official [`anthropics/claude-code-action@v1`](https://github.com/anthropics/claude-code-action). +- Auto-fix commits are pushed under the `deepwork-action[bot]` identity. Since those commits are pushed with `GITHUB_TOKEN`, they do not re-trigger the workflow (GitHub's built-in rule). 
+- The action runs in a sandboxed GitHub Actions runner with only the secrets you explicitly pass through. ## License diff --git a/action.yml b/action.yml index f81fca9..f121a28 100644 --- a/action.yml +++ b/action.yml @@ -29,11 +29,6 @@ inputs: runs: using: 'composite' steps: - - name: Install uv - uses: astral-sh/setup-uv@v5 - with: - version: 'latest' - - name: Restore DeepWork review cache uses: actions/cache@v4 with: @@ -45,70 +40,32 @@ runs: restore-keys: | deepwork-review-pr-${{ github.event.pull_request.number }}- - - name: Fetch base branch for git diff - shell: bash - run: | - BASE_REF="${{ github.event.pull_request.base.ref }}" - if [ -n "$BASE_REF" ]; then - git fetch origin "$BASE_REF" --depth=1 || \ - echo "Warning: could not fetch base branch '$BASE_REF'; diff detection may be incomplete" - fi - - - name: Prepare review run + - name: Load review prompt + id: load_prompt shell: bash run: | - # Clean up any leftover changes file from a previous run. - rm -f /tmp/deepwork_changes.json + { + echo 'content<> "$GITHUB_OUTPUT" - - name: Run DeepWork review with Claude Code - uses: anthropics/claude-code-base-action@beta - env: - GH_TOKEN: ${{ inputs.github_token }} + - name: Run DeepWork Review + uses: anthropics/claude-code-action@v1 with: anthropic_api_key: ${{ inputs.anthropic_api_key }} - prompt_file: ${{ github.action_path }}/prompts/review.txt + github_token: ${{ inputs.github_token }} plugin_marketplaces: https://github.com/Unsupervisedcom/deepwork.git plugins: deepwork@deepwork-plugins - claude_args: >- - --dangerously-skip-permissions + track_progress: true + use_commit_signing: false + bot_name: 'deepwork-action[bot]' + prompt: | + REPO: ${{ github.repository }} + PR NUMBER: ${{ github.event.pull_request.number }} + COMMIT MESSAGE FOR AUTO-FIXES: ${{ inputs.commit_message }} + + ${{ steps.load_prompt.outputs.content }} + claude_args: | --model ${{ inputs.model }} --max-turns ${{ inputs.max_turns }} - - - name: Commit and push changes - id: commit 
- shell: bash - env: - GITHUB_TOKEN: ${{ inputs.github_token }} - run: | - git config user.name "deepwork-action[bot]" - git config user.email "deepwork-action[bot]@users.noreply.github.com" - - # Check for any modified, added, or deleted tracked files - if git diff --quiet && git diff --cached --quiet \ - && [ -z "$(git ls-files --others --exclude-standard)" ]; then - echo "No changes to commit." - echo "changes_made=false" >> "$GITHUB_OUTPUT" - exit 0 - fi - - echo "changes_made=true" >> "$GITHUB_OUTPUT" - - git add -A - git commit -m "${{ inputs.commit_message }}" - - # Authenticate push via the provided token. - REPO="${{ github.repository }}" - git remote set-url origin \ - "https://x-access-token:${GITHUB_TOKEN}@github.com/${REPO}.git" - git push - - - name: Post inline PR review comments - if: steps.commit.outputs.changes_made == 'true' && github.event.pull_request.number != '' - shell: bash - env: - GH_TOKEN: ${{ inputs.github_token }} - PR_NUMBER: ${{ github.event.pull_request.number }} - GITHUB_REPOSITORY: ${{ github.repository }} - GITHUB_BASE_REF: ${{ github.event.pull_request.base.ref }} - run: | - python3 "${{ github.action_path }}/scripts/post-review-comments.py" diff --git a/prompts/review.txt b/prompts/review.txt index 609f093..1db138b 100644 --- a/prompts/review.txt +++ b/prompts/review.txt @@ -1,38 +1,44 @@ /review You are running in a fully automated CI environment on a GitHub Pull Request. -There is NO human watching this session. Follow these critical rules at all times: - -## Automation Rules (MANDATORY) - -1. **NEVER use AskUserQuestion** — you are in CI mode; make every decision autonomously. -2. **Make ALL changes** suggested by the review findings — not just "obviously good" ones. - Apply every finding: bugs, style, performance, security, documentation, and refactoring. - When a finding offers multiple approaches, choose the best one yourself. -3. **Iterate** — after making changes, re-run the review until it comes back clean. -4. 
**If no `.deepreview` rules are configured** — output the message "No review rules configured." - and stop. Do not attempt to configure rules; that is the repository owner's responsibility. - -## Change Tracking (REQUIRED) - -For every file you modify, append an entry to `/tmp/deepwork_changes.json`. -Create the file with `{"changes": []}` if it does not yet exist. - -Each entry must follow this exact JSON structure: - -```json -{ - "file": "relative/path/to/changed/file", - "line": , - "description": "One-sentence description of what was changed", - "reason": "The review finding that prompted this change" -} -``` - -Write the final file when all changes are complete and the review passes. - -## Important - -- You have full permission to edit, create, and delete files in this repository. -- Do NOT commit changes yourself — the CI workflow handles git commit and push. -- Do NOT push changes yourself — the CI workflow handles git commit and push. +There is NO human watching this session. Follow these critical rules at all times. + +## Automation rules (MANDATORY) + +1. **NEVER use AskUserQuestion** — you are in CI mode; make every decision + autonomously based on the .deepreview rules, the file contents, and the + project's conventions. +2. **Apply every finding** the review produces — bugs, style, performance, + security, documentation, and refactoring. Do not cherry-pick only the + "obviously good" ones. When a finding offers multiple valid approaches, + choose the best one yourself. +3. **Iterate** — after applying changes, re-run the review until it comes + back clean. +4. **If no `.deepreview` rules are configured** — output the message + "No review rules configured." and stop. Do not attempt to configure + rules; that is the repository owner's responsibility. 
+ +## Posting inline PR comments + +For every substantive change you make, post an inline PR comment via the +`mcp__github_inline_comment__create_inline_comment` MCP tool, with +`confirmed: true`, anchored to the most relevant line of the new file +content. The comment body should describe: + +- **What** you changed (one sentence) +- **Why** — the specific .deepreview finding or rule that prompted it + +One inline comment per distinct change is ideal. Group mechanical repeats +(e.g., renaming the same symbol in 10 places) under a single comment on +the most representative line. + +Do NOT post free-form review text as chat messages or top-level PR +comments for individual findings — use inline comments so each change +is anchored to the code it describes. + +## Commits and pushes + +The wrapping GitHub Action commits and pushes your file changes back to +the PR branch automatically. You do NOT need to run `git commit`, +`git push`, or any `git` command yourself. Just edit the files; the +action takes care of the rest. diff --git a/scripts/post-review-comments.py b/scripts/post-review-comments.py deleted file mode 100644 index 7308abf..0000000 --- a/scripts/post-review-comments.py +++ /dev/null @@ -1,204 +0,0 @@ -#!/usr/bin/env python3 -""" -Post inline PR review comments for each file changed by the DeepWork review. - -Reads /tmp/deepwork_changes.json (written by Claude) for per-change descriptions. -Falls back to a diff-based summary when the file is absent or an entry is missing. -Posts a single GitHub PR review with one inline comment per changed file. 
-""" - -from __future__ import annotations - -import json -import os -import re -import subprocess -import sys -from pathlib import Path -from typing import Any - - -# --------------------------------------------------------------------------- -# Helpers -# --------------------------------------------------------------------------- - -def run(cmd: list[str], **kwargs) -> subprocess.CompletedProcess: - return subprocess.run(cmd, capture_output=True, text=True, **kwargs) - - -def get_head_sha() -> str: - return run(["git", "rev-parse", "HEAD"]).stdout.strip() - - -def get_changed_files(base_ref: str) -> list[str]: - """Return files changed between the base branch and HEAD.""" - # Try the exact remote ref first (available when the base branch was fetched). - for ref in (f"origin/{base_ref}", base_ref, "HEAD~1"): - result = run(["git", "diff", ref, "HEAD", "--name-only", "--diff-filter=ACMR"]) - if result.returncode == 0 and result.stdout.strip(): - return [f for f in result.stdout.splitlines() if f.strip()] - return [] - - -def get_diff(file_path: str, base_ref: str) -> str: - for ref in (f"origin/{base_ref}", base_ref, "HEAD~1"): - result = run(["git", "diff", ref, "HEAD", "--", file_path]) - if result.returncode == 0 and result.stdout.strip(): - return result.stdout - return "" - - -def first_changed_line(diff: str) -> int: - """ - Return the line number (in the new file) of the first added line. 
- Parses unified diff hunk headers: @@ -old +new,count @@ - """ - current_new = 0 - for line in diff.splitlines(): - if line.startswith("@@"): - m = re.search(r"\+(\d+)", line) - if m: - current_new = int(m.group(1)) - elif line.startswith("+") and not line.startswith("+++"): - return max(current_new, 1) - elif not line.startswith("-") and not line.startswith("\\"): - current_new += 1 - return 1 - - -def count_added_lines(diff: str) -> int: - return sum( - 1 - for line in diff.splitlines() - if line.startswith("+") and not line.startswith("+++") - ) - - -# --------------------------------------------------------------------------- -# Load Claude's change summary (optional) -# --------------------------------------------------------------------------- - -def load_changes_by_file() -> dict[str, list[dict[str, Any]]]: - changes_path = Path("/tmp/deepwork_changes.json") - if not changes_path.exists(): - return {} - try: - data = json.loads(changes_path.read_text()) - by_file: dict[str, list[dict]] = {} - for entry in data.get("changes", []): - fp = entry.get("file", "").lstrip("./") - by_file.setdefault(fp, []).append(entry) - return by_file - except (json.JSONDecodeError, OSError) as exc: - print(f"Warning: could not parse /tmp/deepwork_changes.json: {exc}", file=sys.stderr) - return {} - - -# --------------------------------------------------------------------------- -# Build comment body for a file -# --------------------------------------------------------------------------- - -def build_comment_body( - file_path: str, - diff: str, - changes: list[dict[str, Any]], -) -> str: - if changes: - bullets = "\n".join( - f"- **{c.get('description', 'Change applied')}**" - + (f"\n *{c.get('reason', '')}*" if c.get("reason") else "") - for c in changes - ) - return ( - "🤖 **DeepWork Review** applied the following changes:\n\n" - + bullets - ) - # Fallback: diff statistics - added = count_added_lines(diff) - return ( - f"🤖 **DeepWork Review** applied {added} line(s) of 
changes to this file " - f"based on review findings." - ) - - -# --------------------------------------------------------------------------- -# Main -# --------------------------------------------------------------------------- - -def main() -> None: - pr_number = os.environ.get("PR_NUMBER", "") - repo = os.environ.get("GITHUB_REPOSITORY", "") - # GITHUB_BASE_REF is set automatically by GitHub Actions for pull_request events. - base_ref = os.environ.get("GITHUB_BASE_REF", "main") - - if not pr_number or not repo: - print("PR_NUMBER or GITHUB_REPOSITORY not set; skipping review comments.", file=sys.stderr) - sys.exit(0) - - commit_sha = get_head_sha() - changed_files = get_changed_files(base_ref) - - if not changed_files: - print("No changed files found between base branch and HEAD; nothing to comment on.") - sys.exit(0) - - changes_by_file = load_changes_by_file() - - inline_comments: list[dict[str, Any]] = [] - for file_path in changed_files: - diff = get_diff(file_path, base_ref) - if not diff.strip(): - continue - - line_number = first_changed_line(diff) - normalised = file_path.lstrip("./") - file_changes = changes_by_file.get(normalised, []) or changes_by_file.get(file_path, []) - body = build_comment_body(file_path, diff, file_changes) - - inline_comments.append({ - "path": file_path, - "line": line_number, - "side": "RIGHT", - "body": body, - }) - - if not inline_comments: - print("No inline comments to post.") - sys.exit(0) - - # Build the review payload - review_body = ( - f"🤖 **DeepWork automated review** applied changes to " - f"{len(inline_comments)} file(s).\n\n" - "Review the inline comments below for details on each change." 
- ) - review_payload: dict[str, Any] = { - "commit_id": commit_sha, - "body": review_body, - "event": "COMMENT", - "comments": inline_comments, - } - - payload_path = Path("/tmp/deepwork_review_payload.json") - payload_path.write_text(json.dumps(review_payload, indent=2)) - - result = run([ - "gh", "api", - f"repos/{repo}/pulls/{pr_number}/reviews", - "--method", "POST", - "--input", str(payload_path), - ]) - - if result.returncode != 0: - print(f"Error posting PR review: {result.stderr}", file=sys.stderr) - # Non-fatal: the changes are already committed; just warn. - sys.exit(0) - - print( - f"Posted PR review with {len(inline_comments)} inline comment(s) " - f"on PR #{pr_number}." - ) - - -if __name__ == "__main__": - main() From b751619f99a889d25a3b5dc6d2090ee57c076aa0 Mon Sep 17 00:00:00 2001 From: Noah Horton Date: Wed, 8 Apr 2026 18:45:53 -0600 Subject: [PATCH 04/10] Split dogfood from example: self-review.yml uses ./ MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The previous self-review loop couldn't test action.yml edits: the only workflow (.github/workflows/example.yml) pinned `uses: Unsupervisedcom/deepwork-action@v1`, which resolves to the v1 tag on main — not the PR branch. Every PR just kept running the old broken base-action composite from main, and committed a fresh output.txt execution log to the PR branch via the old commit step. Split the two roles: - examples/deepwork-review.yml — pristine copy-pasteable reference for external consumers. Pins @v1. Lives outside .github/workflows/ so GitHub doesn't auto-execute it (it's documentation, not CI). - .github/workflows/self-review.yml — this repo's own CI. `uses: ./` so the PR branch's own action.yml is exercised. This is how action.yml edits actually get tested before being tagged. README now links to examples/deepwork-review.yml. CLAUDE.md Repository Layout and Testing Changes sections are updated to explain the split. 
.deepreview's update_action_surface_docs rule now tracks examples/deepwork-review.yml instead of the old .github path. Co-Authored-By: Claude Opus 4.6 (1M context) --- .deepreview | 6 ++-- .github/workflows/self-review.yml | 36 +++++++++++++++++++ CLAUDE.md | 7 ++-- README.md | 2 +- .../deepwork-review.yml | 0 5 files changed, 45 insertions(+), 6 deletions(-) create mode 100644 .github/workflows/self-review.yml rename .github/workflows/example.yml => examples/deepwork-review.yml (100%) diff --git a/.deepreview b/.deepreview index 3d7de06..d8e3d9e 100644 --- a/.deepreview +++ b/.deepreview @@ -53,7 +53,7 @@ update_action_surface_docs: include: - "action.yml" - "prompts/review.txt" - - ".github/workflows/example.yml" + - "examples/deepwork-review.yml" - "README.md" - "CLAUDE.md" review: @@ -85,7 +85,7 @@ update_action_surface_docs: `bot_name`, `claude_args`), this list is wrong. 3. The "Usage" section's example workflow — must stay aligned with - .github/workflows/example.yml. The concurrency group, the + examples/deepwork-review.yml. The concurrency group, the permissions block (including `id-token: write`), the checkout config, and the action invocation must all match. If example.yml is updated, README's snippet must be updated to match (or vice @@ -136,7 +136,7 @@ update_action_surface_docs: 11. The "Repository layout" bullet list must reflect the actual top-level files and directories. Verify that action.yml, - prompts/review.txt, .github/workflows/example.yml, .deepwork/ + prompts/review.txt, examples/deepwork-review.yml, .deepwork/ (and in particular .deepwork/review/ but NOT .deepwork/tmp/), and .deepreview are all still present and still described correctly. 
If `scripts/` reappears, or a new top-level directory diff --git a/.github/workflows/self-review.yml b/.github/workflows/self-review.yml new file mode 100644 index 0000000..f675820 --- /dev/null +++ b/.github/workflows/self-review.yml @@ -0,0 +1,36 @@ +name: DeepWork Self-Review + +# This repo dogfoods its own action on every PR. Unlike examples/deepwork-review.yml +# (the copy-pasteable reference for consumers, which pins @v1), this workflow uses +# `uses: ./` so it exercises the action.yml from the PR branch itself — otherwise +# PRs could never test changes to the action before they are tagged as v1.x.y. + +on: + pull_request: + types: [opened, synchronize] + +concurrency: + group: deepwork-self-review-${{ github.event.pull_request.number }} + cancel-in-progress: true + +jobs: + deepwork-review: + runs-on: ubuntu-latest + permissions: + contents: write # push auto-fix commits to the PR branch + pull-requests: write # post inline PR review comments and progress tracker + id-token: write # OIDC for anthropics/claude-code-action + + steps: + - name: Checkout PR branch + uses: actions/checkout@v4 + with: + fetch-depth: 1 + ref: ${{ github.event.pull_request.head.ref }} + token: ${{ secrets.GITHUB_TOKEN }} + + - name: Run DeepWork Review (from PR branch) + uses: ./ + with: + anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} + github_token: ${{ secrets.GITHUB_TOKEN }} diff --git a/CLAUDE.md b/CLAUDE.md index 3b8b9e4..08c3583 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -10,7 +10,8 @@ A **composite GitHub Action** (not a JS/TS/Docker action) that delegates most of - `action.yml` — the composite action definition. Three composite steps: cache restore, load the prompt file into a step output, invoke `anthropics/claude-code-action@v1`. - `prompts/review.txt` — the prompt fed to Claude. 
Starts with `/review` to trigger the DeepWork plugin's review skill, then enforces CI-mode rules (no `AskUserQuestion`, apply every finding, iterate until clean, post inline comments via `mcp__github_inline_comment__create_inline_comment`). -- `.github/workflows/example.yml` — reference workflow showing how downstream repos consume this action. Not a CI workflow for *this* repo. +- `examples/deepwork-review.yml` — reference workflow showing how downstream repos consume this action (pins `Unsupervisedcom/deepwork-action@v1`). Lives outside `.github/workflows/` so GitHub doesn't auto-execute it — it's documentation, not CI. +- `.github/workflows/self-review.yml` — this repo's own CI. Runs the action against its own PRs using `uses: ./` so the PR branch's `action.yml` is exercised (not the published `v1` tag). Without this split, a PR that edits `action.yml` could never test the edit before it gets tagged. - `.deepwork/` — DeepWork plugin's local state. Only `.deepwork/review/` is source; `.deepwork/tmp/` is the cache directory (gitignored) restored from GitHub Actions cache at runtime. - `.deepreview` — this repo's own review rules, so the action dogfoods itself. @@ -73,4 +74,6 @@ Release automation is planned. Until it lands, the `v1` tag is moved manually on ## Testing changes -There is no local test harness. To validate changes end-to-end you must push a branch and open a PR in a repo that consumes this action (pinning to your branch via `Unsupervisedcom/deepwork-action@`). This repo dogfoods itself via `.github/workflows/example.yml`, so any PR opened against this repo also exercises the action on its own changes. +There is no local test harness. This repo dogfoods itself via `.github/workflows/self-review.yml`, which runs the action against its own PRs using `uses: ./` so edits to `action.yml` are exercised from the PR branch (not the published `v1` tag). Any PR opened against this repo also runs the action on its own changes. 
+ +To validate changes from an *external* consumer's perspective (i.e., the `uses: Unsupervisedcom/deepwork-action@...` path), push a branch and open a PR in a downstream repo pinning to your branch via `Unsupervisedcom/deepwork-action@`. diff --git a/README.md b/README.md index 6c9f180..4322811 100644 --- a/README.md +++ b/README.md @@ -16,7 +16,7 @@ A prebuilt GitHub Action that runs [Claude Code](https://docs.anthropic.com/en/d ## Usage -Create a workflow file such as `.github/workflows/deepwork-review.yml`: +Create a workflow file such as `.github/workflows/deepwork-review.yml` (a copy of [`examples/deepwork-review.yml`](examples/deepwork-review.yml) in this repo): ```yaml name: DeepWork Review diff --git a/.github/workflows/example.yml b/examples/deepwork-review.yml similarity index 100% rename from .github/workflows/example.yml rename to examples/deepwork-review.yml From fcb9163ae1c39c84f7daf7276a838e05e7cf7182 Mon Sep 17 00:00:00 2001 From: Noah Horton Date: Wed, 8 Apr 2026 18:58:25 -0600 Subject: [PATCH 05/10] Unblock autofix: skip permissions, bump max_turns to 100 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit First real run of claude-code-action@v1 from the PR branch got blocked on both permissions and turn budget: - The upstream action configures a read-only allowedTools list for pull_request events (Glob, Grep, LS, Read, git add/commit/push, update_claude_comment, CI tools). No Edit/Write/MultiEdit/Task or mcp__github_inline_comment__create_inline_comment. Claude tried to edit files, hit permission_denials_count: 2, and burned turns fighting denials. Pass --dangerously-skip-permissions via claude_args to bypass the allowlist — this now actually works because the real claude-code-action (unlike the base action) accepts claude_args. - Hit error_max_turns at 51 turns. DeepWork /review dispatches parallel sub-agent Tasks for each rule, and each reviewer eats turn budget. 
Default max_turns=50 is tight for multi-rule repos. Bump to 100. Co-Authored-By: Claude Opus 4.6 (1M context) --- README.md | 2 +- action.yml | 5 +++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 4322811..caca610 100644 --- a/README.md +++ b/README.md @@ -61,7 +61,7 @@ No self-trigger guard is needed: commits pushed by the action via `GITHUB_TOKEN` | `anthropic_api_key` | ✅ | — | Anthropic API key for Claude Code | | `github_token` | ✅ | — | GitHub token with `contents: write` and `pull-requests: write` | | `model` | ❌ | `claude-opus-4-6` | Claude model to use | -| `max_turns` | ❌ | `50` | Maximum agentic turns for Claude Code | +| `max_turns` | ❌ | `100` | Maximum agentic turns for Claude Code | | `commit_message` | ❌ | `chore: apply DeepWork review suggestions` | Commit message for auto-committed changes | ## What Gets Changed diff --git a/action.yml b/action.yml index f121a28..2f613a0 100644 --- a/action.yml +++ b/action.yml @@ -18,9 +18,9 @@ inputs: required: false default: 'claude-opus-4-6' max_turns: - description: 'Maximum number of agentic turns for Claude Code' + description: 'Maximum number of agentic turns for Claude Code. DeepWork /review dispatches parallel sub-agents per rule so each reviewer eats a turn budget; 100 is comfortable for repos with a handful of rules, raise for larger rule suites.' 
required: false - default: '50' + default: '100' commit_message: description: 'Commit message for auto-committed review changes' required: false @@ -69,3 +69,4 @@ runs: claude_args: | --model ${{ inputs.model }} --max-turns ${{ inputs.max_turns }} + --dangerously-skip-permissions From 12abc1f50f86df39d6cd9b66ec281a74d112ab7f Mon Sep 17 00:00:00 2001 From: "deepwork-action[bot]" <41898282+deepwork-action[bot]@users.noreply.github.com> Date: Thu, 9 Apr 2026 01:06:00 +0000 Subject: [PATCH 06/10] chore: apply DeepWork review suggestions - prompts/review.txt: Add false-positive escape valve to rule 2; add two-cycle convergence limit to rule 3; add rule 5 prohibiting git write commands; add example inline comment format; strengthen commits/pushes section - CLAUDE.md: Document --dangerously-skip-permissions in end-to-end flow; update prompt contract to reflect strengthened rules - .deepwork/review/python_conventions.md: Remove stale references to deleted scripts/post-review-comments.py - .deepreview: Fix stale "change-tracking JSON contract" reference; add tool-unavailable fallback to suggest_new_reviews - .gitignore: Generalize scripts/__pycache__/ to __pycache__/ Co-authored-by: Noah Horton --- .deepreview | 5 ++-- .deepwork/review/python_conventions.md | 33 +++++++++++--------------- .gitignore | 2 +- CLAUDE.md | 7 +++--- prompts/review.txt | 24 +++++++++++++++---- 5 files changed, 41 insertions(+), 30 deletions(-) diff --git a/.deepreview b/.deepreview index d8e3d9e..b80356b 100644 --- a/.deepreview +++ b/.deepreview @@ -45,7 +45,7 @@ prompt_best_practices: Note: prompts/review.txt is the production prompt that this action ships to Claude Code in CI. It is a CRITICAL file for this repo — review it strictly, especially the "Automation Rules" section, the - change-tracking JSON contract, and the no-commit/no-push instructions. + inline-comment posting instructions, and the no-git-write-commands rule. 
update_action_surface_docs: description: "Keep README.md and CLAUDE.md in sync with action.yml, the production prompt, and the example workflow." @@ -222,7 +222,8 @@ suggest_new_reviews: 1. Call get_configured_reviews to see all currently configured review rules. Also call get_named_schemas to see existing DeepSchemas. - Understand what's already covered. + Understand what's already covered. If these tools are not available, + read the `.deepreview` file(s) directly instead. 2. For each change, consider: - Did this change introduce a type of issue a review rule could catch? - Is there a pattern likely to recur? diff --git a/.deepwork/review/python_conventions.md b/.deepwork/review/python_conventions.md index 43c326c..782709a 100644 --- a/.deepwork/review/python_conventions.md +++ b/.deepwork/review/python_conventions.md @@ -1,8 +1,10 @@ # Python Conventions -Conventions for Python code in this repository, observed from -`scripts/post-review-comments.py`. Keep this short and actionable — it's -a reference for reviewers, not an exhaustive style guide. +Conventions for Python code in this repository. Keep this short and +actionable — it's a reference for reviewers, not an exhaustive style +guide. (Originally derived from `scripts/post-review-comments.py`, which +has since been deleted. The conventions remain valid for any future Python +files added to the repo.) ## Module structure @@ -32,10 +34,8 @@ a reference for reviewers, not an exhaustive style guide. ## Functions and structure -- Prefer small, named top-level functions over inline blocks. The current - script puts each step (`run`, `get_head_sha`, `get_diff`, `first_changed_line`, - `count_added_lines`, `load_changes_by_file`, `build_comment_body`, `main`) - in its own function. +- Prefer small, named top-level functions over inline blocks. Each + logical step should be its own function. - Keep `main()` as the orchestration entry point. Wire it via `if __name__ == "__main__": main()`. 
- Don't introduce dataclasses or classes unless there's actual state to @@ -48,10 +48,9 @@ a reference for reviewers, not an exhaustive style guide. - Read paths via `pathlib.Path`, not `open(string)`. - Read environment variables with `os.environ.get("NAME", default)` — never raw `os.environ["NAME"]` for inputs that might be missing. -- For non-fatal failures (e.g., the GitHub API call in - `post-review-comments.py`), print a warning to `sys.stderr` and - `sys.exit(0)` so the calling step doesn't fail. Hard-fail with `sys.exit(1)` - only for actual broken state. +- For non-fatal failures, print a warning to `sys.stderr` and + `sys.exit(0)` so the calling CI step doesn't fail. Hard-fail with + `sys.exit(1)` only for actual broken state. ## Strings and formatting @@ -61,19 +60,15 @@ a reference for reviewers, not an exhaustive style guide. ## Error handling - Catch specific exception classes, not bare `except:` or `except Exception:`. - Existing code uses `except (json.JSONDecodeError, OSError) as exc:` — keep - exception lists narrow. + Keep exception lists narrow (e.g., `except (json.JSONDecodeError, OSError) as exc:`). - Print warnings with the exception value: `print(f"Warning: ... {exc}", file=sys.stderr)`. ## Comments and docstrings - Function docstrings are triple-quoted, one-line summaries unless the - function does something subtle (e.g., the diff hunk parser in - `first_changed_line` has a multi-line docstring explaining the format - it parses). -- Inline comments explain *why*, not *what*. The current script has very - few inline comments — that's correct; the function names carry the - meaning. + function does something subtle that justifies a multi-line docstring. +- Inline comments explain *why*, not *what*. Let function names carry the + meaning; keep inline comments sparse. - Section banners (`# ----` blocks) are the exception: they're structural, not explanatory. 
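The input-handling and error-handling conventions above can be sketched as follows. This is a minimal illustration, not code from this repository: `read_setting` and `load_json_object` are hypothetical helper names chosen for the example, and the fallback of returning an empty dict is an assumption about what "non-fatal" should mean for a caller.

```python
from __future__ import annotations

import json
import os
import sys
from pathlib import Path
from typing import Any


def read_setting(name: str, default: str = "") -> str:
    """Return environment variable `name`, or `default` when it is unset."""
    # .get() with a default avoids a KeyError for inputs that might be missing.
    return os.environ.get(name, default)


def load_json_object(path: Path) -> dict[str, Any]:
    """Return the JSON object stored at `path`, or {} if it cannot be read."""
    try:
        data = json.loads(path.read_text())
    except (json.JSONDecodeError, OSError) as exc:
        # Non-fatal: warn on stderr and fall back so the CI step keeps going.
        print(f"Warning: could not load {path}: {exc}", file=sys.stderr)
        return {}
    # Anything other than a JSON object is treated as empty state.
    return data if isinstance(data, dict) else {}
```

A `main()` entry point would typically call these helpers and decide between `sys.exit(0)` and `sys.exit(1)`; it is left out here so the sketch stays focused on the narrow exception list and the `pathlib` / `os.environ.get` usage.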
diff --git a/.gitignore b/.gitignore index ca698a9..29ceae6 100644 --- a/.gitignore +++ b/.gitignore @@ -1,4 +1,4 @@ -scripts/__pycache__/ +__pycache__/ # DeepWork review state — restored from GitHub Actions cache at runtime, # regenerated by every /review run. Not source. diff --git a/CLAUDE.md b/CLAUDE.md index 08c3583..e4278a5 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -25,7 +25,7 @@ The action is now thin. In order, `action.yml` runs: - `plugin_marketplaces: https://github.com/Unsupervisedcom/deepwork.git` - `plugins: deepwork@deepwork-plugins` - `prompt:` the review.txt content plus a header with the repo and PR number - - `claude_args: --model --max-turns ` + - `claude_args: --model --max-turns --dangerously-skip-permissions` - `track_progress: true` → live "Claude Code is reviewing..." comment on the PR - `use_commit_signing: false` → Claude uses plain `git commit` / `git push` for auto-fixes - `bot_name: 'deepwork-action[bot]'` @@ -51,9 +51,10 @@ If you ever switch the push path to use a Personal Access Token or a GitHub App `prompts/review.txt` is the production prompt that ships to Claude in CI. Treat it as a critical file — review it strictly whenever it changes. Its essential guarantees: 1. Claude runs `/review` (the DeepWork plugin's skill, not Claude Code's built-in). -2. CI mode rules: never `AskUserQuestion`, apply every finding autonomously, iterate until clean, emit "No review rules configured." and stop if no `.deepreview` rules exist. +2. CI mode rules: never `AskUserQuestion`, apply every finding autonomously (with a false-positive escape valve), iterate until clean or 2 cycles, emit "No review rules configured." and stop if no `.deepreview` rules exist. 3. For each substantive change, post an inline PR comment via `mcp__github_inline_comment__create_inline_comment` with `confirmed: true`, anchored to the changed line, describing what and why. -4. Do not run `git commit` / `git push` — the upstream action handles it. +4. 
Never run git write commands (`git commit`, `git push`, `git add`, etc.) — the upstream action handles all VCS operations. +5. When findings conflict, prefer correctness over style. If you change the prompt, update the drift checks in `.deepreview`'s `update_action_surface_docs` rule and this CLAUDE.md section to match. diff --git a/prompts/review.txt b/prompts/review.txt index 1db138b..38d69bc 100644 --- a/prompts/review.txt +++ b/prompts/review.txt @@ -11,12 +11,22 @@ There is NO human watching this session. Follow these critical rules at all time 2. **Apply every finding** the review produces — bugs, style, performance, security, documentation, and refactoring. Do not cherry-pick only the "obviously good" ones. When a finding offers multiple valid approaches, - choose the best one yourself. + choose the best one yourself. **Exception:** if a finding is a clear + false positive (applying it would introduce a bug or contradict another + finding), skip it and note why in the inline PR comment. When findings + conflict, prefer correctness over style. 3. **Iterate** — after applying changes, re-run the review until it comes - back clean. + back clean **or you have completed 2 full review-fix cycles**, whichever + comes first. If findings persist after 2 cycles, stop and post a + summary PR comment listing the unresolved findings. 4. **If no `.deepreview` rules are configured** — output the message "No review rules configured." and stop. Do not attempt to configure rules; that is the repository owner's responsibility. +5. **NEVER run git write commands** — do not run `git commit`, `git push`, + `git add`, `git checkout`, `git reset`, or any other git command that + modifies the working tree or history. The wrapping GitHub Action handles + all VCS operations. Read-only git commands (e.g., `git diff`, `git log`) + are fine. ## Posting inline PR comments @@ -32,6 +42,11 @@ One inline comment per distinct change is ideal. 
Group mechanical repeats (e.g., renaming the same symbol in 10 places) under a single comment on the most representative line. +Example inline comment body: + +> **Changed:** Added explicit iteration cap to the review loop. +> **Rule:** `prompt_best_practices` — unbounded loop without termination condition. + Do NOT post free-form review text as chat messages or top-level PR comments for individual findings — use inline comments so each change is anchored to the code it describes. @@ -39,6 +54,5 @@ is anchored to the code it describes. ## Commits and pushes The wrapping GitHub Action commits and pushes your file changes back to -the PR branch automatically. You do NOT need to run `git commit`, -`git push`, or any `git` command yourself. Just edit the files; the -action takes care of the rest. +the PR branch automatically. Just edit the files; the action takes care +of the rest. See rule 5 above — **never** run git write commands yourself. From a914139f906f83f6792831d3fc198cd139e26122 Mon Sep 17 00:00:00 2001 From: "deepwork-action[bot]" <41898282+deepwork-action[bot]@users.noreply.github.com> Date: Thu, 9 Apr 2026 01:07:19 +0000 Subject: [PATCH 07/10] chore: fix drift check #12 to match 5-guarantee prompt contract The previous commit added a 5th guarantee to CLAUDE.md's prompt contract section but did not update the .deepreview drift check that counts them. Co-authored-by: Noah Horton --- .deepreview | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/.deepreview b/.deepreview index b80356b..8c5ead3 100644 --- a/.deepreview +++ b/.deepreview @@ -142,14 +142,15 @@ update_action_surface_docs: correctly. If `scripts/` reappears, or a new top-level directory is added, this list is incomplete. - 12. The "prompt contract" section enumerates 4 essential guarantees + 12. The "prompt contract" section enumerates 5 essential guarantees about prompts/review.txt. 
Verify that review.txt still (a) starts with `/review`, (b) enforces CI-mode rules including never using AskUserQuestion, (c) instructs use of `mcp__github_inline_comment__create_inline_comment` with `confirmed: true` for per-change comments, (d) tells Claude not - to run `git commit`/`git push` directly. If review.txt diverges - from any of these, the CLAUDE.md section is wrong. + to run git write commands directly, (e) prefers correctness over + style when findings conflict. If review.txt diverges from any of + these, the CLAUDE.md section is wrong. additional_context: unchanged_matching_files: true From 04ac1dae6dd525faf7c79e0c08be740bea8633b9 Mon Sep 17 00:00:00 2001 From: Noah Horton Date: Wed, 8 Apr 2026 19:34:06 -0600 Subject: [PATCH 08/10] Enable debug output to investigate missing inline comments MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The first successful claude-code-action@v1 run made substantive autofix commits but never invoked mcp__github_inline_comment__create_inline_comment — or if it did, the comments never appeared on the PR. The upstream action's SDK wrapper hides Claude's tool calls by default (show_full_output=false) and buffers/classifies inline comments before posting them (classify_inline_comments=true), so we can't tell which path is at fault from the log alone. Flip both while we diagnose: - show_full_output: true — exposes Claude SDK tool calls in the runner log. WARNING: may expose secrets in tool output; only safe in trusted repos. Revert to false once the inline-comment path is working. - classify_inline_comments: false — post every inline comment Claude creates instead of buffering and classifying real-vs-probe. This rules out the classifier dropping them silently. 
Co-Authored-By: Claude Opus 4.6 (1M context) --- action.yml | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/action.yml b/action.yml index 2f613a0..badd11e 100644 --- a/action.yml +++ b/action.yml @@ -60,6 +60,13 @@ runs: track_progress: true use_commit_signing: false bot_name: 'deepwork-action[bot]' + # Debug aids while we iterate on getting inline comments working. + # show_full_output exposes Claude SDK tool calls in the runner log + # (including secrets — only safe in trusted repos). + # classify_inline_comments=false posts every inline comment Claude + # creates instead of buffering them and classifying real-vs-probe. + show_full_output: true + classify_inline_comments: false prompt: | REPO: ${{ github.repository }} PR NUMBER: ${{ github.event.pull_request.number }} From ec695dfd455568b29db1248b0c77b0b8ac57731d Mon Sep 17 00:00:00 2001 From: Noah Horton Date: Wed, 8 Apr 2026 20:10:28 -0600 Subject: [PATCH 09/10] Reverse rule 5: Claude commits its own edits via git tools MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The previous prompt rule 5 ("NEVER run git write commands — the wrapping GitHub Action handles all VCS operations") was based on a wrong mental model of anthropics/claude-code-action@v1. That action does NOT auto-commit; it pre-allows a small set of git tools and the upstream system prompt expects Claude to commit and push itself. Our rule 5 told Claude not to, so the most recent run made file edits, reported them as "applied" in the tracking comment, and never persisted them to the branch. Reverse it: Claude MUST commit and push using - git add - git commit -m "" - /home/runner/work/_actions/anthropics/claude-code-action/v1/scripts/git-push.sh origin HEAD Other changes in the same prompt rewrite: - Drop the entire "Posting inline PR comments" section. The upstream claude-code-action@v1 system prompt explicitly forbids creating new PR comments on pull_request events ("Never create new comments. 
Only update the existing comment using mcp__github_comment__update_claude_comment") — our prompt was fighting it. The mcp__github_inline_comment server isn't loaded in this action's context anyway. Lean into the tracking comment as the sole output surface. - Add a new "Reporting your work" section instructing Claude to use the upstream-managed tracking comment for per-rule findings and commit SHAs. action.yml: revert show_full_output and classify_inline_comments debug flags to upstream defaults. show_full_output=true was leaking secrets into public runner logs. README.md and CLAUDE.md: rewrite "Review Comments"/"How It Works" to describe the tracking-comment-only output surface and Claude's commit responsibility. Add a "Known issues" section to CLAUDE.md documenting the DeepWork plugin MCP server failing to start in the action's runner (plugin installs successfully but MCP server reports status: failed). Also document the PR file restoration security feature. .deepreview: update update_action_surface_docs drift checks #6 and #12 to reflect the new contract (single tracking comment, no inline; Claude commits via git tools, no "never run git" rule). .claude/settings.json: commit the plugin-enabled settings file generated by /plugin so the claude-code-action runner sees enabledPlugins after the file restoration step. Speculative MCP-loading fix; backed up by a research agent dispatched in parallel. 
Co-Authored-By: Claude Opus 4.6 (1M context) --- .claude/settings.json | 5 ++++ .deepreview | 30 ++++++++++++++-------- CLAUDE.md | 40 ++++++++++++++++++++++++++--- README.md | 8 +++--- action.yml | 7 ----- prompts/review.txt | 59 ++++++++++++++++++++----------------------- 6 files changed, 93 insertions(+), 56 deletions(-) create mode 100644 .claude/settings.json diff --git a/.claude/settings.json b/.claude/settings.json new file mode 100644 index 0000000..f0e3d39 --- /dev/null +++ b/.claude/settings.json @@ -0,0 +1,5 @@ +{ + "enabledPlugins": { + "deepwork@deepwork-plugins": true + } +} diff --git a/.deepreview b/.deepreview index 8c5ead3..3307c33 100644 --- a/.deepreview +++ b/.deepreview @@ -100,11 +100,13 @@ update_action_surface_docs: PR number. If action.yml's cache step changes its path or key strategy, this section is wrong. - 6. The "Review Comments" section — describes inline comments posted - via the native `mcp__github_inline_comment__create_inline_comment` - MCP tool provided by `anthropics/claude-code-action`. If the - upstream action changes the tool name or if we switch how comments - are posted, this is wrong. + 6. The "Review Comments" section — describes a SINGLE tracking + comment posted via `track_progress: true` (no per-line inline + comments — the upstream `claude-code-action@v1` system prompt + forbids creating new comments on `pull_request` events). If we + ever switch to a different output mechanism, or the upstream + action changes its policy and we start posting inline comments + again, this section is wrong. 7. The "no self-trigger guard needed" claim — relies on GitHub's built-in rule that GITHUB_TOKEN-pushed commits do not retrigger @@ -145,12 +147,18 @@ update_action_surface_docs: 12. The "prompt contract" section enumerates 5 essential guarantees about prompts/review.txt. 
Verify that review.txt still (a) starts with `/review`, (b) enforces CI-mode rules including - never using AskUserQuestion, (c) instructs use of - `mcp__github_inline_comment__create_inline_comment` with - `confirmed: true` for per-change comments, (d) tells Claude not - to run git write commands directly, (e) prefers correctness over - style when findings conflict. If review.txt diverges from any of - these, the CLAUDE.md section is wrong. + never using AskUserQuestion, (c) tells Claude to commit and + push its own edits using `git add`, `git commit`, and the + upstream action's `git-push.sh` helper script, + (d) directs Claude's output to the upstream tracking comment + via `mcp__github_comment__update_claude_comment` and forbids + creating new PR comments, (e) prefers correctness over style + when findings conflict. If review.txt diverges from any of + these, the CLAUDE.md section is wrong. Also verify that + review.txt does NOT contain any of the now-removed instructions + (the old "NEVER run git write commands" rule, or the old + `mcp__github_inline_comment__create_inline_comment` posting + instructions) — if it does, those are stale leftovers. additional_context: unchanged_matching_files: true diff --git a/CLAUDE.md b/CLAUDE.md index e4278a5..d891e5f 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -31,9 +31,9 @@ The action is now thin. In order, `action.yml` runs: - `bot_name: 'deepwork-action[bot]'` The upstream `claude-code-action` then: -- Installs the DeepWork plugin from the marketplace URL. +- Installs the DeepWork plugin from the marketplace URL (note: the plugin's slash commands install fine but the plugin's MCP server currently fails to connect inside the action's runner — see "Known issues" below). - Spawns Claude Code, which runs `/review`, reads `.deepreview` rules, dispatches reviewers in parallel, applies findings as real file edits. -- Commits and pushes those edits to the PR branch automatically (no custom commit step here). 
+- Pre-allows a small set of git tools (`git add`, `git commit`, `git-push.sh`, `git rm`) so Claude can commit and push its own edits. **Claude commits and pushes itself; the action does NOT auto-commit.** This was a footgun in an earlier draft of this repo where `prompts/review.txt` told Claude to never run git commands — Claude obediently made edits and then never saved them. The upstream system prompt expects Claude to commit; our prompt now matches.
 - Posts inline PR comments for each change via the native `mcp__github_inline_comment__create_inline_comment` MCP tool.
 
 ## What used to be here and isn't anymore
@@ -52,12 +52,44 @@ If you ever switch the push path to use a Personal Access Token or a GitHub App
 
 1. Claude runs `/review` (the DeepWork plugin's skill, not Claude Code's built-in).
 2. CI mode rules: never `AskUserQuestion`, apply every finding autonomously (with a false-positive escape valve), iterate until clean or 2 cycles, emit "No review rules configured." and stop if no `.deepreview` rules exist.
-3. For each substantive change, post an inline PR comment via `mcp__github_inline_comment__create_inline_comment` with `confirmed: true`, anchored to the changed line, describing what and why.
-4. Never run git write commands (`git commit`, `git push`, `git add`, etc.) — the upstream action handles all VCS operations.
+3. **Claude MUST commit and push** its file edits itself, using the pre-allowed `git add`, `git commit`, and `/home/runner/work/_actions/anthropics/claude-code-action/v1/scripts/git-push.sh origin HEAD` commands. The wrapping action does not auto-commit. Each iteration cycle should produce its own commit; never amend or force-push.
+4. The output surface is the **single tracking comment** managed by the upstream `claude-code-action` via `mcp__github_comment__update_claude_comment`. Claude must not create new PR comments or post free-form chat replies. The upstream action's system prompt explicitly forbids `Never create new comments. Only update the existing comment` — our prompt does not fight this.
 5. When findings conflict, prefer correctness over style.
 
 If you change the prompt, update the drift checks in `.deepreview`'s `update_action_surface_docs` rule and this CLAUDE.md section to match.
 
+## Known issues
+
+### DeepWork plugin MCP server fails to start inside `claude-code-action`
+
+When `anthropics/claude-code-action@v1` installs the DeepWork plugin in the runner, the plugin install reports success (`✓ Successfully installed: deepwork@deepwork-plugins`) **but the plugin's MCP server fails to connect**. The Claude Code session init reports:
+
+```json
+"mcp_servers": [
+  { "name": "plugin:deepwork:deepwork", "status": "failed" },
+  { "name": "github_comment", "status": "connected" },
+  { "name": "github_ci", "status": "connected" }
+]
+```
+
+There is no error message printed near the failure — silent. The plugin's slash commands DO work because `/review` is implemented as a skill file (prompt-style, no MCP needed), so reviews still run, BUT the MCP-provided tools (`get_configured_reviews`, `get_named_schemas`, `start_workflow`, `mark_review_as_passed`, the DeepSchema validation tools, the workflow orchestration tools) are all unavailable in CI. The reviews running today are a **degraded form**: file-edit-based, no quality gates, no DeepSchema validation, no workflow state machine. They produce useful autofixes but skip the structural integrity guarantees the full DeepWork pipeline provides.
+
+The same plugin works fine outside CI. Leading hypotheses (under investigation):
+
+1. `claude-code-action`'s `pull_request` security path restores `.claude/`, `.mcp.json`, `.claude.json`, `CLAUDE.md`, etc. from `origin/main` before running Claude — this could be wiping plugin MCP registration that the install step put down.
+2. `MCP_TIMEOUT` and `MCP_TOOL_TIMEOUT` env vars are set to empty strings in the runner env — empty values may be interpreted as zero rather than "use default".
+3. The plugin's MCP server has a startup dependency (network, filesystem path, env var) that exists in interactive use but not in the runner sandbox.
+
+If you can fix this upstream in DeepWork or in `claude-code-action`, do — it's the biggest functional gap in the action right now. Until then, the degraded MCP-less review still produces useful output and the tracking comment makes it visible.
+
+### `pull_request` file restoration
+
+Before each run, `claude-code-action` restores these files from `origin/main`: `.claude/`, `.mcp.json`, `.claude.json`, `.gitmodules`, `.ripgreprc`, `CLAUDE.md`, `CLAUDE.local.md`, `.husky`. This is a security feature against prompt injection from PR content (`PR head is untrusted` per the runner log). Practical consequences:
+
+- A PR that *adds* `CLAUDE.md` will run with no `CLAUDE.md` present in the working tree (because `origin/main` has none). The PR's CLAUDE.md is only visible to Claude via direct file reads, not via the auto-loaded context.
+- Shipping `.mcp.json` in the repo to wire up MCP servers is pointless — it gets overwritten on every run. Use the `mcp_config:` input on `claude-code-action` instead.
+- If you ever add a per-repo Claude Code settings file to this repo's tree, it won't take effect during the action's runs.
+
 ## Versioning the action
 
 This action is consumed via `Unsupervisedcom/deepwork-action@v1`. The `v1` tag is a **floating major-version tag** that always points at the latest commit on `main` — standard GitHub Actions convention (see `actions/checkout@v4`, etc.). Consumers pin `@v1` and expect it to track the freshest `1.x.y` automatically.
diff --git a/README.md b/README.md
index caca610..c60609c 100644
--- a/README.md
+++ b/README.md
@@ -6,8 +6,8 @@ A prebuilt GitHub Action that runs [Claude Code](https://docs.anthropic.com/en/d
 
 1. **Cache restore** — Restores the DeepWork plugin's per-PR review state from GitHub Actions cache so already-passed reviews are not re-run on subsequent commits.
 2. **DeepWork review via Claude Code Action** — Invokes [`anthropics/claude-code-action@v1`](https://github.com/anthropics/claude-code-action) with `plugins: deepwork@deepwork-plugins` and `plugin_marketplaces: https://github.com/Unsupervisedcom/deepwork.git`, then runs the `/review` skill against the PR. The skill reads your `.deepreview` config files, dispatches parallel review agents scoped to exactly the right files, and applies every finding.
-3. **Auto-commit** — `claude-code-action` commits Claude's file changes back to the PR branch automatically under the `deepwork-action[bot]` identity.
-4. **Inline PR comments** — Claude posts one inline PR comment per substantive change via the native `mcp__github_inline_comment__create_inline_comment` tool. A live progress comment (`track_progress: true`) tracks the review as it runs.
+3. **Commit & push** — Claude commits and pushes its file edits to the PR branch using the git tools the upstream action pre-allows. Commits are authored as `deepwork-action[bot]`.
+4. **Tracking comment** — `track_progress: true` produces a single live progress comment on the PR with checklisted phases (gather → review → apply → re-run → summary) and a per-rule findings summary including the commit SHAs the fixes landed in. This is the action's only output surface — there are no per-line inline comments (the upstream `claude-code-action@v1` system prompt explicitly forbids creating new comments on `pull_request` events; everything goes through the tracking comment).
 
 ## Prerequisites
 
@@ -79,7 +79,9 @@ If no `.deepreview` rules are configured in the repository, the action exits cle
 
 ## Review Comments
 
-Each substantive change Claude makes is explained by an inline PR comment anchored to the changed line, posted via the native GitHub inline-comment MCP tool provided by `anthropics/claude-code-action`. Comments appear in the **Files Changed** tab so your team can accept, request modifications, or revert individual changes as needed.
+The action posts a **single live tracking comment** on the PR (via `track_progress: true`) showing the review's progress through each phase and a structured per-rule findings summary at the end. The summary lists which findings were applied vs. skipped, with the commit SHAs the fixes landed in, so your team can review the resulting commits in the **Files Changed** tab and accept, request modifications, or revert individual changes as needed.
+
+There are no per-line inline review comments. The upstream `anthropics/claude-code-action@v1` system prompt explicitly forbids creating new comments on `pull_request` events for safety; all output flows through the tracking comment instead.
 
 ## Caching
 
diff --git a/action.yml b/action.yml
index badd11e..2f613a0 100644
--- a/action.yml
+++ b/action.yml
@@ -60,13 +60,6 @@ runs:
         track_progress: true
         use_commit_signing: false
         bot_name: 'deepwork-action[bot]'
-        # Debug aids while we iterate on getting inline comments working.
-        # show_full_output exposes Claude SDK tool calls in the runner log
-        # (including secrets — only safe in trusted repos).
-        # classify_inline_comments=false posts every inline comment Claude
-        # creates instead of buffering them and classifying real-vs-probe.
-        show_full_output: true
-        classify_inline_comments: false
         prompt: |
           REPO: ${{ github.repository }}
           PR NUMBER: ${{ github.event.pull_request.number }}
diff --git a/prompts/review.txt b/prompts/review.txt
index 38d69bc..7832fea 100644
--- a/prompts/review.txt
+++ b/prompts/review.txt
@@ -13,46 +13,43 @@ There is NO human watching this session. Follow these critical rules at all time
    "obviously good" ones. When a finding offers multiple valid approaches,
    choose the best one yourself. **Exception:** if a finding is a clear
    false positive (applying it would introduce a bug or contradict another
-   finding), skip it and note why in the inline PR comment. When findings
-   conflict, prefer correctness over style.
+   finding), skip it and note why. When findings conflict, prefer
+   correctness over style.
 3. **Iterate** — after applying changes, re-run the review until it comes
    back clean **or you have completed 2 full review-fix cycles**, whichever
-   comes first. If findings persist after 2 cycles, stop and post a
-   summary PR comment listing the unresolved findings.
+   comes first. If findings persist after 2 cycles, stop and note the
+   unresolved findings in your final tracking-comment summary.
 4. **If no `.deepreview` rules are configured** — output the message
    "No review rules configured." and stop. Do not attempt to configure
    rules; that is the repository owner's responsibility.
-5. **NEVER run git write commands** — do not run `git commit`, `git push`,
-   `git add`, `git checkout`, `git reset`, or any other git command that
-   modifies the working tree or history. The wrapping GitHub Action handles
-   all VCS operations. Read-only git commands (e.g., `git diff`, `git log`)
-   are fine.
 
-## Posting inline PR comments
+## Committing and pushing your changes
 
-For every substantive change you make, post an inline PR comment via the
-`mcp__github_inline_comment__create_inline_comment` MCP tool, with
-`confirmed: true`, anchored to the most relevant line of the new file
-content. The comment body should describe:
+The wrapping `anthropics/claude-code-action@v1` does NOT auto-commit. You
+MUST commit and push your file edits yourself, using the git tools the
+upstream action has pre-allowed:
 
-- **What** you changed (one sentence)
-- **Why** — the specific .deepreview finding or rule that prompted it
+- Stage your changes: `git add `
+- Commit with the message provided in the prompt header
+  (`COMMIT MESSAGE FOR AUTO-FIXES`):
+  `git commit -m ""`
+- Push to the PR branch using the upstream action's helper script:
+  `/home/runner/work/_actions/anthropics/claude-code-action/v1/scripts/git-push.sh origin HEAD`
 
-One inline comment per distinct change is ideal. Group mechanical repeats
-(e.g., renaming the same symbol in 10 places) under a single comment on
-the most representative line.
+If you make multiple cycles of edits (per rule 3), commit each cycle as
+its own commit so the history reflects what you did. Do not amend prior
+commits and do not force-push.
 
-Example inline comment body:
+## Reporting your work
 
-> **Changed:** Added explicit iteration cap to the review loop.
-> **Rule:** `prompt_best_practices` — unbounded loop without termination condition.
+Your output surface is the **single tracking comment** the upstream action
+manages via `mcp__github_comment__update_claude_comment`. The action's own
+system prompt instructs you to update this comment with your progress and
+findings — follow that guidance. Do NOT create new PR comments or post
+free-form review text as chat messages; the user will not see them.
 
-Do NOT post free-form review text as chat messages or top-level PR
-comments for individual findings — use inline comments so each change
-is anchored to the code it describes.
-
-## Commits and pushes
-
-The wrapping GitHub Action commits and pushes your file changes back to
-the PR branch automatically. Just edit the files; the action takes care
-of the rest. See rule 5 above — **never** run git write commands yourself.
+In your tracking-comment summary, include for each rule:
+- Which `.deepreview` rule was evaluated
+- The findings (count by severity)
+- Which findings were applied vs. skipped (and why, for skipped ones)
+- The commit SHA(s) the fixes landed in

From a845068897f1ec9e3071d03e981b65edbe65dcdb Mon Sep 17 00:00:00 2001
From: Noah Horton
Date: Wed, 8 Apr 2026 20:17:59 -0600
Subject: [PATCH 10/10] Document MCP-load failure investigation and park PR as draft
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Capture the full investigation of the DeepWork plugin MCP server failing to start inside anthropics/claude-code-action@v1, including:

- Symptom (plugin install reports success; mcp_servers init payload reports plugin:deepwork:deepwork status: failed; silent, no error message; same plugin works fine outside CI).
- Why slash commands still work (/review is a skill file, no MCP needed) vs. what's missing (get_configured_reviews, mark_review_as_passed, start_workflow, DeepSchema validation, the workflow state machine).
- Root-cause hypotheses ranked by probability:
  1. 70% — PR file restoration wipes plugin MCP registration
  2. 60% — No automatic plugin → session MCP merge path
  3. 30% — MCP_TIMEOUT/MCP_TOOL_TIMEOUT empty env vars
- Three open upstream issues that match our exact symptoms:
  - anthropics/claude-code-action#813 (silent MCP failures)
  - anthropics/claude-code-action#1004 (--mcp-config silently dropped)
  - anthropics/claude-code-action#95 (no plugin → session MCP merge path)
- Definitive diagnostic experiment to confirm root cause #1.
- Speculative fix logic for the .claude/settings.json file added in the rule-5 reversal commit (enables plugin at project scope; only effective for PRs opened after the file lands on main, because PR file restoration pulls from origin/main).
- BLOCKING status: PR parked as draft until upstream fixes land.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 CLAUDE.md | 33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index d891e5f..e55baa7 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -62,7 +62,9 @@ If you change the prompt, update the drift checks in `.deepreview`'s `update_act
 
 ### DeepWork plugin MCP server fails to start inside `claude-code-action`
 
-When `anthropics/claude-code-action@v1` installs the DeepWork plugin in the runner, the plugin install reports success (`✓ Successfully installed: deepwork@deepwork-plugins`) **but the plugin's MCP server fails to connect**. The Claude Code session init reports:
+**Status: BLOCKING. The action is in WIP / draft state pending upstream fixes.** Reviews still produce useful autofixes via the slash-command path, but the MCP-backed quality gates, DeepSchema validation, and workflow orchestration are all unavailable in CI. Until upstream fixes land in `anthropics/claude-code-action`, this repo's PR is parked as draft.
+
+**Symptom.** When `anthropics/claude-code-action@v1` installs the DeepWork plugin in the runner, the plugin install reports success (`✓ Successfully installed: deepwork@deepwork-plugins (scope: user)`) **but the plugin's MCP server reports `status: failed` in the Claude Code session init payload**:
 
 ```json
 "mcp_servers": [
@@ -72,15 +74,32 @@ When `anthropics/claude-code-action@v1` installs the DeepWork plugin in the runn
 ]
 ```
 
-There is no error message printed near the failure — silent. The plugin's slash commands DO work because `/review` is implemented as a skill file (prompt-style, no MCP needed), so reviews still run, BUT the MCP-provided tools (`get_configured_reviews`, `get_named_schemas`, `start_workflow`, `mark_review_as_passed`, the DeepSchema validation tools, the workflow orchestration tools) are all unavailable in CI. The reviews running today are a **degraded form**: file-edit-based, no quality gates, no DeepSchema validation, no workflow state machine. They produce useful autofixes but skip the structural integrity guarantees the full DeepWork pipeline provides.
+Silent failure — no stack trace, no `command not found`, no `could not start`. The two GitHub MCPs that `claude-code-action` ships natively connect fine; only the plugin's MCP fails. The same plugin works fine outside CI.
+
+**Why slash commands still work.** `/review` is implemented as a skill file (prompt-style, no MCP needed), so the review *runs* and produces autofix file edits. What's missing in CI is everything that depends on actual MCP tool calls: `mcp__plugin_deepwork_deepwork__get_configured_reviews`, `get_named_schemas`, `start_workflow`, `mark_review_as_passed`, `register_session_job`, the DeepSchema validation tools, the workflow state machine. The CI reviews are a **degraded form**: file-edit-based, no quality gates, no DeepSchema validation, no per-step quality reviewers.
+
+**Root-cause hypotheses (ranked by probability, from research):**
+
+1. **70% — PR file restoration wipes plugin MCP registration.** `claude-code-action`'s `pull_request` security path restores `.claude/`, `.mcp.json`, `.claude.json`, `.gitmodules`, `.ripgreprc`, `CLAUDE.md`, `CLAUDE.local.md`, `.husky` from `origin/main` before Claude initializes. The plugin install step runs *first* and writes its MCP registry entry to `~/.claude/.mcp.json` (or similar); the restoration step then erases that registry. The init payload reports `status: failed` because the entry is gone, not because the MCP itself crashed. Silent because there's no entry to report an error against.
+2. **60% — No automatic plugin → session MCP merge path.** Plugin packages are installed but their MCP server definitions aren't automatically merged into the MCP config the Claude CLI subprocess reads. The two `github_*` MCPs work because they're built into `claude-code-action` and explicitly wired up; third-party plugin MCPs have no documented integration path in CI. ([anthropics/claude-code-action#95](https://github.com/anthropics/claude-code-action/issues/95))
+3. **30% — `MCP_TIMEOUT`/`MCP_TOOL_TIMEOUT` empty env vars.** The runner env sets both to empty strings rather than to a numeric value or leaving them unset. Empty might be interpreted as `0` (instant timeout) rather than "use default".
+
+**Open upstream issues that match our exact symptoms:**
+
+- [anthropics/claude-code-action#813 — "Connection to MCPs fails without any logs"](https://github.com/anthropics/claude-code-action/issues/813) — silent MCP failures with zero error output even in debug mode. Same symptom.
+- [anthropics/claude-code-action#1004 — "--mcp-config file path in claude_args is silently dropped when action's inline JSON config is present"](https://github.com/anthropics/claude-code-action/issues/1004) — confirms the action has bugs that silently discard MCP config under specific conditions.
+- [anthropics/claude-code-action#95 — "Add mcp_config input that merges with existing mcp server"](https://github.com/anthropics/claude-code-action/issues/95) — feature request acknowledging there's no automated mechanism to merge plugin MCPs into the session config.
+
+**Definitive diagnostic experiment** (not yet run): in a debug PR, add a composite step *between* the plugin install and the `claude-code-action` invocation that dumps `~/.claude/.mcp.json`. Then, modify the action invocation (or wrap it) so a second dump fires immediately *after* `claude-code-action`'s file-restoration step but *before* Claude initializes. If the two JSONs differ — plugin entry present in dump 1, absent in dump 2 — file restoration is confirmed as the culprit. (Doing this from outside `claude-code-action` is awkward because the file restoration happens inside the action. May need a fork of the action or an `actions/cache@v4`-style trick to capture state between sub-steps.)
 
-The same plugin works fine outside CI. Leading hypotheses (under investigation):
+**Speculative pre-positioned fix in this repo: `.claude/settings.json`.** Committed as part of the rule-5 reversal commit, with `enabledPlugins.deepwork@deepwork-plugins: true`. The theory: if the plugin is installed at user scope but not enabled at the project level, Claude won't load its MCP for the session. With the settings file present, Claude *should* try to start the MCP. **However**: this fix can only take effect on PRs opened AFTER the file lands on `main`, because the PR-restoration step on the current PR will pull `.claude/` from `origin/main` (which doesn't yet have the file). On the in-flight PR the speculative fix is wiped before Claude sees it.
 
-1. `claude-code-action`'s `pull_request` security path restores `.claude/`, `.mcp.json`, `.claude.json`, `CLAUDE.md`, etc. from `origin/main` before running Claude — this could be wiping plugin MCP registration that the install step put down.
-2. `MCP_TIMEOUT` and `MCP_TOOL_TIMEOUT` env vars are set to empty strings in the runner env — empty values may be interpreted as zero rather than "use default".
-3. The plugin's MCP server has a startup dependency (network, filesystem path, env var) that exists in interactive use but not in the runner sandbox.
+**Until upstream fixes land**, the action is unusable for its intended purpose (full DeepWork review with quality gates). Either:
+- Wait for fixes to one of #813 / #1004 / #95 in `anthropics/claude-code-action`, AND/OR
+- File a follow-up issue against the DeepWork plugin describing the CI install pattern that fails, AND/OR
+- Implement the diagnostic experiment above and use the result to drive an upstream PR.
 
-If you can fix this upstream in DeepWork or in `claude-code-action`, do — it's the biggest functional gap in the action right now. Until then, the degraded MCP-less review still produces useful output and the tracking comment makes it visible.
+Don't merge this PR until at least the basic MCP-loads-in-CI path works end-to-end. The PR has been put back into draft state pending that fix.
 
 ### `pull_request` file restoration
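
Reviewer note: the diagnostic experiment described in the MCP known-issue entry comes down to comparing two snapshots of the MCP registry, one taken before `claude-code-action`'s file restoration and one taken after. The following is a minimal Python sketch of that comparison, not part of the patch. It assumes each dump is a JSON file with a top-level `mcpServers` object keyed by server name; that key, the function names, and the dump file names are illustrative assumptions (the patch itself only says the registry lives at `~/.claude/.mcp.json` "or similar").

```python
import json
from pathlib import Path


def server_names(config_path: Path) -> set[str]:
    """Return the MCP server names found in a config dump.

    Assumption: the dump has a top-level "mcpServers" object keyed by
    server name. A missing or unparsable file yields an empty set, which
    matches the "registry was erased" case the experiment looks for.
    """
    try:
        data = json.loads(config_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return set()
    servers = data.get("mcpServers", {}) if isinstance(data, dict) else {}
    return set(servers) if isinstance(servers, dict) else set()


def wiped_servers(before: Path, after: Path) -> set[str]:
    """Servers registered in the first dump but absent from the second."""
    return server_names(before) - server_names(after)


if __name__ == "__main__":
    # Hypothetical dump file names; the two dump steps would write these.
    lost = wiped_servers(Path("mcp-before.json"), Path("mcp-after.json"))
    if lost:
        print("Entries removed between dumps:", ", ".join(sorted(lost)))
    else:
        print("No entries removed between dumps.")
```

If the plugin's entry shows up in the removed set, that would support hypothesis 1; an empty result would point toward the other two hypotheses instead.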