ci: add AI-powered issue triage workflow by samyakLambda · Pull Request #14 · LambdaTest/kane-cli

samyakLambda · 2026-04-30T09:22:47Z

Summary

Adds a GitHub Actions workflow that uses Anthropic's Claude API to automatically classify every new and edited issue by priority (P0–P3) and area (cli, config, auth, execution, reporting, tms, docs), apply the matching labels, and post a single bot comment with a root-cause hypothesis and suggested next steps.

The bot comment is clearly marked as automated and not human-reviewed; maintainers can relabel, edit, or delete at any time. Adding the no-triage label opts an issue out of the workflow.

What's included

.github/workflows/issue-triage.yml — runs on issues: opened, reopened, edited. Skips bot authors and no-triage-labeled issues. Per-issue concurrency, 5-minute timeout, issues: write + contents: read only.
.github/scripts/triage.mjs — calls Claude API, validates output against a closed taxonomy (drops anything off-list), posts/updates a single bot comment keyed by , reconciles labels on re-runs. Uses prompt caching for the system prompt.
.github/scripts/labels.json + bootstrap-labels.mjs + .github/workflows/bootstrap-labels.yml — one-shot workflow_dispatch label provisioning. Idempotent (creates or patches color/description). Run this once after merge.
.github/AI_TRIAGE.md — maintainer-facing doc: setup, opt-out, model override, how to report bot mistakes.

Setup required after merge

Add an ANTHROPIC_API_KEY repository secret (Settings → Secrets and variables → Actions).
(Optional) Add an ANTHROPIC_TRIAGE_MODEL repository variable to override the default model. Default: claude-haiku-4-5-20251001 (fast, ~fractions of a cent per issue). Set to e.g. claude-sonnet-4-6 for higher-quality RCA.
Run Actions → Bootstrap triage labels → Run workflow to provision priority/P0..P3, area/cli|config|auth|execution|reporting|tms|docs, and no-triage.

Design choices

Existing labels untouched. The 10 default labels (bug, enhancement, question, etc.) are kept; the workflow only adds new priority/* and area/* namespaces, and infers an issue type for the comment without auto-applying a bug/enhancement label (humans still do that).
Idempotent on edits. Editing the issue body re-runs triage and updates the existing comment in place rather than spamming new ones. Labels are reconciled (old priority/* and area/* removed, new ones added).
Prompt-injection hardened. System prompt instructs the model to treat issue text as untrusted; output is JSON-only and is validated against a closed enum (any off-list value is dropped).
Cost-bounded. Haiku model + ephemeral prompt cache. Bot is skipped for issues authored by other bots.

Test plan

Maintainer adds ANTHROPIC_API_KEY secret.
Maintainer triggers Bootstrap triage labels workflow once and confirms 12 labels are created.
Open a test bug-style issue; verify priority/area labels are applied and a single triage comment is posted.
Edit the test issue body; verify the same comment is updated (not duplicated) and labels reconcile.
Add no-triage to a fresh issue; verify the workflow skips it.
(Optional) Override ANTHROPIC_TRIAGE_MODEL to a Sonnet model and confirm the comment footer reflects the new model.

🤖 Generated with Claude Code

Adds a GitHub Actions workflow that uses Anthropic's Claude API to automatically classify new and edited issues by priority and area, post a single bot comment with a root-cause hypothesis and suggested next steps, and apply the matching labels. - .github/workflows/issue-triage.yml: runs on issue open/reopen/edit, skips bots and issues labeled `no-triage`, per-issue concurrency, 5-minute timeout. - .github/scripts/triage.mjs: idempotent (single bot comment keyed by HTML marker, label reconciliation on re-runs), strict output validation against a closed taxonomy, prompt-cache enabled. - .github/scripts/labels.json + bootstrap-labels.mjs + workflow_dispatch workflow: one-shot label provisioning for priority/P0..P3, area/{cli,config,auth,execution,reporting,tms,docs}, and no-triage opt-out. - .github/AI_TRIAGE.md: maintainer-facing doc covering setup (ANTHROPIC_API_KEY secret + optional ANTHROPIC_TRIAGE_MODEL var), opt-out, and how to report bot mistakes. Default model is claude-haiku-4-5-20251001 (cheap, fast); override with the ANTHROPIC_TRIAGE_MODEL repo variable. Comments are clearly marked as automated and not human-reviewed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Issue-triage workflow only fires on opened/reopened/edited, so already-open issues are not triaged automatically. This adds a manual backfill that iterates issues and runs the same triage logic. - .github/scripts/lib.mjs: extract shared triageIssue() — Anthropic call, output sanitization, idempotent comment, label reconciliation. Both per-issue and backfill entry points consume it. - .github/scripts/triage.mjs: thin wrapper — reads env, fetches issue labels, delegates to lib. - .github/scripts/backfill.mjs: lists issues (skips PRs returned by the issues endpoint), filters by no-triage / already-priorityed, triages sequentially, prints a summary. Honors dry-run. - .github/workflows/triage-backfill.yml: workflow_dispatch with inputs (state, only_unlabeled, limit 1-500, dry_run). 30-minute timeout; backfill concurrency group prevents overlapping runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

samyakLambda · 2026-04-30T12:54:19Z

Pushed two follow-up commits to address backfilling existing issues:

ci: add backfill workflow for existing issues — the per-issue workflow only fires on opened/reopened/edited, so already-open issues wouldn't be triaged. This adds:
- .github/scripts/lib.mjs — extracts shared triageIssue() (Anthropic call, sanitization, idempotent comment, label reconciliation). Both triage.mjs and the new backfill consume it.
- .github/scripts/backfill.mjs — lists issues (skipping PRs), filters by no-triage / already-priority/*, triages sequentially, prints a summary.
- .github/workflows/triage-backfill.yml — workflow_dispatch with inputs: state (open/closed/all), only_unlabeled (default true), limit (1–500), dry_run. 30-min timeout, dedicated concurrency group.
docs: document backfill workflow inputs and usage — updated AI_TRIAGE.md with the backfill section.

The headline change: assistive, not authoritative

Codex flagged the original design as a no-ship: the bot deleted human-applied priority/* and area/* labels on every re-triage. Fixed by making label ownership human-first:

After first triage, the bot never overwrites human-applied priority/* or area/* labels on subsequent runs. The comment body still updates with the latest RCA.
A new force_relabel workflow input (default false) on the backfill workflow opts back into the destructive behavior with a logged warning.
triage-failed and needs-info markers are still cleared by the bot — those are bot-owned signals.

Other reviewer findings, all addressed

Security

Bot comment lookup checks both marker AND author (github-actions[bot]) — attacker can't redirect bot PATCH by forging the marker.
Actions SHA-pinned (checkout v4.3.1, setup-node v4.4.0).
Workflow root permissions: {}, per-job grants only.
Model output wrapped in ::stop-commands::<token> to neuter prompt-injected workflow commands.
Title/body clamped to 500/8000 chars.
edited trigger skipped when neither title nor body changed (label-only edits no longer burn API calls).

Correctness — schema failure now loud

Invalid model output throws SchemaError, applies triage-failed label, posts a clearly-marked failure comment, and exits the workflow non-zero. No more silent fallback to priority/P2 — 50% confidence.
Labels reconciled first, comment second (was reversed — bot was lying about labels on partial failure).
Label DELETE only swallows 404; other statuses logged.
triage.mjs reads from GITHUB_EVENT_PATH (no env-var fan-out, no extra GET).
Missing ANTHROPIC_API_KEY now exits 1 (was: silently green).
Anthropic 401/403 fatal → backfill aborts. 429/529 retry once with retry-after.
JSON parsing uses assistant-prefill { + balanced-brace scanner (was: greedy regex).
All fetch calls have 60s timeouts.

Quality

Empty/short issues short-circuit to needs-info (no model call, no hallucinated RCA).
next_steps stripped of leading markdown sigils (#/>/|/-) to prevent comment-body breakouts.
DEFAULT_MODEL centralised in lib.mjs.
Shared gh() reused by bootstrap-labels.mjs; per-label summary at the end.
New labels in labels.json: triage-failed, needs-info.
AI_TRIAGE.md rewritten — covers label-ownership, force_relabel, failure path, security hardening.

Diff for this commit: 9 files changed, +699 / −258. Total PR: +1138 / −0.

The PR is now ready for human review. After merge:

Add ANTHROPIC_API_KEY repo secret.
Run Bootstrap triage labels (creates 14 labels including triage-failed and needs-info).
Run AI Issue Triage (Backfill) with dry_run: true first.

`applyMarkerLabel` previously deleted every managed label except the new marker, including `priority/*` and `area/*`. That meant a reporter edit which downgraded an already-triaged issue to needs-info or schema-failed state would erase a maintainer's earlier label correction — exactly the sticky-human-label invariant this branch is supposed to enforce. Fix: limit `applyMarkerLabel` to toggling the two flag labels (`triage-failed` ↔ `needs-info`). priority/* and area/* are now left untouched on the failure and needs-info paths. Found by codex adversarial-review (2026-04-30). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

samyakLambda · 2026-04-30T13:54:35Z

Pushed 3cb2c6e to fix the label-loss path Codex flagged in the second adversarial pass.

What was broken

applyMarkerLabel (lib.mjs) previously deleted every managed label except the new marker — including priority/* and area/*. That meant: bot triages → maintainer corrects priority → reporter strips the body → workflow enters needs-info branch → maintainer's priority/* silently deleted. Same hazard on schema-fail.

This contradicted the explicit "human labels are sticky" contract added in the previous commit.

Fix

applyMarkerLabel now only toggles the two flag labels (triage-failed ↔ needs-info). priority/* and area/* are never touched on failure / needs-info paths.

13 lines changed.

Two findings from codex adversarial-review on the previous commit: [high] Bot couldn't refresh its own labels. After first triage, the bot-applied `priority/*` looked identical to a human-applied one, so `applyTriageLabels` treated it as human-managed and refused to update on later edits. Reporter could add critical repro details that bumped priority, comment body would refresh with the new classification, but the label stayed at the old value. The "human-sticky" contract degraded into "everything is sticky after first triage." [medium] Label decisions used a stale event snapshot. Workflow read labels at job start, then mutated 1-3s later after the Anthropic call returned. A maintainer relabeling during that window was invisible to the mutation logic. Fixes: 1. Embed the bot's last-applied set in the marker comment as ``. On re-runs, parse it and compare against the freshly-fetched managed labels: - sets equal => bot owns them, free to refresh. - sets differ => human (or another bot) touched, preserve. Failure and needs-info comments carry the previous applied set forward so a follow-up success can still recognise its own labels. 2. New `loadFreshState()` re-fetches issue labels and the existing bot comment in parallel right before any label/comment mutation. The `issue.labels` field passed in from the event payload is now used ONLY for the early `no-triage` skip check; mutation logic always uses fresh state. 3. The marker is updated to match what the bot just wrote — NOT what's currently on the issue. On preservation the marker stays frozen at the bot's previous applied set, so the next run still detects "human-touched" rather than claiming the human's labels as its own. 4. AI_TRIAGE.md rewritten to document the new ownership model. Found by codex adversarial-review (2026-04-30, third pass). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

samyakLambda · 2026-04-30T14:09:56Z

Pushed 70652d1 to fix both findings from the third adversarial pass.

[high] Bot couldn't refresh its own labels — fixed

Previously the bot's own priority/P3 looked identical to a human-applied one, so after first triage applyTriageLabels treated all priority/* as human-owned and refused to refresh. A reporter adding repro details that should bump priority would update the comment body but leave the label stale.

Fix: bot now embeds its last-applied set in the marker comment as . On re-runs, the bot parses this and compares against the freshly-fetched managed labels:

sets equal → bot owns them, free to refresh.
sets differ → human touched, preserve.

The marker stays frozen at the bot's previous set on preservation, so the next run still detects the human's correction as not-its-own.

Failure and needs-info comments carry the previous applied set forward so a transient model-output failure doesn't make the bot forget what it last applied.

[medium] Stale event snapshot — fixed

New loadFreshState() does parallel GET /issues/N + GET /issues/N/comments right before any label/comment mutation. The issue.labels field from the event payload is now used only for the early no-triage skip check; mutation logic always uses fresh state. Closes the TOCTOU window between event delivery and write.

Doc

AI_TRIAGE.md rewritten to document the new ownership model — concrete description of what the bot will and will not refresh, what the marker means, and when the marker stays frozen.

Diff: 2 files, +187 / −65 in this commit.

Behavior matrix after this commit

Initial state	Reporter edits	Bot's response
Fresh issue	(n/a)	Apply priority + areas, write marker
Bot-triaged, no human change	Body edited	Refresh labels if model output changed
Bot-triaged, maintainer corrected priority	Body edited	Preserve maintainer's priority. Comment still gets new RCA. Marker stays frozen.
Triage-failed	Body edited (fixes the issue)	Clear `triage-failed`, apply new labels, restore success comment
Backfill `force_relabel: true`	(n/a)	Overwrite human-applied labels (logged as warning)

Addresses the must-fix and high-leverage findings from the final combined review (code-reviewer + silent-failure-hunter + codex adversarial). Net: failure paths can't silently violate the contract, and the bot is no longer one typo or schema drift away from quietly mis-classifying issues. Must-fix: 1. validate() requires 1-3 areas, all in the closed enum, instead of silently filtering invalid values to an empty array. An empty/all- invalid `areas` field now throws SchemaError just like a bad priority — the bot no longer happily emits priority/Pn with no area/* routing label. 2. Failure paths re-check no-triage after loadFreshState. A maintainer adding `no-triage` while Anthropic was responding to a malformed prompt was previously visible only on the success path; now both schema-fail branches (and applyTriageLabels GhError) honour it. 3. applyTriageLabels GhError is routed through the same triage-failed recovery as SchemaError. Previously, a missing label in the repo (bootstrap workflow not run, maintainer renamed a label) would leave the issue partially mutated with no marker and a confusing red workflow run. 4. Anthropic 4xx (other than 429) is now fatal. A typo'd ANTHROPIC_TRIAGE_MODEL var would previously make the model API return 404 model_not_found, classed non-fatal, and a backfill of 100 issues would burn 100 calls all returning the same 404. Now the first 4xx aborts the loop. 5. loadFreshState detects issue-deleted (404/410) via a new IssueGoneError. triageIssue catches and exits cleanly with action `skipped:issue-gone` instead of spraying writes at a dead resource (which previously surfaced as confusing logs from the sequence of safeRemoveLabel 404s followed by an addLabels 404). High-leverage cheap fixes: 6. loadFreshState uses Promise.allSettled so failure context names the specific arm (issue GET vs comment list) instead of dropping one to the floor. 7. validate() strips HTML comments from summary/rca/next_steps so a model-emitted `` cannot win the marker regex on a future template change. 8. APPLIED_MARKER_RE is anchored to the start of the comment body (`^\r?\n`) — defence-in-depth alongside #7. Verified by smoke test that an attacker-prefix injection is correctly rejected. 10. postOrUpdateComment falls back from PATCH 404 to POST so a maintainer deleting the bot's comment mid-run results in a fresh comment, not a workflow failure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

samyakLambda · 2026-04-30T14:25:18Z

Pushed 9fa8dc0 to land items 1–5 (must-fix) plus 6, 7, 8, 10 (cheap, high-leverage) from the triple-reviewer punch list.

Must-fix

validate() requires 1–3 areas in the closed enum. Empty / all-invalid areas → SchemaError. No more silent priority/Pn with no area routing label. (Codex medium)
Failure paths re-check no-triage. Maintainer opt-out during the Anthropic round-trip is now honoured on schema-fail and label-mutation-fail paths, not just success. (Codex high, silent-failure M1)
applyTriageLabels errors → triage-failed path. A missing repo label (bootstrap not run, maintainer rename) no longer leaves the issue partially mutated with no marker. Same recovery as schema errors. (silent-failure H2, code-reviewer fix: install kane-cli via npm tarball instead of bare runner binary #1)
Anthropic 4xx (≠ 429) is fatal. Typo'd ANTHROPIC_TRIAGE_MODEL returning 404 model_not_found now aborts the backfill on the first issue instead of burning 100 calls. (silent-failure H3)
Issue-deleted is a typed IssueGoneError. loadFreshState requests [200, 404, 410] and surfaces deletion cleanly; triageIssue exits with skipped:issue-gone instead of spraying writes at a dead resource. (silent-failure C1)

High-leverage

Promise.allSettled in loadFreshState. Errors now name which arm (issue GET vs comment list) failed. (silent-failure C2)
HTML-comment strip in validate(). Model-emitted  in rca/summary/next_steps is removed before it reaches the comment body. (silent-failure H1)
APPLIED_MARKER_RE anchored to start of comment body. Smoke test verified: attacker-prefix injection rejected; genuine marker still parses. (silent-failure H1)
postOrUpdateComment PATCH 404 → POST. Maintainer deleting the bot's comment mid-run gracefully creates a fresh one. (code-reviewer URL Link for Docs & Bugs section is incorrect in CLI Help section #4)

160 insertions / 47 deletions in lib.mjs. Items 9, 11–15 (nits and edge-case docs) deferred to a follow-up unless reviewers flag them as blocking.

samyakLambda and others added 3 commits April 30, 2026 14:51

docs: document backfill workflow inputs and usage

5248012

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci: add AI-powered issue triage workflow#14

ci: add AI-powered issue triage workflow#14
samyakLambda wants to merge 7 commits intomainfrom
feat/ai-issue-triage

samyakLambda commented Apr 30, 2026

Uh oh!

samyakLambda commented Apr 30, 2026

Uh oh!

samyakLambda commented Apr 30, 2026

Uh oh!

samyakLambda commented Apr 30, 2026

Uh oh!

samyakLambda commented Apr 30, 2026

Uh oh!

samyakLambda commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

samyakLambda commented Apr 30, 2026

Summary

What's included

Setup required after merge

Design choices

Test plan

Uh oh!

samyakLambda commented Apr 30, 2026

Recommended one-time post-merge sequence

Uh oh!

samyakLambda commented Apr 30, 2026

The headline change: assistive, not authoritative

Other reviewer findings, all addressed

Uh oh!

samyakLambda commented Apr 30, 2026

What was broken

Fix

Uh oh!

samyakLambda commented Apr 30, 2026

[high] Bot couldn't refresh its own labels — fixed

[medium] Stale event snapshot — fixed

Doc

Behavior matrix after this commit

Uh oh!

samyakLambda commented Apr 30, 2026

Must-fix

High-leverage

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant