Skip to content

Consolidate AI review CI quality gate#3550

Open
masih wants to merge 20 commits into
mainfrom
masih/consolidate-ai-reviewers
Open

Consolidate AI review CI quality gate#3550
masih wants to merge 20 commits into
mainfrom
masih/consolidate-ai-reviewers

Conversation

@masih

@masih masih commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

Consolidate the AI reviewers into a single required CI job with workaround for merge queue.

masih added 2 commits June 4, 2026 19:16
Consolidate the AI reviewers into a single required CI job with
workaround for merge queue.
@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedJun 8, 2026, 2:20 PM

@masih masih marked this pull request as ready for review June 4, 2026 17:24
@cursor

cursor Bot commented Jun 4, 2026

Copy link
Copy Markdown

PR Summary

High Risk
The ECR workflow YAML is broken and would fail CI parsing on main/release pushes; the new gate also affects merge eligibility via commit statuses and App credentials.

Overview
Adds a new AI Review GitHub Actions workflow that publishes a single required commit status by aggregating Cursor Bugbot and Claude Code Review check runs on the PR head SHA. It runs on pull_request (non-draft), filtered check_run events, and merge_group; uses a GitHub App token, per-SHA concurrency, and maps conclusions to failure (blocking), success (all clear or advisory only), or pending (still running). Merge queue commits get an automatic success so the gate does not wait for reviewers that do not re-run there.

Also changes .github/workflows/ecr.yml by inserting the line BREAK AND SEE IF AI CATCHES THIS!!!! immediately after the workflow name, which invalidates the YAML and would break the ECR workflow until removed.

Reviewed by Cursor Bugbot for commit 74dd752. Bugbot is set up for automated code reviews on this repo. Configure here.

Comment thread .github/workflows/ai-review.yml Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c19330e836

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .github/workflows/ai-review.yml Outdated
with:
script: |
const sha = context.payload.pull_request.head.sha
const deadline = Date.now() + 8 * 60 * 1000

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Allow enough time for Claude reviews

Claude Code's Code Review docs state that reviews complete in 20 minutes on average, so this 8-minute deadline (with a 10-minute job timeout) will make normal PRs fail this required gate while Claude Code Review is still pending. In repositories where branch protection requires this consolidated check, those PRs remain blocked until a manual rerun or new push even though the review would later complete successfully.

Useful? React with 👍 / 👎.

name: AI Review

on:
pull_request:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Trigger the gate when draft PRs become ready

With a bare pull_request trigger, GitHub only runs this workflow for the default opened/synchronize/reopened activity types, not ready_for_review. For PRs opened as drafts, Claude's automatic review does not run until the PR is marked ready, so this gate can time out on the draft and publishing it won't start a new gate run; the required check stays failed until a new push or manual rerun.

Useful? React with 👍 / 👎.

@codecov

codecov Bot commented Jun 4, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.27%. Comparing base (b8473e7) to head (341b6f5).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3550      +/-   ##
==========================================
- Coverage   59.18%   58.27%   -0.92%     
==========================================
  Files        2222     2144      -78     
  Lines      183555   174144    -9411     
==========================================
- Hits       108639   101475    -7164     
+ Misses      65132    63645    -1487     
+ Partials     9784     9024     -760     
Flag Coverage Δ
sei-db 70.41% <ø> (ø)
sei-db-state-db ?

Flags with carried forward coverage won't be shown. Click here to find out more.
see 141 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml
Comment thread .github/workflows/ai-review.yml
Comment thread .github/workflows/ai-review.yml
Comment thread .github/workflows/ai-review.yml
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated
Comment thread .github/workflows/ai-review.yml Outdated

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 74dd752. Configure here.

Comment thread .github/workflows/ecr.yml
@@ -1,5 +1,7 @@
name: ECR

BREAK AND SEE IF AI CATCHES THIS!!!!

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidental debug line in workflow

High Severity

A non-comment line BREAK AND SEE IF AI CATCHES THIS!!!! was added at the top level of .github/workflows/ecr.yml. That is not valid workflow syntax, so GitHub is likely to reject or fail to load the ECR workflow, blocking pushes to main and release/** that depend on it.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 74dd752. Configure here.

Comment on lines +91 to +94
const at = r => new Date(r.started_at ?? r.updated_at ?? r.created_at ?? 0)
const latestFor = key =>
allRuns.filter(r => r.name.toLowerCase().includes(key)).sort((a, b) => at(b) - at(a))[0]
const runs = REVIEWERS.map(r => latestFor(r.key))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The fallback chain r.started_at ?? r.updated_at ?? r.created_at ?? 0 at line 91 references fields that don't exist on the GitHub check_run REST schema — only started_at and completed_at are top-level on check_run; created_at/updated_at live on check_suite — so the chain effectively collapses to r.started_at ?? 0. When a maintainer clicks Re-run, GitHub creates a new check_run with status: queued and started_at: null, so at(newRun) = new Date(null) = epoch and the older completed run wins the descending sort; latestFor returns the stale run with its prior conclusion for the entire rerun window. Combined with the merge_group auto-pass at lines 81-84, a maintainer-triggered rerun + queue-for-merge can land the PR with a stale success before the rerun verdict arrives. Fix: prefer non-completed runs when one exists for the same reviewer, or use check_run.id (monotonically increasing per repo) as the tiebreaker.

Extended reasoning...

What the bug is

Line 91 defines the sort key for picking the latest check_run per reviewer:

const at = r => new Date(r.started_at ?? r.updated_at ?? r.created_at ?? 0)

The fallback chain was added in response to inline comment 3372143715 (and is the same shape the author proposed for the rerun-staleness problem cursor[bot] flagged in 3363611828). But per the GitHub Checks Runs REST API schema, a check_run object only has these top-level fields: id, head_sha, node_id, external_id, url, html_url, details_url, status, conclusion, started_at, completed_at, output, name, check_suite, app, pull_requests, deployment. There is no top-level created_at or updated_at — those fields exist on the nested check_suite resource. So r.updated_at and r.created_at evaluate to undefined for every run and the chain reduces to r.started_at ?? 0.

Why this matters during reruns

When a maintainer clicks Re-run on Cursor Bugbot or Claude Code Review, GitHub creates a fresh check_run with status: queued and started_at: null (per the spec, started_at is nullable on check_run; cursor[bot] confirmed this exact behavior in inline 3363611828: "a freshly rerequested run is often queued with a null started_at"). The prior completed run is still returned by listForRef. Now the sort runs:

  • at(newQueuedRerun) = new Date(null).getTime() = 0 (epoch)
  • at(oldCompletedRun) = valid started_at timestamp (≫ 0)
  • Descending sort: old completed run wins

latestFor returns the OLD run. Line 102 sees run.status === 'completed' (the old one), and the script falls through to whichever verdict branch matches the OLD run's conclusion. The newly-queued rerun is invisible to the script during its entire queued+in_progress window, and the GitHub webhook does not emit a check_run event for the queued→in_progress transition, so the workflow has no other opportunity to re-evaluate until the rerun completes.

Why existing code doesn't prevent it

The author clearly intended the fallback chain to handle this case — it was added specifically in response to the queued-started_at-is-null problem cursor[bot] flagged. The fix landed but doesn't work because the chosen fallback fields are not in the check_run schema. This is a near-miss, not a missed-case: the structural intent is correct, the field names are wrong.

Impact

Once AI Review is wired into branch protection as the consolidated required gate (the explicit goal of this PR), the rerun-window staleness has two failure modes:

  • Stale failure / action_required → gate stays red. Annoying but recoverable: the rerun completes, fires check_run.completed, the workflow re-evaluates and posts the correct verdict.
  • Stale success / neutral → gate stays green during rerun. This is the dangerous one. Lines 81-84 auto-pass on merge_group events without consulting reviewer state at all. So a maintainer who triggered a rerun and then immediately queues the PR for merge can have the merge queue fire, see AI Review = success from the stale verdict, auto-pass merge_group's own evaluation, and land the PR before the rerun's verdict arrives. If the rerun would have surfaced a regression that the prior run missed, it merges into main with the gate showing green throughout.

Step-by-step proof (stale success → unsafe merge)

  1. t=0 — Both reviewers complete success on commit X. The script writes AI Review = success.
  2. t=5m — Maintainer clicks Re-run on Cursor Bugbot via the Checks UI. GitHub creates check_run #N+1 named Cursor Bugbot on head_sha = X with status: queued, started_at: null, completed_at: null. check_run.created fires.
  3. t=5m+ε — The job runs (the [created, rerequested] subscription wakes it). listForRef returns both the old completed Bugbot run AND the new queued one. The script computes at(new) = new Date(null) = epoch and at(old) = valid started_at > 0. Descending sort: old wins. latestFor('bugbot') returns the OLD completed run.
  4. t=5m+ε — Line 102: run.status === 'completed' ✓. Line 104: REJECTING.includes('success') false. Line 106: PASSING.includes('success') ✓ — script treats it as passing. With both reviewers passing per the OLD state, line 123 runs and posts AI Review = success (or, if the pending-overwrite branch is hit instead, it remains at the prior success). Either way: gate is green.
  5. t=6m — Maintainer queues PR for merge. merge_group fires. Lines 81-84 auto-pass with setStatus("success", "Merge queue — auto-pass").
  6. t=6m+ε — Merge proceeds. Bugbot rerun has not yet completed.
  7. t=8m — Bugbot rerun completes with failure (caught a regression the prior run missed). Too late — already in main.

The runtime semantics of new Date(null).getTime() === 0 are unambiguous JS: new Date(null) equals Thu Jan 01 1970 00:00:00 GMT+0000, and Math.max(0, anyPositiveTimestamp) always picks the positive value. The trace above follows directly.

How to fix it

Several viable options, in roughly increasing order of robustness:

  1. Prefer non-completed runs when both exist for a reviewer — most directly captures the intent ("a rerun is in flight, ignore the prior verdict"):
const latestFor = key => {
  const matches = allRuns.filter(r => r.name.toLowerCase().includes(key))
  const inFlight = matches.find(r => r.status !== 'completed')
  if (inFlight) return inFlight
  return matches.sort((a, b) => new Date(b.started_at ?? 0) - new Date(a.started_at ?? 0))[0]
}
  1. Sort by check_run.id as a stable tiebreaker — check_run ids are monotonically increasing per repo, so the newest run always wins regardless of started_at nullity:
const at = r => [new Date(r.started_at ?? 0).getTime(), r.id]
const latestFor = key => allRuns
  .filter(r => r.name.toLowerCase().includes(key))
  .sort((a, b) => { const [ta, ia] = at(a), [tb, ib] = at(b); return tb - ta || ib - ia })[0]
  1. Replace the broken fallback with completed_at (which IS on the schema) — narrower fix but only useful as a tiebreaker because queued runs have completed_at: null too.

Any of these unsticks the rerun-staleness case the original fallback chain was meant to address.

Comment thread .github/workflows/ecr.yml
@@ -1,5 +1,7 @@
name: ECR

BREAK AND SEE IF AI CATCHES THIS!!!!

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The PR adds the literal line BREAK AND SEE IF AI CATCHES THIS!!!! at .github/workflows/ecr.yml:3 between the name: and on: keys. This is a bare scalar at the document root mixed with mapping entries, so it is not valid YAML and GitHub Actions will mark ecr.yml as 'Invalid workflow file' — no sei-chain container images (including mock- and mock_block_validation- variants) will be built or pushed to ECR on push to main/release/** or via workflow_dispatch after merge. Remove lines 2–3 before merging.

Extended reasoning...

What the bug is

.github/workflows/ecr.yml line 3 contains the bare unquoted text BREAK AND SEE IF AI CATCHES THIS!!!! (no leading #), sandwiched between the mapping keys name: ECR on line 1 and on: on line 5. The document root is a YAML mapping (its keys are name, on, jobs), so a non-mapping scalar at the same indentation level is a syntax error — a YAML document cannot mix a bare scalar with mapping entries in the same context.

How it manifests

Running a YAML parser against the file fails:

$ python3 -c "import yaml; yaml.safe_load(open(".github/workflows/ecr.yml"))"
yaml.scanner.ScannerError: while scanning a simple key
  ...
could not find expected ":" in .github/workflows/ecr.yml, line 3, column 1

GitHub Actions uses the same YAML semantics at workflow load time, so when this lands on main/release/**, the Actions tab will show Invalid workflow file for ecr.yml and the workflow will refuse to dispatch.

Why existing code does not prevent it

The line is a top-level addition with no surrounding guard. The workflow is otherwise correct, but a single malformed line at the document root is enough for GitHub to reject the entire file — there is no partial parsing or "skip bad lines" fallback.

Impact

.github/workflows/ecr.yml is the workflow that produces and publishes the sei-chain Docker images to ECR. It triggers on:

  • push to main and release/** branches
  • workflow_dispatch

For each push, it builds and pushes three image tags to {ecr-registry}/sei/sei-chain:

  • <sha> (production)
  • mock-<sha> (mock balances build)
  • mock_block_validation-<sha> (mock block-validation build)

Once this PR merges, every push to main/release/** will silently fail to dispatch ecr.yml. No new container images will be built or pushed. Anything downstream that pulls {ecr-registry}/sei/sei-chain:<sha> for a post-merge commit will not find the image, and deployment/CI flows depending on those tags break until a follow-up PR removes the line.

Why this happened and how to fix

The latest commit on this PR is 74dd752 Test neagtive case with AI review. The commit message is explicit: this line was inserted intentionally to test whether the new consolidated AI Review gate this PR introduces catches such regressions. It must not land. cursor[bot] independently flagged the same line on the same commit (inline 3373932492, "Accidental debug line in workflow", High Severity).

Fix: delete lines 2–3 of .github/workflows/ecr.yml (the blank line and the BREAK line) so the file goes back to:

name: ECR

on:
  push:
    ...

Step-by-step proof

  1. GitHub event: a commit lands on main after this PR merges.
  2. GitHub Actions loads .github/workflows/ecr.yml.
  3. The YAML parser reaches line 3 (BREAK AND SEE IF AI CATCHES THIS!!!!).
  4. Line 1 (name: ECR) established the document root as a mapping. Line 3 begins at column 1 with a non-mapping token (no : follows on the same logical line; the next token is on: on line 5 after a blank line).
  5. Parser raises ScannerError: could not find expected ":" in .github/workflows/ecr.yml, line 3, column 1.
  6. GitHub Actions surfaces "Invalid workflow file" and refuses to dispatch ecr.yml for this commit.
  7. None of the three image tags (sei-chain:<sha>, sei-chain:mock-<sha>, sei-chain:mock_block_validation-<sha>) is built or pushed to ECR.
  8. The same outcome repeats for every subsequent push to main/release/** and every workflow_dispatch invocation, until a follow-up PR removes the line.

🔬 also observed by cursor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant