Consolidate AI review CI quality gate#3550
Conversation
Consolidate the AI reviewers into a single required CI job with workaround for merge queue.
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
PR SummaryHigh Risk Overview Also changes Reviewed by Cursor Bugbot for commit 74dd752. Bugbot is set up for automated code reviews on this repo. Configure here. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c19330e836
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| with: | ||
| script: | | ||
| const sha = context.payload.pull_request.head.sha | ||
| const deadline = Date.now() + 8 * 60 * 1000 |
There was a problem hiding this comment.
Allow enough time for Claude reviews
Claude Code's Code Review docs state that reviews complete in 20 minutes on average, so this 8-minute deadline (with a 10-minute job timeout) will make normal PRs fail this required gate while Claude Code Review is still pending. In repositories where branch protection requires this consolidated check, those PRs remain blocked until a manual rerun or new push even though the review would later complete successfully.
Useful? React with 👍 / 👎.
| name: AI Review | ||
|
|
||
| on: | ||
| pull_request: |
There was a problem hiding this comment.
Trigger the gate when draft PRs become ready
With a bare pull_request trigger, GitHub only runs this workflow for the default opened/synchronize/reopened activity types, not ready_for_review. For PRs opened as drafts, Claude's automatic review does not run until the PR is marked ready, so this gate can time out on the draft and publishing it won't start a new gate run; the required check stays failed until a new push or manual rerun.
Useful? React with 👍 / 👎.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3550 +/- ##
==========================================
- Coverage 59.18% 58.27% -0.92%
==========================================
Files 2222 2144 -78
Lines 183555 174144 -9411
==========================================
- Hits 108639 101475 -7164
+ Misses 65132 63645 -1487
+ Partials 9784 9024 -760
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 74dd752. Configure here.
| @@ -1,5 +1,7 @@ | |||
| name: ECR | |||
|
|
|||
| BREAK AND SEE IF AI CATCHES THIS!!!! | |||
There was a problem hiding this comment.
Accidental debug line in workflow
High Severity
A non-comment line BREAK AND SEE IF AI CATCHES THIS!!!! was added at the top level of .github/workflows/ecr.yml. That is not valid workflow syntax, so GitHub is likely to reject or fail to load the ECR workflow, blocking pushes to main and release/** that depend on it.
Reviewed by Cursor Bugbot for commit 74dd752. Configure here.
| const at = r => new Date(r.started_at ?? r.updated_at ?? r.created_at ?? 0) | ||
| const latestFor = key => | ||
| allRuns.filter(r => r.name.toLowerCase().includes(key)).sort((a, b) => at(b) - at(a))[0] | ||
| const runs = REVIEWERS.map(r => latestFor(r.key)) |
There was a problem hiding this comment.
🔴 The fallback chain r.started_at ?? r.updated_at ?? r.created_at ?? 0 at line 91 references fields that don't exist on the GitHub check_run REST schema — only started_at and completed_at are top-level on check_run; created_at/updated_at live on check_suite — so the chain effectively collapses to r.started_at ?? 0. When a maintainer clicks Re-run, GitHub creates a new check_run with status: queued and started_at: null, so at(newRun) = new Date(null) = epoch and the older completed run wins the descending sort; latestFor returns the stale run with its prior conclusion for the entire rerun window. Combined with the merge_group auto-pass at lines 81-84, a maintainer-triggered rerun + queue-for-merge can land the PR with a stale success before the rerun verdict arrives. Fix: prefer non-completed runs when one exists for the same reviewer, or use check_run.id (monotonically increasing per repo) as the tiebreaker.
Extended reasoning...
What the bug is
Line 91 defines the sort key for picking the latest check_run per reviewer:
const at = r => new Date(r.started_at ?? r.updated_at ?? r.created_at ?? 0)The fallback chain was added in response to inline comment 3372143715 (and is the same shape the author proposed for the rerun-staleness problem cursor[bot] flagged in 3363611828). But per the GitHub Checks Runs REST API schema, a check_run object only has these top-level fields: id, head_sha, node_id, external_id, url, html_url, details_url, status, conclusion, started_at, completed_at, output, name, check_suite, app, pull_requests, deployment. There is no top-level created_at or updated_at — those fields exist on the nested check_suite resource. So r.updated_at and r.created_at evaluate to undefined for every run and the chain reduces to r.started_at ?? 0.
Why this matters during reruns
When a maintainer clicks Re-run on Cursor Bugbot or Claude Code Review, GitHub creates a fresh check_run with status: queued and started_at: null (per the spec, started_at is nullable on check_run; cursor[bot] confirmed this exact behavior in inline 3363611828: "a freshly rerequested run is often queued with a null started_at"). The prior completed run is still returned by listForRef. Now the sort runs:
at(newQueuedRerun) = new Date(null).getTime() = 0(epoch)at(oldCompletedRun) = valid started_at timestamp(≫ 0)- Descending sort: old completed run wins
latestFor returns the OLD run. Line 102 sees run.status === 'completed' (the old one), and the script falls through to whichever verdict branch matches the OLD run's conclusion. The newly-queued rerun is invisible to the script during its entire queued+in_progress window, and the GitHub webhook does not emit a check_run event for the queued→in_progress transition, so the workflow has no other opportunity to re-evaluate until the rerun completes.
Why existing code doesn't prevent it
The author clearly intended the fallback chain to handle this case — it was added specifically in response to the queued-started_at-is-null problem cursor[bot] flagged. The fix landed but doesn't work because the chosen fallback fields are not in the check_run schema. This is a near-miss, not a missed-case: the structural intent is correct, the field names are wrong.
Impact
Once AI Review is wired into branch protection as the consolidated required gate (the explicit goal of this PR), the rerun-window staleness has two failure modes:
- Stale
failure/action_required→ gate stays red. Annoying but recoverable: the rerun completes, firescheck_run.completed, the workflow re-evaluates and posts the correct verdict. - Stale
success/neutral→ gate stays green during rerun. This is the dangerous one. Lines 81-84 auto-pass onmerge_groupevents without consulting reviewer state at all. So a maintainer who triggered a rerun and then immediately queues the PR for merge can have the merge queue fire, seeAI Review = successfrom the stale verdict, auto-pass merge_group's own evaluation, and land the PR before the rerun's verdict arrives. If the rerun would have surfaced a regression that the prior run missed, it merges into main with the gate showing green throughout.
Step-by-step proof (stale success → unsafe merge)
- t=0 — Both reviewers complete
successon commit X. The script writesAI Review = success. - t=5m — Maintainer clicks Re-run on Cursor Bugbot via the Checks UI. GitHub creates check_run #N+1 named
Cursor Bugbotonhead_sha = Xwithstatus: queued,started_at: null,completed_at: null.check_run.createdfires. - t=5m+ε — The job runs (the
[created, rerequested]subscription wakes it).listForRefreturns both the old completed Bugbot run AND the new queued one. The script computesat(new) = new Date(null) = epochandat(old) = valid started_at > 0. Descending sort: old wins.latestFor('bugbot')returns the OLD completed run. - t=5m+ε — Line 102:
run.status === 'completed'✓. Line 104:REJECTING.includes('success')false. Line 106:PASSING.includes('success')✓ — script treats it as passing. With both reviewers passing per the OLD state, line 123 runs and postsAI Review = success(or, if the pending-overwrite branch is hit instead, it remains at the priorsuccess). Either way: gate is green. - t=6m — Maintainer queues PR for merge.
merge_groupfires. Lines 81-84 auto-pass withsetStatus("success", "Merge queue — auto-pass"). - t=6m+ε — Merge proceeds. Bugbot rerun has not yet completed.
- t=8m — Bugbot rerun completes with
failure(caught a regression the prior run missed). Too late — already in main.
The runtime semantics of new Date(null).getTime() === 0 are unambiguous JS: new Date(null) equals Thu Jan 01 1970 00:00:00 GMT+0000, and Math.max(0, anyPositiveTimestamp) always picks the positive value. The trace above follows directly.
How to fix it
Several viable options, in roughly increasing order of robustness:
- Prefer non-completed runs when both exist for a reviewer — most directly captures the intent ("a rerun is in flight, ignore the prior verdict"):
const latestFor = key => {
const matches = allRuns.filter(r => r.name.toLowerCase().includes(key))
const inFlight = matches.find(r => r.status !== 'completed')
if (inFlight) return inFlight
return matches.sort((a, b) => new Date(b.started_at ?? 0) - new Date(a.started_at ?? 0))[0]
}- Sort by
check_run.idas a stable tiebreaker — check_run ids are monotonically increasing per repo, so the newest run always wins regardless of started_at nullity:
const at = r => [new Date(r.started_at ?? 0).getTime(), r.id]
const latestFor = key => allRuns
.filter(r => r.name.toLowerCase().includes(key))
.sort((a, b) => { const [ta, ia] = at(a), [tb, ib] = at(b); return tb - ta || ib - ia })[0]- Replace the broken fallback with
completed_at(which IS on the schema) — narrower fix but only useful as a tiebreaker because queued runs havecompleted_at: nulltoo.
Any of these unsticks the rerun-staleness case the original fallback chain was meant to address.
| @@ -1,5 +1,7 @@ | |||
| name: ECR | |||
|
|
|||
| BREAK AND SEE IF AI CATCHES THIS!!!! | |||
There was a problem hiding this comment.
🔴 The PR adds the literal line BREAK AND SEE IF AI CATCHES THIS!!!! at .github/workflows/ecr.yml:3 between the name: and on: keys. This is a bare scalar at the document root mixed with mapping entries, so it is not valid YAML and GitHub Actions will mark ecr.yml as 'Invalid workflow file' — no sei-chain container images (including mock- and mock_block_validation- variants) will be built or pushed to ECR on push to main/release/** or via workflow_dispatch after merge. Remove lines 2–3 before merging.
Extended reasoning...
What the bug is
.github/workflows/ecr.yml line 3 contains the bare unquoted text BREAK AND SEE IF AI CATCHES THIS!!!! (no leading #), sandwiched between the mapping keys name: ECR on line 1 and on: on line 5. The document root is a YAML mapping (its keys are name, on, jobs), so a non-mapping scalar at the same indentation level is a syntax error — a YAML document cannot mix a bare scalar with mapping entries in the same context.
How it manifests
Running a YAML parser against the file fails:
$ python3 -c "import yaml; yaml.safe_load(open(".github/workflows/ecr.yml"))"
yaml.scanner.ScannerError: while scanning a simple key
...
could not find expected ":" in .github/workflows/ecr.yml, line 3, column 1
GitHub Actions uses the same YAML semantics at workflow load time, so when this lands on main/release/**, the Actions tab will show Invalid workflow file for ecr.yml and the workflow will refuse to dispatch.
Why existing code does not prevent it
The line is a top-level addition with no surrounding guard. The workflow is otherwise correct, but a single malformed line at the document root is enough for GitHub to reject the entire file — there is no partial parsing or "skip bad lines" fallback.
Impact
.github/workflows/ecr.yml is the workflow that produces and publishes the sei-chain Docker images to ECR. It triggers on:
pushtomainandrelease/**branchesworkflow_dispatch
For each push, it builds and pushes three image tags to {ecr-registry}/sei/sei-chain:
<sha>(production)mock-<sha>(mock balances build)mock_block_validation-<sha>(mock block-validation build)
Once this PR merges, every push to main/release/** will silently fail to dispatch ecr.yml. No new container images will be built or pushed. Anything downstream that pulls {ecr-registry}/sei/sei-chain:<sha> for a post-merge commit will not find the image, and deployment/CI flows depending on those tags break until a follow-up PR removes the line.
Why this happened and how to fix
The latest commit on this PR is 74dd752 Test neagtive case with AI review. The commit message is explicit: this line was inserted intentionally to test whether the new consolidated AI Review gate this PR introduces catches such regressions. It must not land. cursor[bot] independently flagged the same line on the same commit (inline 3373932492, "Accidental debug line in workflow", High Severity).
Fix: delete lines 2–3 of .github/workflows/ecr.yml (the blank line and the BREAK line) so the file goes back to:
name: ECR
on:
push:
...Step-by-step proof
- GitHub event: a commit lands on
mainafter this PR merges. - GitHub Actions loads
.github/workflows/ecr.yml. - The YAML parser reaches line 3 (
BREAK AND SEE IF AI CATCHES THIS!!!!). - Line 1 (
name: ECR) established the document root as a mapping. Line 3 begins at column 1 with a non-mapping token (no:follows on the same logical line; the next token ison:on line 5 after a blank line). - Parser raises
ScannerError: could not find expected ":" in .github/workflows/ecr.yml, line 3, column 1. - GitHub Actions surfaces "Invalid workflow file" and refuses to dispatch
ecr.ymlfor this commit. - None of the three image tags (
sei-chain:<sha>,sei-chain:mock-<sha>,sei-chain:mock_block_validation-<sha>) is built or pushed to ECR. - The same outcome repeats for every subsequent push to
main/release/**and everyworkflow_dispatchinvocation, until a follow-up PR removes the line.
🔬 also observed by cursor


Consolidate the AI reviewers into a single required CI job with workaround for merge queue.