Skip to content

feat(release): warn in Slack about PRs in a release that lack QA verdict#35762

Open
nollymar wants to merge 3 commits into
mainfrom
feat/release-slack-qa-warning
Open

feat(release): warn in Slack about PRs in a release that lack QA verdict#35762
nollymar wants to merge 3 commits into
mainfrom
feat/release-slack-qa-warning

Conversation

@nollymar
Copy link
Copy Markdown
Member

@nollymar nollymar commented May 19, 2026

Summary

Adds a QA-coverage warning to the Slack notification we send when a release is generated. For every PR in the release, the new tool inspects its closing-issue references and reports whether the linked issue carries a recognized QA label.

Driven by the question: "which changes shipped in this release haven't been tested?"

How it works

New standalone tool at .github/scripts/release-qa-status/ (TypeScript, mirrors the shape of gather-release-data so the two can be consolidated later).

For each PR between the previous and current release tag:

  • excluded — bot/dependency-bump/release-machinery → skipped
  • otherwise look up closing-issue references via GitHub's GraphQL closingIssuesReferences (catches body keywords and Development-panel links) and bucket:
    • failed — any linked issue carries QA : Failed
    • missing — linked but no recognized QA label
    • unlinked — no closing-issue reference at all
    • external — only cross-repo closing refs (e.g. dotCMS/private-issues#N) — cannot verify QA from here
    • passed — every linked issue is QA : Passed or QA : Not Needed

The release workflow (.github/workflows/cicd_6-release.yml) Report job now installs and runs the tool with --format slack, capturing the output into SLACK_MESSAGE. The step is continue-on-error: true — any failure leaves the announcement intact without a QA section, never blocking a release.

Slack message (when something is flagged)

Healthy releases keep the current short message. Otherwise the announcement gains a section like:

:warning: *QA Coverage* — 6 PRs need review
  failed QA: 0  |  missing QA: 5  |  Orphan PRs: 1  |  Not in the core repo: 0

:warning: *Missing QA (5)*
  Linked issue exists but has no `QA : Passed` / `QA : Not Needed` / `QA : Failed`.
  • <pr-url|#35463> feat(maintenance): thread dump endpoints — @hassandotcms → <issue-url|#35205>
  …

:grey_question: *Orphan PRs (1)*
  PR has no closing-issue reference — QA cannot be verified.
  • <pr-url|#35609> feat(opensearch): SearchAPI router — @fabrizzio-dotCMS

Validation

Ran the tool locally against recent releases:

Release failed missing unlinked external passed excluded
v26.05.19-01 0 5 1 0 0 0
v26.05.18-01 0 5 0 1 4 0
v26.05.11-01 0 1 1 0 1 0

Spot-checked every verdict against gh issue view — labels and statuses line up.

Dry-ran the new bash step locally with the same logic the workflow will execute; the multi-line GITHUB_OUTPUT capture round-trips correctly into the Slack action's env var.

Follow-up

Filed #35763 to migrate gather-release-data to the same GraphQL approach so release-notes bullets stop missing their issue cross-links. Scoped separately to keep this PR focused.

Test plan

  • Run workflow_dispatch on cicd_6-release.yml with notify_slack=false against a recent release tag — verifies the new steps are correctly gated.
  • Run workflow_dispatch with notify_slack=true against a release whose previous tag is auto-resolvable — verify the QA section renders in #release Slack channel.
  • Simulate a failure (e.g. invalid token) and confirm the announcement still goes out without a QA section.

🤖 Generated with Claude Code

Add a standalone TypeScript tool at .github/scripts/release-qa-status that,
given a release tag, inspects every PR in the range and reports a QA verdict
per PR by reading the labels of its closing-issue references (sourced from
GitHub's GraphQL closingIssuesReferences — covers body keywords AND the
Development panel).

Buckets:
  - failed:   any linked issue carries `QA : Failed`
  - missing:  linked but no recognized QA label
  - unlinked: no closing-issue reference at all
  - external: only cross-repo closing refs (cannot verify QA from here)
  - passed:   every linked issue is `QA : Passed` or `QA : Not Needed`
  - excluded: bots / dependency bumps / release machinery

Wire the tool into the release pipeline's Report step
(cicd_6-release.yml). The QA section is appended to the existing Slack
release announcement only when there is something to flag, and the step is
non-blocking — any failure leaves the announcement intact without a QA
section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 19, 2026

Claude finished @nollymar's task in 3m 33s —— View job


Review

A few things worth a second look.

Bugs / correctness

1. Stale "more than 50" warning text — src/github.ts:214
Page size was bumped to first: 100 in round 2 but the warning string still reads "more than 50". Either harmless cosmetic noise or genuine confusion when someone hits it. Derive from the constant or update the literal.

2. Closing-issue refs are not paginated — src/github.ts:212-216
If a PR has >100 same-repo closing references, the overflow is silently dropped and only a warning is printed. The aggregation in qa.ts:34-36 could then flip a real passed PR to missing if the only QA : Passed issue happens to live in the dropped tail. Extremely unlikely in practice (no real PR closes 100+ issues), but the silent-correctness-loss footprint is worth knowing exists.

3. Commit pagination + per_page: 250 is a latent bug — src/github.ts:77,96
GitHub's compare endpoint caps per_page at 100. With perPage = 250:

if (response.data.commits.length < perPage) break;

fires after the first page (because 100 < 250), even when totalCommits > 100. The outer while (commits.length < totalCommits) would otherwise iterate correctly — the inner break is what truncates. Releases >100 commits would silently lose tail commits → underreported PR set.

This mirrors gather-release-data/src/github.ts:98,117 (same exact pattern), so it's consistent with prior art and likely never bites because dotCMS releases stay small. Worth fixing both, or at least adding a stderr warning when totalCommits > 100.

Minor

4. escapeSlack ignores |src/format.ts:195-201
Slack uses | as the link-text separator inside <url|text> and the renderer emits exactly that form. A PR title containing | (e.g. feat: parser handles a|b syntax) would land inside a <...|...> and could distort the rendered link. Probably rare; cheap to fix (.replace(/\|/g, '/') or similar).

5. RELEASE_MACHINERY_PATTERNS includes ^merge branch / ^merge pull requestsrc/exclusions.ts:38-39
These only match merge-commit titles, but the PR-extraction regex \(#(\d+)\)\s*$ already only matches squash-merge subjects. So these patterns are dead in practice (they'd only fire if someone wrote a feature PR titled "Merge branch foo"). Not harmful, just inert.

6. Asymmetric error handling between batches — src/github.ts
fetchClosingIssueRefs re-throws on GraphQL error (kills the run, drops QA section — intentional per the comment). fetchIssueInfos uses allSettled (resilient). fetchPRDetails catches per-PR and continues. The chosen behavior is defensible (the comment explains the closing-refs case), but the inconsistency is non-obvious to future readers — worth a one-liner in fetchPRDetails / fetchIssueInfos noting why those two use allSettled while closing-refs doesn't.

Looks clean

  • Workflow continue-on-error chain (Setup Node → npm ci → compute → Slack step degrades to no-QA-section).
  • LTS / CLI / non-v tag gating on every QA step.
  • Heredoc round-trip for empty qa_warning (the if [ -n "${output}" ] skip avoids the trailing blank line correctly).
  • safeBatch integer filter at the GraphQL alias trust boundary.
  • tags[0] fallback when toTag isn't yet indexed.
  • Aggregate rules in qa.ts (failed-wins, ignored notFound/PR refs, same-repo wins over external).
  • Test coverage on the parts most likely to break (bot detection, label aggregation, Slack escaping, header escalation).

Failure isolation: every QA-coverage step in the Report job is now
continue-on-error: true so a transient `actions/setup-node` or `npm ci`
failure cannot block the Slack release announcement.

LTS / CLI skip: the QA steps now bail early on non-standard tags
(`_lts_`, `dotcms-cli-`, anything not v-prefixed), matching the existing
AI release-notes phase.

GraphQL failure handling: closingIssuesReferences batch errors are
re-thrown instead of swallowed — silently returning empty refs caused
every PR in the failing batch to be misclassified as `unlinked` and
spammed the Slack channel with bogus orphan-PR rows. The outer
continue-on-error still drops the QA section if anything throws.

Bot detection: tightened login matching from .includes() to
.startsWith() on a known-prefix list, so accounts like
`fan-of-github-actions` are no longer treated as bots.

Slack message polish: header emoji escalates to 🚨 when
any PR failed QA; titles longer than 140 chars end with an ellipsis;
the workflow no longer emits a trailing blank line on healthy releases
(empty `qa_warning` round-trips cleanly through the quoted SLACK_MESSAGE).

Text output: excluded PRs now render with their reason for easier
debugging.

Documentation: added comments calling out the squash-merge assumption
on `extractPRNumbers` and the GraphQL alias safety constraint.

Tests: added jest + ts-jest with unit coverage for `classifyExclusion`,
`computePRQA`, and `renderSlack` (26 cases). The pure aggregation rules,
bot exclusion edge cases, and Slack rendering are now verified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nollymar
Copy link
Copy Markdown
Member Author

Thanks for the thorough review. Addressed in 4750742:

Fixed

  • update with latest SVN #1 Failure isolation — every QA step (Setup Node, npm ci, compute) now has continue-on-error: true, so a transient npm-registry hiccup can no longer block the announcement.
  • Test Branch and Commit #2 GraphQL batch swallowing — re-throws now instead of returning empty refs; the outer continue-on-error still drops the QA section cleanly. Prevents the "bogus orphan-PR Slack spam" failure mode.
  • Add .gitignore #3 LTS / CLI tags — the QA steps now skip non-standard tags (_lts_, dotcms-cli-, non-v-prefixed), matching the existing AI release-notes phase. No more silent exit-1.
  • New Issue #4 Squash-merge assumption — added a comment on extractPRNumbers noting the dependency on dotCMS's squash-merge policy.
  • wezell's issue #5 Tests — added jest + ts-jest with 26 unit cases covering classifyExclusion, computePRQA, and renderSlack (aggregation rules, bot edge cases, Slack escaping/truncation, emoji escalation).
  • test rwqrwqrwqr wq #6 Bot-login detection — tightened from .includes() to .startsWith() on a known-prefix list. fan-of-github-actions is no longer treated as a bot. Added a test that locks this in.
  • testing assign #7 🚨 when failed > 0 — header emoji escalates so an actual QA failure pops in the channel.
  • this is added #8 Excluded reason in text outputrenderText now lists the excluded bucket with the bot-author / dependency-bump / release-machinery reason.
  • this is added #9 Trailing blank line — bash now skips printf when output is empty, and SLACK_MESSAGE reverted to the original single-line quoted form so empty qa_warning round-trips cleanly.
  • Create SiteSearch / Publishing Framework #10 Ellipsis on truncated titles appended when title > 140 chars.
  • Upgrade CMIS LIbrary and Implement CMIS Client #11 GraphQL alias safety — added a comment confirming PR numbers come from a strict integer regex (extractPRNumbers), so the alias/argument interpolation is safe by construction. GraphQL aliases can't be parameterized.

Tests all green (26/26).

Tag-list race (#1): when the just-cut release tag isn't yet visible in
the GitHub releases API (eventual consistency), fall back to the newest
indexed tag as the predecessor instead of exiting 1 silently. The
listReleases response is sorted newest-first, so tags[0] is the previous
standard release. Also added a guard for an empty tag list.

allSettled on issue fetch (#3): switched fetchIssueInfos from
Promise.all to Promise.allSettled so a single transient 5xx on an
issue lookup can no longer drop the entire QA section. Failed entries
are treated the same as notFound — qa.ts already ignores both.

Closing-refs page size (#2): bumped first: 50 to first: 100 in
closingIssuesReferences. PRs with >100 closing refs still emit the
hasNextPage stderr warning but the cap is now generous enough that no
real PR will hit it.

Squash-merge silent failure (#4): if extractPRNumbers returns zero for
a non-empty commit list, emit an explicit stderr warning naming the
likely cause (merge strategy change on `main`). Keeps the failure mode
visible if dotCMS ever moves off squash-merge.

GraphQL alias safety (#5): added a Number.isInteger guard in the alias
builder. Type-system-wise nothing stopped a future caller from passing
NaN or a negative integer; the guard enforces the trust boundary at
fetchClosingIssueRefs rather than relying on the comment.

Backtick escape in Slack output (#6): `escapeSlack` now replaces
backticks with apostrophes so PR titles like "bump deps to `v1.2.3`"
don't render the version as monospace in Slack.

Excluded blurb (#9): renderText excluded-section blurb now lists all
four exclusion reasons (bot / dependency-bump / version-bump /
release-machinery) so it matches what classifyExclusion actually
returns.

Tests: added cases for same-repo missing winning over coexistent
externalRefs (#10) and for backtick replacement in Slack titles. 28/28
pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nollymar
Copy link
Copy Markdown
Member Author

Second round addressed in 37c1f69:

  • update with latest SVN #1 tag-list race — if toTag isn't yet in the releases API response, fall back to tags[0] (newest indexed) as the predecessor. listReleases returns newest-first, so this preserves the QA section's signal on the first release post-merge instead of silently dropping it.
  • Add .gitignore #3 Promise.allSettled in fetchIssueInfos — one transient 5xx on issues.get can no longer take down the whole batch. Failed entries are treated like notFound; qa.ts already ignores both when aggregating.
  • Test Branch and Commit #2 page size — bumped first: 50first: 100. The hasNextPage warning still fires if a PR ever exceeds that.
  • New Issue #4 squash-merge regression detection — explicit stderr warning when commits.length > 0 && prNumbers.length === 0, naming the likely cause (merge-strategy change). The QA section silently vanishing would now have a logged smoking gun.
  • wezell's issue #5 Number.isInteger guard — added in the alias builder so the trust boundary is enforced at fetchClosingIssueRefs rather than living in a comment.
  • test rwqrwqrwqr wq #6 backtick escapeescapeSlack now replaces backticks with apostrophes; PR titles like bump deps to \v1.2.3`` no longer render as Slack monospace.
  • this is added #9 blurb alignmentrenderText excluded blurb now lists all four reasons (bot / dependency-bump / version-bump / release-machinery).
  • Create SiteSearch / Publishing Framework #10 missing-vs-external coexistence — added a test asserting that same-repo missing wins over coexistent externalRefs, with the external ref still surfaced for context.

Not addressed:

  • testing assign #7 empty-result race — review confirmed clean; no change needed.
  • this is added #8 wasted API calls for excluded PRs — flagged as Minor / Harmless. Skipping the GraphQL lookup for excluded PRs would require moving classifyExclusion into the fetch path, which couples modules. Acknowledging the trade-off.

Tests: 28/28 pass.

@nollymar nollymar added this pull request to the merge queue May 20, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to no response for status checks May 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : CI/CD PR changes GitHub Actions/workflows

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants