From 9595d2360cdf6f14a788f429d808e721a65ad0aa Mon Sep 17 00:00:00 2001 From: Anthony Volk Date: Sat, 18 Apr 2026 03:01:35 +0200 Subject: [PATCH] Add Codex PR review guidance docs --- .github/review/global.md | 21 +++++++++++ .github/review/segments/general.md | 11 ++++++ .../segments/priority_and_confidence.md | 36 +++++++++++++++++++ .github/review/segments/staged_prs.md | 17 +++++++++ .github/review/segments/testing.md | 12 +++++++ AGENTS.md | 32 +++++++++++++++++ changelog.d/796.added.md | 1 + 7 files changed, 130 insertions(+) create mode 100644 .github/review/global.md create mode 100644 .github/review/segments/general.md create mode 100644 .github/review/segments/priority_and_confidence.md create mode 100644 .github/review/segments/staged_prs.md create mode 100644 .github/review/segments/testing.md create mode 100644 AGENTS.md create mode 100644 changelog.d/796.added.md diff --git a/.github/review/global.md b/.github/review/global.md new file mode 100644 index 000000000..a653beb48 --- /dev/null +++ b/.github/review/global.md @@ -0,0 +1,21 @@ +# PR Review Instructions + +Primary goal: identify bugs, regressions, missing tests, contract drift, scope drift, and hidden operational risk. + +Review rules: +- Findings first. Do not lead with summary or praise. +- Prioritize behavior, correctness, migration boundaries, and release risk over style. +- Ignore purely cosmetic issues unless they hide a behavioral problem. +- Distinguish direct evidence from inference. +- Be explicit about blind spots such as unrun tests, missing optional dependencies, or unclear runtime context. + +Required structure: +- `Severity`: `high`, `medium`, or `low` +- `Confidence`: `high`, `medium`, or `low` +- `Basis`: `direct_code_evidence`, `test_evidence`, `inference`, or `missing_context` +- `Why it matters` +- `Suggested fix` + +Use `.github/review/segments/priority_and_confidence.md` for the detailed severity and confidence rubric. + +If there are no findings, say so explicitly and still list residual risks or blind spots. diff --git a/.github/review/segments/general.md b/.github/review/segments/general.md new file mode 100644 index 000000000..1499c8ad4 --- /dev/null +++ b/.github/review/segments/general.md @@ -0,0 +1,11 @@ +# General Review Segment + +Check for: +- obvious bugs and behavioral regressions +- changed control flow that no longer matches caller expectations +- signature drift between callers and callees +- data path mistakes, especially path handling, identifiers, and selection logic +- missing or misleading validation +- missing unit coverage for newly introduced logic + +Bias toward concise, actionable findings. Do not manufacture issues to fill space. diff --git a/.github/review/segments/priority_and_confidence.md b/.github/review/segments/priority_and_confidence.md new file mode 100644 index 000000000..1530c4704 --- /dev/null +++ b/.github/review/segments/priority_and_confidence.md @@ -0,0 +1,36 @@ +# Priority and Confidence Segment + +Use this segment to classify both finding priority and confidence level consistently. + +## Priority rubric + +Classify each finding as `high`, `medium`, or `low`. + +- `high` + A likely merge blocker. The issue can cause incorrect behavior, runtime failure, broken contracts, artifact corruption, publication mistakes, or materially misleading output. +- `medium` + Important, but not always a blocker. The issue can plausibly cause regressions, maintenance traps, incomplete migrations, or missing coverage around meaningful new behavior. +- `low` + Real but limited impact. The issue is worth fixing, but it is unlikely to cause immediate user-facing failure or operational damage. + +If a concern is merely stylistic or speculative, do not promote it into a finding. + +## Confidence rubric + +Classify each finding as `high`, `medium`, or `low`. + +- `high` + Directly supported by code in the diff, surrounding code, or executed tests. +- `medium` + Strong inference from the code path, but not fully validated by execution or complete context. +- `low` + Plausible concern, but evidence is incomplete or significant context is missing. + +Also state the basis for the finding: + +- `direct_code_evidence` +- `test_evidence` +- `inference` +- `missing_context` + +When confidence is not `high`, briefly say what is missing. diff --git a/.github/review/segments/staged_prs.md b/.github/review/segments/staged_prs.md new file mode 100644 index 000000000..ed7717f5b --- /dev/null +++ b/.github/review/segments/staged_prs.md @@ -0,0 +1,17 @@ +# Staged PR Review Segment + +This repository often uses staged migration PRs with narrow scope limits. + +Check for: +- scope drift beyond the intended phase +- contract breaks across staged seams +- compatibility regressions in dual-path or legacy-adapter code +- accidental schema or artifact format changes +- conflicting implementations that should have one clear owner +- code landing in the wrong layer, such as orchestration absorbing domain logic + +Call out whether each finding is: +- a true merge blocker +- a follow-up that can wait + +If the PR looks intentionally transitional, say so, but still flag broken boundaries. diff --git a/.github/review/segments/testing.md b/.github/review/segments/testing.md new file mode 100644 index 000000000..43d62a372 --- /dev/null +++ b/.github/review/segments/testing.md @@ -0,0 +1,12 @@ +# Testing Review Segment + +Check whether the PR adds focused tests for the new behavior it introduces. + +Look for: +- direct unit coverage for newly added branch logic +- overreliance on broad integration tests when a narrow unit test would be clearer +- tests that are brittle because they depend on ambient environment state +- module-reload or monkeypatch patterns that can poison the rest of the suite +- new code paths with no test exercising them + +If coverage is partial, say which production files or behaviors remain uncovered. diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 000000000..ff86b0267 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,32 @@ +# Codex Instructions + +These instructions apply repository-wide. + +## PR review workflow + +When the task is a pull request review: + +1. Read `.github/review/global.md`. +2. Always read: + - `.github/review/segments/general.md` + - `.github/review/segments/priority_and_confidence.md` +3. Inspect the changed files and selectively read these additional segments: + - `.github/review/segments/staged_prs.md` + Use when the PR touches staged-migration areas such as `modal_app/local_area.py`, `modal_app/worker_script.py`, `modal_app/pipeline.py`, `policyengine_us_data/calibration/local_h5/`, or `policyengine_us_data/calibration/validate_staging.py`. + - `.github/review/segments/testing.md` + Use when the PR changes production code or tests. +4. Prioritize bugs, regressions, contract drift, scope drift, and missing tests. +5. Present findings first. +6. For every finding, include: + - severity + - confidence + - basis + - why it matters + - suggested fix +7. If there are no findings, say so explicitly and still mention blind spots. + +## General engineering expectations + +- Prefer direct evidence over speculation. +- Flag missing execution context when confidence is limited. +- Focus on behavior and operational risk before style. diff --git a/changelog.d/796.added.md b/changelog.d/796.added.md new file mode 100644 index 000000000..fd9fc98fc --- /dev/null +++ b/changelog.d/796.added.md @@ -0,0 +1 @@ +Add repo-native Codex PR review instruction files for experimental pull request review guidance.