21 changes: 21 additions & 0 deletions .github/review/global.md
@@ -0,0 +1,21 @@
# PR Review Instructions

Primary goal: identify bugs, regressions, missing tests, contract drift, scope drift, and hidden operational risk.

Review rules:
- Findings first. Do not lead with summary or praise.
- Prioritize behavior, correctness, migration boundaries, and release risk over style.
- Ignore purely cosmetic issues unless they hide a behavioral problem.
- Distinguish direct evidence from inference.
- Be explicit about blind spots such as unrun tests, missing optional dependencies, or unclear runtime context.

Required structure:
- `Severity`: `high`, `medium`, or `low`
- `Confidence`: `high`, `medium`, or `low`
- `Basis`: `direct_code_evidence`, `test_evidence`, `inference`, or `missing_context`
- `Why it matters`
- `Suggested fix`

Use `.github/review/segments/priority_and_confidence.md` for the detailed severity and confidence rubric.

If there are no findings, say so explicitly and still list residual risks or blind spots.
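The required structure above can be sketched as a small validated record. This is a minimal illustration, not part of the PR: the `Finding` class, its field names, and `render_review` are hypothetical, and only the allowed values for severity, confidence, and basis come from the instructions.

```python
from dataclasses import dataclass

# Allowed values taken from the "Required structure" list.
LEVELS = ("high", "medium", "low")
BASES = ("direct_code_evidence", "test_evidence", "inference", "missing_context")


@dataclass(frozen=True)
class Finding:
    """Hypothetical record mirroring the required finding structure."""

    title: str
    severity: str
    confidence: str
    basis: str
    why_it_matters: str
    suggested_fix: str

    def __post_init__(self):
        # Reject values outside the documented vocabularies.
        if self.severity not in LEVELS:
            raise ValueError(f"invalid severity: {self.severity!r}")
        if self.confidence not in LEVELS:
            raise ValueError(f"invalid confidence: {self.confidence!r}")
        if self.basis not in BASES:
            raise ValueError(f"invalid basis: {self.basis!r}")


def render_review(findings, blind_spots):
    """Render findings first (highest severity first), then blind spots."""
    lines = []
    if not findings:
        # The instructions ask for an explicit statement when nothing is found.
        lines.append("No findings.")
    for f in sorted(findings, key=lambda f: LEVELS.index(f.severity)):
        lines += [
            f"- {f.title}",
            f"  - Severity: {f.severity}",
            f"  - Confidence: {f.confidence}",
            f"  - Basis: {f.basis}",
            f"  - Why it matters: {f.why_it_matters}",
            f"  - Suggested fix: {f.suggested_fix}",
        ]
    # Blind spots are listed even when there are no findings.
    lines.append("Residual risks / blind spots:")
    lines += [f"- {b}" for b in blind_spots] or ["- none identified"]
    return "\n".join(lines)
```

Validating at construction time means a malformed severity or basis fails before it reaches the rendered review, which keeps the output vocabulary consistent with the rubric.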
11 changes: 11 additions & 0 deletions .github/review/segments/general.md
@@ -0,0 +1,11 @@
# General Review Segment

Check for:
- obvious bugs and behavioral regressions
- changed control flow that no longer matches caller expectations
- signature drift between callers and callees
- data path mistakes, especially path handling, identifiers, and selection logic
- missing or misleading validation
- missing unit coverage for newly introduced logic

Bias toward concise, actionable findings. Do not manufacture issues to fill space.
36 changes: 36 additions & 0 deletions .github/review/segments/priority_and_confidence.md
@@ -0,0 +1,36 @@
# Priority and Confidence Segment

Use this segment to classify both finding priority and confidence level consistently.

## Priority rubric

Classify each finding as `high`, `medium`, or `low`.

- `high`
  A likely merge blocker. The issue can cause incorrect behavior, runtime failure, broken contracts, artifact corruption, publication mistakes, or materially misleading output.
- `medium`
  Important, but not always a blocker. The issue can plausibly cause regressions, maintenance traps, incomplete migrations, or missing coverage around meaningful new behavior.
- `low`
  Real but limited impact. The issue is worth fixing, but it is unlikely to cause immediate user-facing failure or operational damage.

If a concern is merely stylistic or speculative, do not promote it into a finding.

## Confidence rubric

Classify each finding as `high`, `medium`, or `low`.

- `high`
  Directly supported by code in the diff, surrounding code, or executed tests.
- `medium`
  Strong inference from the code path, but not fully validated by execution or complete context.
- `low`
  Plausible concern, but evidence is incomplete or significant context is missing.

Also state the basis for the finding:

- `direct_code_evidence`
- `test_evidence`
- `inference`
- `missing_context`

When confidence is not `high`, briefly say what is missing.
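The last rule lends itself to a mechanical check. The sketch below assumes a hypothetical helper, `check_confidence_note`, and a free-text `missing_note` argument; neither exists in the PR, and the rubric itself does not prescribe how the note is stored.

```python
def check_confidence_note(confidence: str, missing_note: str = "") -> bool:
    """Return True when a finding satisfies the 'say what is missing' rule.

    Hypothetical helper: `missing_note` stands for the reviewer's short
    explanation of what evidence or context is absent.
    """
    if confidence not in ("high", "medium", "low"):
        raise ValueError(f"invalid confidence: {confidence!r}")
    if confidence == "high":
        # High confidence needs no explanation of missing evidence.
        return True
    # Medium or low confidence must carry a non-empty note.
    return bool(missing_note.strip())
```

For example, `check_confidence_note("medium", "integration tests were not run")` passes, while `check_confidence_note("low")` does not.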
17 changes: 17 additions & 0 deletions .github/review/segments/staged_prs.md
@@ -0,0 +1,17 @@
# Staged PR Review Segment

This repository often uses staged migration PRs with narrow scope limits.

Check for:
- scope drift beyond the intended phase
- contract breaks across staged seams
- compatibility regressions in dual-path or legacy-adapter code
- accidental schema or artifact format changes
- conflicting implementations that should have one clear owner
- code landing in the wrong layer, such as orchestration absorbing domain logic

Call out whether each finding is:
- a true merge blocker
- a follow-up that can wait

If the PR looks intentionally transitional, say so, but still flag broken boundaries.
12 changes: 12 additions & 0 deletions .github/review/segments/testing.md
@@ -0,0 +1,12 @@
# Testing Review Segment

Check whether the PR adds focused tests for the new behavior it introduces.

Look for:
- direct unit coverage for newly added branch logic
- overreliance on broad integration tests when a narrow unit test would be clearer
- tests that are brittle because they depend on ambient environment state
- module-reload or monkeypatch patterns that can poison the rest of the suite
- new code paths with no test exercising them

If coverage is partial, say which production files or behaviors remain uncovered.
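As an illustration of the "ambient environment state" concern, the sketch below shows one way to scope an environment override so it cannot leak into other tests. The `output_dir` function and the `OUTPUT_DIR` variable are hypothetical stand-ins, not code from this repository; the isolation uses `unittest.mock.patch.dict` from the standard library.

```python
import os
from unittest import mock


def output_dir() -> str:
    """Hypothetical production helper that reads ambient environment state."""
    return os.environ.get("OUTPUT_DIR", "/tmp/out")


def test_output_dir_uses_env_override():
    # patch.dict restores os.environ on exit, so the override cannot leak
    # into tests that run later in the same process.
    with mock.patch.dict(os.environ, {"OUTPUT_DIR": "/data/run1"}):
        assert output_dir() == "/data/run1"


def test_output_dir_default():
    # clear=True hides every ambient variable for the duration of the block,
    # so the test does not depend on the developer's or CI's environment.
    with mock.patch.dict(os.environ, {}, clear=True):
        assert output_dir() == "/tmp/out"
```

Both tests set the state they depend on explicitly, so they give the same result regardless of the shell they are launched from or the order in which the suite runs.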
32 changes: 32 additions & 0 deletions AGENTS.md
@@ -0,0 +1,32 @@
# Codex Instructions

These instructions apply repository-wide.

## PR review workflow

When the task is a pull request review:

1. Read `.github/review/global.md`.
2. Always read:
   - `.github/review/segments/general.md`
   - `.github/review/segments/priority_and_confidence.md`
3. Inspect the changed files and selectively read these additional segments:
   - `.github/review/segments/staged_prs.md`
     Use when the PR touches staged-migration areas such as `modal_app/local_area.py`, `modal_app/worker_script.py`, `modal_app/pipeline.py`, `policyengine_us_data/calibration/local_h5/`, or `policyengine_us_data/calibration/validate_staging.py`.
   - `.github/review/segments/testing.md`
     Use when the PR changes production code or tests.
4. Prioritize bugs, regressions, contract drift, scope drift, and missing tests.
5. Present findings first.
6. For every finding, include:
   - severity
   - confidence
   - basis
   - why it matters
   - suggested fix
7. If there are no findings, say so explicitly and still mention blind spots.
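The segment-selection step in the workflow above can be sketched as a function over the list of changed file paths. This is only an illustration: `select_segments` is a hypothetical helper, the staged-migration paths are the examples named in step 3 (the text says "such as", so the real set may be larger), and treating any changed `.py` file as "production code or tests" is an assumption.

```python
# Segments that step 2 says to read for every review.
ALWAYS = [
    ".github/review/segments/general.md",
    ".github/review/segments/priority_and_confidence.md",
]
STAGED_SEGMENT = ".github/review/segments/staged_prs.md"
TESTING_SEGMENT = ".github/review/segments/testing.md"

# Example staged-migration areas listed in step 3 (not necessarily exhaustive).
STAGED_PREFIXES = (
    "modal_app/local_area.py",
    "modal_app/worker_script.py",
    "modal_app/pipeline.py",
    "policyengine_us_data/calibration/local_h5/",
    "policyengine_us_data/calibration/validate_staging.py",
)


def select_segments(changed_files):
    """Return the review segment files to read for a set of changed paths."""
    segments = list(ALWAYS)
    # str.startswith accepts a tuple, so one call covers every listed area.
    if any(path.startswith(STAGED_PREFIXES) for path in changed_files):
        segments.append(STAGED_SEGMENT)
    # Assumption: any changed .py file counts as production code or tests.
    if any(path.endswith(".py") for path in changed_files):
        segments.append(TESTING_SEGMENT)
    return segments
```

For instance, a PR that only touches `modal_app/pipeline.py` would select all four segments under these assumptions, while a documentation-only PR would select just the two that are always read.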

## General engineering expectations

- Prefer direct evidence over speculation.
- Flag missing execution context when confidence is limited.
- Focus on behavior and operational risk before style.
1 change: 1 addition & 0 deletions changelog.d/796.added.md
@@ -0,0 +1 @@
Add repo-native Codex PR review instruction files for experimental pull request review guidance.