feat: optional change-frequency (churn) term in risk scoring #555

Open
michael-denyer wants to merge 1 commit into tirth8205:main from michael-denyer:feat/churn-risk-scoring

@michael-denyer

Contributor

Linked issue

No linked issue — small opt-in enhancement to existing risk scoring. Happy to file a feature request via the issue form first if you'd prefer one for tracking.

What & why

compute_risk_score is currently 100% structural: flow criticality, cross-community callers, test coverage, security keywords, and caller count. Hotspot research (Adam Tornhill, Your Code as a Crime Scene) shows defect risk correlates with structural complexity × change frequency — the temporal half was missing.

This PR adds an opt-in change-frequency (churn) term:

  • compute_file_churn(repo_root, window_days=None): per-file commit counts over a trailing window (default 90 days, configurable via CRG_CHURN_WINDOW_DAYS, matching the existing CRG_* env-var pattern). Parsed natively from git log --numstat, with no new dependencies. Same subprocess hardening as the adjacent diff parsers (arg list, timeout, errors="replace", empty dict on any failure).
  • compute_risk_score(..., churn_counts=None): optional term contributing min(commits / 10, 1.0) × 0.15. Omitted or empty → byte-identical scores.
  • analyze_changes(..., include_churn=False) and detect-changes --churn: opt-in plumbing. Churn keys are stored under both repo-relative and absolute paths, mirroring the existing diff-key remap from #528 (Windows: detect-changes CLI maps 0 functions while the MCP detect_changes_tool maps them correctly), so lookups work however the graph stored node paths.
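
For reviewers who want a picture of the numstat counting, here is a minimal sketch. It is not the code in this PR: parse_numstat and file_churn_sketch are hypothetical names, and the exact git flags, the 30-second timeout, and returning an empty dict for a non-positive window are assumptions. It relies on the fact that git log --numstat emits one row per file per commit, so counting rows per path gives a per-file commit count.

```python
import re
import subprocess
from collections import Counter
from pathlib import Path

# One numstat row is "<added>\t<deleted>\t<path>"; binary files show "-" for both counts.
_NUMSTAT_ROW = re.compile(r"^(\d+|-)\t(\d+|-)\t(.+)$")


def parse_numstat(output: str) -> dict[str, int]:
    """Count numstat rows per path (each commit lists a file at most once)."""
    counts: Counter[str] = Counter()
    for line in output.splitlines():
        match = _NUMSTAT_ROW.match(line)
        if match:  # blank separators and non-numstat noise are skipped
            counts[match.group(3)] += 1
    return dict(counts)


def file_churn_sketch(repo_root: Path, window_days: int = 90) -> dict[str, int]:
    """Hypothetical stand-in for compute_file_churn; returns {} on any failure."""
    if window_days <= 0:  # assumption: a non-positive window yields no churn data
        return {}
    # Argument list (no shell); --format= suppresses commit headers so only numstat rows remain.
    cmd = ["git", "log", f"--since={window_days} days ago", "--numstat", "--format="]
    try:
        result = subprocess.run(
            cmd,
            cwd=repo_root,
            capture_output=True,
            text=True,
            errors="replace",
            timeout=30,  # assumed value, not taken from the PR
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        return {}
    return parse_numstat(result.stdout)
```

Renamed files appear in numstat output with an "old => new" path form; this sketch counts them under that literal string and does not try to resolve them.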

Default-off by design: structural-only scores, the GitHub Action output, and the published benchmark numbers are unchanged unless the flag is passed.
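
The churn term and its default-off behaviour can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: churn_term, CHURN_WEIGHT, and CHURN_SATURATION are hypothetical names, and how compute_risk_score combines the term with the structural components is not shown here.

```python
from typing import Mapping, Optional

CHURN_WEIGHT = 0.15     # maximum contribution of the churn term (from the description above)
CHURN_SATURATION = 10   # commit count at which the term stops growing


def churn_term(file_path: str, churn_counts: Optional[Mapping[str, int]] = None) -> float:
    """Return min(commits / 10, 1.0) * 0.15, or 0.0 when no churn data is supplied."""
    if not churn_counts:  # None or empty mapping: the score stays structural-only
        return 0.0
    commits = churn_counts.get(file_path, 0)
    return min(commits / CHURN_SATURATION, 1.0) * CHURN_WEIGHT
```

A file with 5 commits in the window would contribute 0.075, and anything at or above 10 commits contributes the full 0.15, which is the cap the saturation test checks.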

Deliberately left out to keep the diff small (one-line follow-ups if wanted): the MCP detect_changes_tool parameter and update --brief churn support.

How it was tested

13 new tests in tests/test_changes.py following the suite's conventions: numstat parsing (multi-commit counting, binary files, non-numstat noise), churn counts from a real two-commit git repo, error paths (non-git dir, subprocess failure, non-positive window), saturation cap at exactly 0.15, backward compatibility of the old call signature, relative/absolute path matching, and churn-never-computed-by-default plumbing.
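
The relative/absolute path matching covered by these tests follows from storing each churn count under both path forms. A minimal sketch of that idea, assuming a hypothetical helper with_absolute_keys (the PR's actual helper and normalisation rules may differ):

```python
from pathlib import Path


def with_absolute_keys(churn: dict[str, int], repo_root: Path) -> dict[str, int]:
    """Return a copy of churn that also maps each file's absolute path to its count."""
    expanded = dict(churn)
    for rel_path, commits in churn.items():
        # Same count under the absolute form, so either node-path style resolves.
        expanded[str(repo_root / rel_path)] = commits
    return expanded
```

With this shape, a lookup succeeds whether the graph stored a node as src/a.py or as the full path under the repository root.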

uv run pytest tests/ --tb=short -q
# 1397 passed, 1 skipped, 2 xpassed

uv run ruff check code_review_graph/
# All checks passed!

uv run mypy code_review_graph/ --ignore-missing-imports --no-strict-optional
# Success: no issues found in 62 source files

bandit -r code_review_graph/ -c pyproject.toml
# 0 issues

Manual end-to-end check on this repository (graph built at HEAD): detect-changes --base HEAD~1 produces identical scores with the flag off (0.55); with --churn, files with ≥10 commits in the last 90 days gain exactly +0.15 (0.70).

Checklist

  • Tests added for new functionality
  • All tests pass: uv run pytest tests/ --tb=short -q
  • Linting passes: uv run ruff check code_review_graph/
  • Type checking passes: uv run mypy code_review_graph/ --ignore-missing-imports --no-strict-optional
  • Lines are at most 100 characters
  • Docs updated where behavior changed (README, docs/, docstrings)

Risk scores were 100% structural (flows, communities, test coverage,
security keywords, caller count). Hotspot research (Tornhill) shows
defect risk correlates with structural complexity x change frequency —
the temporal half was missing.

Add compute_file_churn(): per-file commit counts over a trailing window
(CRG_CHURN_WINDOW_DAYS, default 90 days), parsed natively from
git log --numstat with no new dependencies. compute_risk_score() takes
an optional churn_counts mapping contributing up to 0.15 (saturating at
10 commits). Opt-in via analyze_changes(include_churn=True) or
detect-changes --churn; default-off so existing structural-only scores
and published benchmark numbers are unchanged.
@michael-denyer

Contributor Author

Note on the failing review check: the graph build and analysis completed, but the sticky-comment upsert hit gh: Resource not accessible by integration (HTTP 403). GITHUB_TOKEN is read-only in pull_request workflows triggered from forks, so the action can't post the comment on fork PRs. All other checks (lint, type-check, security, schema-sync, tests 3.10–3.13) are green.
