Add release-bundle refresh helper + CLI wrapper by MaxGhenis · Pull Request #309 · PolicyEngine/policyengine.py

MaxGhenis · 2026-04-20T17:07:01Z

Summary

Packages the release-bundle bump process so it's one command instead of six manually-coordinated file edits.

New policyengine.provenance.bundle module exposing refresh_release_bundle(country, model_version=..., data_version=...). Given optional new versions, it:

Fetches fresh wheel metadata (URL + sha256) from the PyPI JSON API
Streams the HF dataset file to compute its sha256
Writes updated data/release_manifests/{country}.json in place (model_package + data_package + certified_data_artifact + certification fields; preserves unknown fields untouched)
Bumps the pyproject.toml pin for the country extra
Skips fetches when the respective version hasn't changed, so data-only and model-only refreshes each hit exactly the one network endpoint they need
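The steps above can be sketched in a few small, offline-testable helpers. This is a minimal illustration, not the code in policyengine.provenance.bundle: the names select_wheel, stream_sha256, and plan_fetches and the keys of the current dict are hypothetical. The only external fact relied on is the documented shape of a PyPI JSON release payload (a urls list whose entries carry packagetype, url, and digests.sha256).

```python
import hashlib
import io
from typing import BinaryIO, Optional


def select_wheel(pypi_payload: dict) -> dict:
    """Return the URL and sha256 of the first wheel in a PyPI JSON payload.

    Hypothetical helper: the real module may apply stricter matching.
    """
    for entry in pypi_payload.get("urls", []):
        if entry.get("packagetype") == "bdist_wheel":
            return {"url": entry["url"], "sha256": entry["digests"]["sha256"]}
    raise ValueError("no matching wheel in PyPI payload")


def stream_sha256(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a file-like object chunk by chunk so it never sits fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def plan_fetches(
    current: dict,
    model_version: Optional[str] = None,
    data_version: Optional[str] = None,
) -> dict:
    """Decide which endpoints a refresh needs to hit.

    `current` is an assumed dict holding the versions already in the manifest.
    A fetch is needed only when a new version is given and differs from it.
    """
    return {
        "pypi": model_version is not None
        and model_version != current["model_version"],
        "hf": data_version is not None
        and data_version != current["data_version"],
    }


# Data-only refresh: only the dataset endpoint would be contacted.
plan = plan_fetches(
    {"model_version": "1.0.0", "data_version": "1.73.0"},
    data_version="1.83.4",
)
```

Keeping the network calls outside these helpers is one way to make data-only and model-only refreshes easy to test without any live endpoint.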

A companion function, regenerate_trace_tro(country), rebuilds the TRO sidecar by running the same code path that scripts/generate_trace_tros.py uses, so a refresh chains cleanly into TRO regeneration.

scripts/refresh_release_bundle.py is a thin argparse wrapper:

python scripts/refresh_release_bundle.py --country us --data-version 1.83.4
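For orientation, a thin argparse wrapper of this kind might look like the sketch below. Only --country and --data-version appear in the command above; the --model-version flag, the help strings, and the build_parser name are assumptions, and the call into refresh_release_bundle is left out so the sketch runs without the package installed.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser (hypothetical layout, not the script's actual code)."""
    parser = argparse.ArgumentParser(
        description="Refresh a country's release bundle."
    )
    parser.add_argument("--country", required=True, help="Country code, e.g. us")
    # Assumed flag, mirroring the model_version keyword of the helper.
    parser.add_argument("--model-version", default=None)
    parser.add_argument("--data-version", default=None)
    return parser


# Parse the same arguments as the command shown above.
args = build_parser().parse_args(["--country", "us", "--data-version", "1.83.4"])
```

Leaving both version flags optional matches the helper's behavior of touching only what changed.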

Tests

6 new tests in tests/test_bundle_refresh.py, all offline via mocked urlopen:

model-only bump (PyPI only; HF untouched)
data-only bump (HF only; PyPI untouched)
combined bump
update_pyproject=False short-circuit
PyPI "no matching wheel" error
malformed dataset URI error
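The offline approach can be illustrated with unittest.mock. The sketch below is not taken from tests/test_bundle_refresh.py: fetch_json and fake_urlopen are hypothetical stand-ins, and the package name in the URL is a placeholder. It only shows how patching urlopen keeps a test off the network.

```python
import io
import json
import urllib.request
from unittest import mock


def fetch_json(url: str) -> dict:
    """Fetch and decode a JSON document (hypothetical stand-in for the helper)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def fake_urlopen(url, *args, **kwargs):
    """Return a canned PyPI-style payload instead of touching the network."""
    body = json.dumps({"info": {"version": "0.0.0"}, "urls": []}).encode()
    # BytesIO supports both read() and the context-manager protocol.
    return io.BytesIO(body)


with mock.patch("urllib.request.urlopen", side_effect=fake_urlopen) as patched:
    payload = fetch_json("https://pypi.org/pypi/example-package/0.0.0/json")
```

Asserting on patched.call_count is one way to express checks such as "PyPI only; HF untouched".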

427/427 total tests pass (421 existing + 6 new).

Why module vs pure script

The refresh logic is reusable beyond the CLI (e.g., CI automation, per-reform certification in the TRACE TRO flow). Putting the core in policyengine.provenance.bundle gives us:

Agent discoverability via help(policyengine.provenance)
Pure-Python test coverage (no subprocess gymnastics)
Simple CLI surface that anyone can extend
Room to promote to policyengine release refresh on pe.cli once that CLI has a second user

What this doesn't do

Doesn't run snapshot-test rebaselining. PE_UPDATE_SNAPSHOTS=1 pytest tests/test_household_calculator_snapshot.py still needs to be a human-reviewed step — the numeric changes deserve attention, not an auto-accept.
Doesn't bump UK data. Needs a separate call once the UK data manifest has a new version; same helper.
Doesn't perform the actual us-data 1.73.0 → 1.83.4 bump. That requires an HF token with read scope for policyengine/policyengine-us-data, which my local token lacks. Left as a follow-up for whoever has the right credentials.

Test plan

pytest tests/test_bundle_refresh.py passes offline
pytest tests/ (427/427) passes
ruff check / ruff format --check clean
Actual refresh run once an HF-authorized maintainer picks it up

🤖 Generated with Claude Code

New policyengine.provenance.bundle module exposing refresh_release_bundle(country, *, model_version=None, data_version=None). Given a country and optional new versions, it:

- Fetches fresh wheel metadata (url + sha256) from the PyPI JSON API
- Streams the HF dataset file to compute its sha256
- Writes updated data/release_manifests/{country}.json in place (model_package + data_package + certified_data_artifact + certification fields; preserves unknown fields untouched)
- Bumps the pyproject.toml [project.optional-dependencies] pin for the country extra
- Skips PyPI / HF fetches when the respective version hasn't changed, so data-only and model-only refreshes each hit exactly the one network endpoint they need

The regenerate_trace_tro(country) companion runs the same code path scripts/generate_trace_tros.py uses, so the refresh flow chains cleanly to TRO regeneration.

scripts/refresh_release_bundle.py is a thin argparse wrapper:

python scripts/refresh_release_bundle.py \
    --country us --data-version 1.83.4

Tested offline via mocked urlopen. 6 new tests cover: model-only bump, data-only bump, combined bump, update_pyproject=False, PyPI "no matching wheel" error, malformed dataset URI error. All 421 existing tests still pass (427 total with the 6 new tests).

Next step (not in this PR, requires an HF token with read scope for policyengine/policyengine-us-data): run the actual 1.73.0 -> 1.83.4 bump and regenerate snapshots.

MaxGhenis added 2 commits April 20, 2026 13:06

ruff: sort provenance/__init__.py imports

73f9b13

MaxGhenis merged commit bacde31 into main Apr 20, 2026
11 checks passed

MaxGhenis mentioned this pull request Apr 20, 2026

Bump us-data 1.73.0 → 1.78.2 + fix HF model/dataset repo detection #310

Closed

3 tasks

MaxGhenis merged 2 commits into main from refresh-release-bundle
