Add release-bundle refresh helper + CLI wrapper #309

Merged
MaxGhenis merged 2 commits into main from refresh-release-bundle
Apr 20, 2026
@MaxGhenis
Contributor

Summary

Packages the release-bundle bump process into one command instead of six manually coordinated file edits.

New policyengine.provenance.bundle module exposing refresh_release_bundle(country, model_version=..., data_version=...). Given optional new versions, it:

  • Fetches fresh wheel metadata (URL + sha256) from the PyPI JSON API
  • Streams the HF dataset file to compute its sha256
  • Writes updated data/release_manifests/{country}.json in place (model_package + data_package + certified_data_artifact + certification fields; preserves unknown fields untouched)
  • Bumps the pyproject.toml pin for the country extra
  • Skips fetches when the respective version hasn't changed, so data-only and model-only refreshes each hit exactly the one network endpoint they need
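
The two fetch steps in the list above could look roughly like the following. This is a minimal sketch, not the code merged in this PR: the helper names `select_wheel` and `sha256_of_stream` are hypothetical, and it assumes the PyPI JSON API payload shape, where `urls` is a list of files that each carry `packagetype`, `url`, and `digests.sha256`.

```python
import hashlib
from typing import BinaryIO


def select_wheel(pypi_payload: dict) -> dict:
    """Return the URL and sha256 of the first wheel in a PyPI JSON payload.

    Hypothetical helper. Assumes the payload returned by
    https://pypi.org/pypi/<package>/<version>/json.
    """
    for entry in pypi_payload.get("urls", []):
        if entry.get("packagetype") == "bdist_wheel":
            return {"url": entry["url"], "sha256": entry["digests"]["sha256"]}
    raise ValueError("no matching wheel in PyPI metadata")


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a file-like object chunk by chunk, so a large dataset
    never has to be held in memory all at once."""
    digest = hashlib.sha256()
    while chunk := stream.read(chunk_size):
        digest.update(chunk)
    return digest.hexdigest()
```

Both functions work on already-opened inputs (a parsed dict, a file-like object), which keeps them easy to exercise without a network connection.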

Companion regenerate_trace_tro(country) chains the TRO-sidecar rebuild, wrapping the existing scripts/generate_trace_tros.py code path.

scripts/refresh_release_bundle.py is a thin argparse wrapper:

python scripts/refresh_release_bundle.py --country us --data-version 1.83.4
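
A thin wrapper of this kind might be shaped like the sketch below. It is an illustration under assumptions, not the script's actual contents: `--country` and `--data-version` appear in the command above, while `--model-version` is assumed by analogy with the `model_version` keyword, and the `refresh` parameter is a hypothetical hook added here so the sketch can run without the package installed.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Refresh a country release bundle.")
    parser.add_argument("--country", required=True)
    # Assumed flag, mirroring the model_version keyword argument.
    parser.add_argument("--model-version", default=None)
    parser.add_argument("--data-version", default=None)
    return parser


def main(argv=None, refresh=None):
    args = build_parser().parse_args(argv)
    if refresh is None:
        # Import lazily so argument parsing works even without the package.
        from policyengine.provenance.bundle import refresh_release_bundle as refresh
    return refresh(
        args.country,
        model_version=args.model_version,
        data_version=args.data_version,
    )
```

Keeping the wrapper down to argument parsing plus a single call leaves all behaviour in the importable module, which is where the tests target it.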

Tests

6 new tests in tests/test_bundle_refresh.py, all offline via mocked urlopen:

  • model-only bump (PyPI only; HF untouched)
  • data-only bump (HF only; PyPI untouched)
  • combined bump
  • update_pyproject=False short-circuit
  • PyPI "no matching wheel" error
  • malformed dataset URI error

427/427 total tests pass (421 existing + 6 new).
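
For readers unfamiliar with the pattern, an offline test built on a mocked `urlopen` can be sketched as follows. `fetch_json` is a hypothetical stand-in for whatever network helper the module uses, and the patch target `urllib.request.urlopen` is an assumption; the real tests may patch a different import path.

```python
import io
import json
import urllib.request
from unittest import mock


def fetch_json(url: str) -> dict:
    """Hypothetical stand-in for the network helper under test."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def test_fetch_json_offline():
    payload = {"info": {"version": "1.0.0"}}
    with mock.patch("urllib.request.urlopen") as fake_urlopen:
        # urlopen is used as a context manager, so the fake response
        # must be what __enter__ returns.
        fake_urlopen.return_value.__enter__.return_value = io.BytesIO(
            json.dumps(payload).encode()
        )
        assert fetch_json("https://pypi.org/pypi/example/1.0.0/json") == payload
    fake_urlopen.assert_called_once()
```

Asserting on the mock's call count is also how a test can check that a data-only bump never touches PyPI, or that a model-only bump never touches HF.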

Why module vs pure script

The refresh logic is reusable beyond the CLI (e.g., CI automation, per-reform certification in the TRACE TRO flow). Putting the core in policyengine.provenance.bundle gives us:

  • Agent-discoverable via help(policyengine.provenance)
  • Pure-Python test coverage (no subprocess gymnastics)
  • Simple CLI surface that anyone can extend
  • Room to promote to policyengine release refresh on pe.cli once that CLI has a second user

What this doesn't do

  • Doesn't run snapshot-test rebaselining. PE_UPDATE_SNAPSHOTS=1 pytest tests/test_household_calculator_snapshot.py still needs to be a human-reviewed step — the numeric changes deserve attention, not an auto-accept.
  • Doesn't bump UK data. Needs a separate call once the UK data manifest has a new version; same helper.
  • Doesn't perform the actual us-data 1.73.0 → 1.83.4 bump. That requires an HF token with read scope for policyengine/policyengine-us-data, which my local token doesn't have. Follow-up for whoever has the right credentials.

Test plan

  • pytest tests/test_bundle_refresh.py passes offline
  • pytest tests/ (427/427) passes
  • ruff check / ruff format --check clean
  • Actual refresh run once a HF-authorized maintainer picks it up

🤖 Generated with Claude Code

New policyengine.provenance.bundle module exposing
refresh_release_bundle(country, *, model_version=None, data_version=None).
Given a country and optional new versions, it:

- Fetches fresh wheel metadata (url + sha256) from PyPI JSON API
- Streams the HF dataset file to compute its sha256
- Writes updated data/release_manifests/{country}.json in place
  (model_package + data_package + certified_data_artifact +
  certification fields, preserves unknown fields untouched)
- Bumps the pyproject.toml [project.optional-dependencies] pin for
  the country extra
- Skips PyPI / HF fetches when the respective version hasn't
  changed, so data-only and model-only refreshes each hit exactly
  the one network endpoint they need
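
The skip-when-unchanged and preserve-unknown-fields behaviour described in the list above could be sketched like this. It is an assumption-laden illustration rather than the merged code: `plan_refresh` is a hypothetical name, and the sketch assumes each of `model_package` and `data_package` stores its version under a `version` key.

```python
import copy


def plan_refresh(manifest, *, model_version=None, data_version=None):
    """Return (updated_manifest, fetch_flags) without touching the network.

    fetch_flags records which endpoint would be needed: PyPI for
    "model_package", HF for "data_package".
    """
    updated = copy.deepcopy(manifest)  # unknown fields are carried over as-is
    fetch = {}
    requested = (("model_package", model_version), ("data_package", data_version))
    for key, new_version in requested:
        current = manifest.get(key, {}).get("version")
        # Fetch only when a version was requested and it differs.
        fetch[key] = new_version is not None and new_version != current
        if fetch[key]:
            updated.setdefault(key, {})["version"] = new_version
    return updated, fetch
```

Working on a deep copy means any keys the helper does not know about are written back exactly as they were read.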

regenerate_trace_tro(country) companion runs the same code path
scripts/generate_trace_tros.py uses, so the refresh flow chains
cleanly to TRO regeneration.

scripts/refresh_release_bundle.py is a thin argparse wrapper:

    python scripts/refresh_release_bundle.py \
        --country us --data-version 1.83.4

Tested offline via mocked urlopen. 6 new tests cover: model-only
bump, data-only bump, combined bump, update_pyproject=False,
PyPI "no matching wheel" error, malformed dataset URI error.
All 427 tests pass (421 existing + 6 new).

Next step (not in this PR, requires an HF token with read scope for
policyengine/policyengine-us-data): run the actual 1.73.0 -> 1.83.4
bump and regenerate snapshots.

@MaxGhenis merged commit bacde31 into main on Apr 20, 2026
11 checks passed