[v4.0.1] Fix contrib-CTC structural reforms in US microsim path by MaxGhenis · Pull Request #300 · PolicyEngine/policyengine.py

MaxGhenis · 2026-04-20T12:14:26Z

Summary

Fixes a silent bug in the US microsim path that made dict-reforms involving gov.contrib.ctc.* parameters crash at .ensure() with:

AttributeError: 'NoneType' object has no attribute 'entity'

Concretely: Tara Watson's published CTC+EITC reform (policy 94589 on app.policyengine.org) — 42 parameters with three gov.contrib.ctc.*.in_effect = True gates — was completely unable to run through pe.us.economic_impact_analysis on the v4.0 stack.

Root cause

PolicyEngineUSLatest._build_simulation_from_dataset was instantiating entities against the module-level policyengine_us.system:

builder.populations = system.instantiate_entities()   # module system, no user reform
# ...
microsim.build_from_populations(builder.populations)

But Microsimulation.__init__ applies structural reforms (triggered by user-reform parameter gates like gov.contrib.ctc.minimum_refundable.in_effect=True) to its own tax_benefit_system — not the module one. Building populations against the module system then left reform-registered variables like ctc_minimum_refundable_amount invisible at calc time; refundable_ctc (which had been swapped to the reform version) would then crash trying to sum them.

Fix

Pass microsim.tax_benefit_system to _build_simulation_from_dataset instead of the module-level system. One-line change in src/policyengine/tax_benefit_models/us/model.py.

Evidence

On the v4.0 stack (pe-us 1.653.3 + us-data 1.73.0), running the full 42-param reform now produces:

Metric	Baseline	Reform	Δ
Federal income tax	$2,141.8B	$2,053.8B	−$87.9B
CTC	$138.2B	$236.3B	+$98.0B
Refundable CTC	$35.9B	$215.7B	+$179.8B
EITC	$61.4B	$38.9B	−$22.6B

vs. pe-us 1.601.0 (March 2026): −$69.6B (the +$18B drift is upstream model evolution, not a pe.py regression).

Tests

New tests/test_us_microsim_structural_reforms.py — builds a tiny in-memory dataset, applies the minimum-refundable reform via dict, runs the sim end-to-end. Before the fix: crashes. After: passes.

72/72 existing tests still pass.

Version

Bumped to 4.0.1 (patch — bug fix, no API changes).

🤖 Generated with Claude Code

MaxGhenis · 2026-04-20T13:10:20Z

Subagent review complete — all blockers addressed

Two reviewers ran on this PR: code-simplifier and reproducibility-reviewer. Key findings, all addressed in commit e03dc8a:

Reproducibility (blocking)

The original "module-level system stays pristine" invariant was brittle — would flip false if a future `policyengine_us` release ever shipped `ctc_minimum_refundable_amount` unconditionally. Replaced with a direct identity assertion via monkeypatch: the system passed into `_build_simulation_from_dataset` must `is` `microsim.tax_benefit_system`. That's the fix's actual contract; survives arbitrary upstream changes.

Reproducibility (follow-ups)

Parameter-change-shows-up-in-output test now covers BOTH CTC AND EITC variables, catching partial-application regressions (not only no-op ones).
Added one non-CTC gate (`gov.contrib.streamlined_eitc`) to the parametrised smoke test to prove the fix generalises beyond the CTC family.

Code-simplifier

`simple = lambda` (with `# noqa: E731`) → `def _simple`.
Fix's inline comment shortened from 6 lines to 4, kept the load-bearing "hide reform-registered variables" clause.

UK latent bug

Reviewers flagged that UK's `simulation_modifier` pattern applies reforms AFTER `Microsimulation.init`, so any UK `*.in_effect`-gated structural reform would silently no-op. Filed as UK: structural reforms gated on *.in_effect=True silently no-op via simulation_modifier path #302 for v4.0.2 / v4.1 — out of scope here.

Verification

7/7 tests pass with fix.
Fix reverted → 3/7 tests fail with clean diagnostic errors (the invariant test now reports `system is not microsim.tax_benefit_system` instead of the raw `NoneType` traceback).

Ready for merge on CI green.

Collapses the household-calculator journey into one obvious call: import policyengine as pe result = pe.us.calculate_household( people=[{"age": 35, "employment_income": 60000}], tax_unit={"filing_status": "SINGLE"}, year=2026, reform={"gov.irs.deductions.standard.amount.SINGLE": 5000}, extra_variables=["adjusted_gross_income"], ) print(result.tax_unit.income_tax, result.tax_unit.adjusted_gross_income) Design goal: a fresh coding session with no prior context and a 20-file browse budget reaches a correct number in two tool calls — one to `import policyengine as pe`, one for `pe.us.calculate_household(...)`. The old surface forced an agent to pick among three entry points (`calculate_household_impact`, `managed_microsimulation`, raw `Simulation`), build a pydantic `Input` wrapper, construct a `Policy` object with `ParameterValue`s, then dig into a `list[dict[str, Any]]` to get the number. Every one of those layers is gone. Changes: - Populate `policyengine/__init__.py` (previously empty) with `us`, `uk`, and `Simulation` accessors. - Add `tax_benefit_models/{us,uk}/household.py` with a kwargs-based `calculate_household` that builds a policyengine_us/uk Simulation with a situation dict and returns a dot-access HouseholdResult. - Add `tax_benefit_models/common/` with: - `compile_reform(dict) -> core reform dict` (scalar or `{effective_date: value}` shapes) - `dispatch_extra_variables(names)` — flat list, library looks up each name's entity via `variables_by_name` - `EntityResult(dict)` with `__getattr__` for dot access + paste-able-fix AttributeError on unknown names - `HouseholdResult(dict)` with `.to_dict()` / `.write(path)` - Add `utils/household_validation.py` that catches typo'd variable names in entity dicts with difflib close-match suggestions. - Remove `USHouseholdInput`, `UKHouseholdInput`, `USHouseholdOutput`, `UKHouseholdOutput`, and `calculate_household_impact` from both country modules (v4 breaking). - Each country __init__.py exposes `model` (the pinned `TaxBenefitModelVersion`) alongside the existing `us_latest` / `uk_latest` so agents can guess either name. - Rewrite `tests/test_household_impact.py` (19 tests) around the new API: kwargs inputs, dot-access results, flat `extra_variables`, error messages with paste-able fixes, JSON serialization. - Rewrite `tests/test_us_reform_application.py` around reform-dict inputs instead of `Policy(parameter_values=[...])`. - Update `tests/fixtures/us_reform_fixtures.py` to store household fixtures as plain kwargs dicts that splat into `calculate_household(**fixture)`. 223 tests pass locally. Downstream migration (policyengine-api-v2-alpha, the sole consumer of the 3.x surface): replace `calculate_household_impact(input, policy=p)` with `calculate_household(**input, reform=reform_dict)` — fixture script grep of call sites suggests ~25 LOC touched. The migration guide will show the before/after. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The review called out five ship-blockers. This commit fixes all five plus the three footguns: 1. Entity-aware validation. Placing `filing_status` on `people` instead of `tax_unit` now raises with the correct entity and the exact kwarg-swap to make: `tax_unit={'filing_status': <value>}`. 2. Realistic docstring examples. Top-of-module examples in us/household.py and uk/household.py are now lone-parent-with-child cases that exercise every grouping decision (state_code on household, is_tax_unit_dependent on person, would_claim_child_benefit on benunit), not single-adult-no-state cases that hide them. 3. Reform-path validation. `compile_reform` now takes `model_version` and raises with a difflib close-match suggestion on unknown parameter paths, matching the validator quality on variable names. 4. Scalar reform default date. Scalar reform values previously defaulted to `date.today().isoformat()` — a caller running a year=2026 sim mid-2026 got a mid-year effective date and a blended result. Now defaults to `{year}-01-01` (passed through from calculate_household). 5. Unexpected-kwargs catcher. UK `calculate_household(tax_unit=...)` and US `calculate_household(benunit=...)` now raise a TypeError that names the correct country-specific kwarg. Other unexpected kwargs get a difflib close-match from the allowed set. Also added: - `people=[]` check with an explicit error before the calc blows up inside policyengine_us. - Tests for all new error paths (`test__variable_on_wrong_entity`, `test__empty_people`, `test__unknown_reform_path`, `test__us_kwarg_on_uk`, `test__uk_kwarg_on_us`). 151 tests pass locally across the facade + reform + regression suites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Separates release-manifest + TRACE TRO emission from the core value layer. Consumers who only need Simulation / Policy / Variable / Parameter no longer transitively import h5py through scoping_strategy / constituency_impact / local_authority_impact. File moves: - core/release_manifest.py -> provenance/manifest.py - core/trace_tro.py -> provenance/trace.py New provenance/__init__.py re-exports the public surface (get_release_manifest, build_trace_tro_from_release_bundle, serialize_trace_tro, canonical_json_bytes, etc.). core/__init__.py drops the 20 provenance re-exports and keeps only value objects (Dataset, Variable, Parameter*, Policy, Dynamic, Simulation, Region, scoping strategies, TaxBenefitModel, TaxBenefitModelVersion). Explicit core -> provenance import in tax_benefit_model_version.py. Lazy h5py: - core/scoping_strategy.py: h5py no longer at top of module; imported inside WeightReplacementStrategy.apply() only. - outputs/constituency_impact.py: same. - outputs/local_authority_impact.py: same. Internal callers migrated: - tax_benefit_models/{us,uk}/model.py - tax_benefit_models/{us,uk}/datasets.py - countries/{us,uk}/regions.py - cli.py - results/trace_tro.py - scripts/generate_trace_tros.py - tests/test_{release_manifests,trace_tro,manifest_version_mismatch}.py - docs/release-bundles.md 216 tests pass locally across the v4 surface. `from policyengine.core import Simulation` + `from policyengine.provenance import get_release_manifest` both work without h5py installed (verified by temporarily uninstalling and retrying). The full `import policyengine as pe` still pulls h5py because policyengine_us / policyengine_uk import it eagerly (upstream); that's outside our control. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two byte-identical classes split only by British/American spelling (program_name vs programme_name). Collapsed into a single policyengine.outputs.ProgramStatistics; both country analysis helpers import it from there now. Saves ~106 LOC of duplication and removes an API-surface footgun for cross-country code. Changes: - Add policyengine/outputs/program_statistics.py with the unified class. - Re-export from policyengine/outputs/__init__.py. - Delete tax_benefit_models/us/outputs.py and tax_benefit_models/uk/outputs.py. - us/__init__.py and uk/__init__.py re-export from policyengine.outputs. - uk/analysis.py: rename programme_name -> program_name, programme_statistics -> program_statistics, programmes -> programs, programme_df/collection -> program_df/collection. Field on PolicyReformAnalysis also changes. Migration for callers: - from policyengine.tax_benefit_models.uk import ProgrammeStatistics -> from policyengine.outputs import ProgramStatistics - stats.programme_name -> stats.program_name 205 tests pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pulls ~300 lines of shared init/save/load logic out of PolicyEngineUSLatest and PolicyEngineUKLatest into a MicrosimulationModelVersion base in tax_benefit_models.common. The base handles: - Release-manifest fetch + installed-version warning - Data-release certification - Variable/parameter population from the country system - save() / load() + output-dataset filepath convention - _build_entity_relationships via declared group_entities Subclasses declare country_code, package_name, group_entities, entity_variables, and implement four thin hooks (_load_system, _load_region_registry, _dataset_class, _get_runtime_data_build_metadata). run() intentionally stays per-country: the US applies reforms at Microsimulation construction and manually copies structural columns, while the UK wraps inputs as UKSingleYearDataset and applies reforms after construction. Hiding those behind a shared skeleton would mask real divergence. Behaviour preservation is guarded by a byte-level snapshot test (tests/test_base_extraction_snapshot.py) covering four US and four UK household cases plus a model-surface snapshot. All 391 tests pass with zero snapshot drift.

- README and core-concepts now lead with pe.uk/pe.us entry points and pe.uk.calculate_household / pe.us.calculate_household (flat kwargs, dot-access result, dict reforms). - economic-impact-analysis, country-models-{uk,us}, and regions-and-scoping switched from `from policyengine.tax_benefit_models...` to the top-level facade. - Removed the "Legacy filter fields" section from regions-and-scoping now that filter_field/filter_value have been dropped (v4 breaking). - dev.md package-layout diagram updated to mention common/ base, provenance/ subpackage, and the MicrosimulationModelVersion extraction. - examples/household_impact_example.py rewritten against the v4 API and verified end-to-end against both UK and US models.

Unifies the v4 reform surface: the same flat {"param.path": value} / {"param.path": {date: value}} dict already accepted by pe.{uk,us}.calculate_household(reform=...) now works on population Simulation too. Dicts are compiled to Policy / Dynamic objects in a model_validator(mode="after") using tax_benefit_model_version for parameter-path validation and dataset.year for scalar effective-date defaulting. Adds compile_reform_to_policy / compile_reform_to_dynamic helpers in tax_benefit_models.common.reform, tested directly in tests/test_dict_reforms_on_simulation.py (6 tests covering scalar defaulting, effective-date mappings, path validation, pass-through of existing Policy objects, and the "no model_version" error path). Unknown parameter paths raise with close-match suggestions (same behaviour as the household calculator) so agents don't silently get a no-op reform from a typo. 397/397 tests pass. End-to-end microsim with Simulation(policy={"gov.irs.credits.ctc.amount.base[0].amount": 3000}) produces the same -$25.5B revenue impact as the manual Policy+ParameterValue construction it replaces.

The fixture is already registered in conftest.py; pytest auto-injects it by parameter name. Importing it explicitly triggered F811.

Bumps to 4.0.0 and addresses three reviewer passes (practitioner, code-simplifier, end-to-end verification) before v4 ships: Version / branding - pyproject.toml: 3.6.0 -> 4.0.0 - release_manifests/{us,uk}.json: bundle_id and policyengine_version bumped to 4.0.0 so the bundle TRO URLs point at the right git tag - test_release_manifests.py: assertion values updated API ergonomics - Simulation class now carries a full __doc__ with the canonical dict- reform call shape; help(pe.Simulation) used to return Pydantic boiler- plate, which hid the headline v4 feature from any agent that hits help() before reading source. - RowFilterStrategy.variable_value: Union[str, int, float]. Numeric columns (state_fips, county_fips) are now scopable; "state_code" doesn't exist on enhanced_cps_2024 so docs directed users at a column that would crash. - pe.__all__ now exports `outputs` so a fresh agent can tab-complete from pe. to the Aggregate/ChangeAggregate family without reading source. Docs - README: state_code_str -> state_code (consistent with us/household.py) - core-concepts.md: "Reform as a dict" section leads, "Reform as a Policy object" relegated to the escape-hatch appendix - economic-impact-analysis.md: both US and UK examples collapsed to single-line reform dicts (was 20 lines each of Parameter/ ParameterValue boilerplate) - country-models-{us,uk}.md: "Common policy reforms" sections rewritten as one-liners (lost ~130 lines of deprecated-ceremony boilerplate) - regions-and-scoping.md: variable_name="state_code" (broken) -> variable_name="state_fips", variable_value=6 - reform.py module docstring: document the [N].amount / [N].threshold indexed-parameter convention so agents don't hit the bracket-head trap Code simplification (simplifier review) - model_version.py: except (ValueError, Exception) -> except Exception - reform.py: compile_reform_to_policy / compile_reform_to_dynamic now share a private _compile_reform_to() helper (was 25 lines of copy-paste) - simulation.py: _compile_dict_reforms loops over (field, compiler) pairs instead of branching twice by hand - tests/test_base_extraction_snapshot.py renamed to test_household_calculator_snapshot.py (matches what it actually pins, not the refactor that motivated it); fixture dir follows 397 tests pass. ruff clean.

Simulation(policy={"gov.contrib.ctc.*": ...}) was crashing at .ensure() with 'NoneType has no attribute entity' because _build_simulation_from_dataset instantiated entities against the module-level policyengine_us.system (no user reform applied) while building the population from the per-sim Microsimulation whose tax_benefit_system DID have the structural reform applied. Reform- registered variables like ctc_minimum_refundable_amount were then absent from the population's entity registry at calc time. Fix: pass microsim.tax_benefit_system to _build_simulation_from_dataset instead of the module-level system. The per-sim system already has all structural reforms (gov.contrib.ctc.minimum_refundable, per_child_phase_in, per_child_phase_out) applied by Microsimulation.__init__'s post-user-reform structural pass. Real-world impact: Tara Watson's CTC+EITC expansion (policy 94589, 42 parameters including three gov.contrib.ctc.* in_effect gates) now runs end-to-end on the v4.0 stack, producing the full -$87.9B federal income tax impact for 2025 instead of crashing. Regression guarded by tests/test_us_microsim_structural_reforms.py which builds a tiny in-memory dataset, applies the minimum-refundable reform via policy dict, and asserts the sim runs to completion. Version bumped to 4.0.1.

Adds three additional tests to guard against the same class of bug: - Parametrized per-gate smoke test across all three gov.contrib.ctc structural-reform gates (minimum_refundable, per_child_phase_in, per_child_phase_out). Each activates cleanly through a dict reform. - Invariant test that the module-level policyengine_us.system stays pristine — structural reforms must only mutate the per-sim system. If someone refactors _build_simulation_from_dataset and accidentally points it back at the module system, both ends of the invariant fire. - "Reform parameter change reaches the output" test using a plain scalar parameter override (no structural-reform machinery), so regressions where ANY parameter-value reform silently becomes a no-op get caught, not just the contrib-CTC class. Confirmed each test fails when the fix is reverted; 3/6 catch the original bug (the two per_child_phase_* reforms only update existing variables so build_from_populations doesn't trip on them — kept as smoke tests for defense-in-depth). 403/403 tests pass with fix applied.

Code-simplifier: - Inline-lambda -> def _simple (tests/test_us_microsim_structural_reforms.py L68) — drops the noqa: E731 smell. - Tighten the fix's inline comment by ~3 lines, keep the load-bearing 'hide reform-registered variables' clause. Reproducibility-reviewer (blocker): - Replace brittle "module-level system stays pristine" check with a direct identity assertion: the TaxBenefitSystem passed into _build_simulation_from_dataset must be microsim.tax_benefit_system. Uses monkeypatch to capture the argument; passes a positive assertion that the captured system has the structural-reform variable registered. Resilient to upstream policyengine_us ever shipping that variable unconditionally. Reproducibility-reviewer (follow-ups): - test__reform_parameter_change_is_reflected_in_output now asserts on BOTH CTC (base-amount change) AND EITC (phase_in_rate change) so partial-application regressions are caught, not only no-op ones. - Added one non-CTC contrib gate (gov.contrib.streamlined_eitc) to the parametrised smoke test to prove the fix generalises beyond the CTC family. Verified fail-without-fix: reverting the one-line fix breaks 3/7 tests (same set as before) with clean errors — the invariant test now reports "system is not microsim.tax_benefit_system" instead of the raw NoneType traceback, which is the actual contract. 7 tests, all pass.

MaxGhenis · 2026-04-20T13:30:30Z

Superseded by fresh rebase on main — see new PR.

MaxGhenis mentioned this pull request Apr 20, 2026

UK: structural reforms gated on *.in_effect=True silently no-op via simulation_modifier path #302

Open

MaxGhenis and others added 16 commits April 20, 2026 09:12

Snapshot test: freeze US/UK household+model outputs pre-base-extraction

3ce219b

Add changelog fragment for MicrosimulationModelVersion extraction

8366d58

Drop quotes from inline Simulation type annotations (ruff UP037)

101d6c0

ruff format + drop redundant us_test_dataset import

1865c4c

The fixture is already registered in conftest.py; pytest auto-injects it by parameter name. Importing it explicitly triggered F811.

Anonymize test to reference only the public policy ID

5db7943

MaxGhenis force-pushed the v4-fix-contrib-ctc-structural-reforms branch from e03dc8a to e5207ed Compare April 20, 2026 13:13

MaxGhenis changed the base branch from main to v4 April 20, 2026 13:13

Drop unused import + ruff format

9932520

MaxGhenis changed the base branch from v4 to main April 20, 2026 13:28

MaxGhenis closed this Apr 20, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[v4.0.1] Fix contrib-CTC structural reforms in US microsim path#300

[v4.0.1] Fix contrib-CTC structural reforms in US microsim path#300
MaxGhenis wants to merge 17 commits intomainfrom
v4-fix-contrib-ctc-structural-reforms

MaxGhenis commented Apr 20, 2026

Uh oh!

MaxGhenis commented Apr 20, 2026

Uh oh!

MaxGhenis commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

MaxGhenis commented Apr 20, 2026

Summary

Root cause

Fix

Evidence

Tests

Version

Uh oh!

MaxGhenis commented Apr 20, 2026

Subagent review complete — all blockers addressed

Uh oh!

MaxGhenis commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant