[v4] Agent-first household-calculator facade by MaxGhenis · Pull Request #291 · PolicyEngine/policyengine.py

MaxGhenis · 2026-04-19T01:50:24Z

Stacked on #290. Third in the v4 PR chain. This is the biggest behaviour-shaping change in the v4 set.

One-line summary

```python
import policyengine as pe
result = pe.us.calculate_household(
people=[{"age": 35, "employment_income": 60000}],
tax_unit={"filing_status": "SINGLE"},
year=2026,
reform={"gov.irs.deductions.standard.amount.SINGLE": 5000},
extra_variables=["adjusted_gross_income"],
)
print(result.tax_unit.income_tax, result.tax_unit.adjusted_gross_income)
```

Why

Optimise for coding-agent sessions with a ~20-file context budget. The old surface needed an agent to:

Import from `policyengine.tax_benefit_models.us` (7-token path)
Pick among three entry points (`calculate_household_impact`, `managed_microsimulation`, raw `Simulation`)
Build a `USHouseholdInput` wrapper
Construct a `Policy` with `ParameterValue` + `Parameter` imports
Read `result.tax_unit[0]["income_tax"]` out of a `list[dict[str, Any]]`

Each layer was an agent ambush. v4 collapses them.

What changed

Populate `policyengine/init.py` (previously empty) with canonical accessors: `pe.us`, `pe.uk`, `pe.Simulation`
New `pe.{us,uk}.calculate_household(**kwargs)` taking flat inputs; accepts `reform=` as a plain dict; accepts `extra_variables=[...]` as a flat list with library-side entity dispatch
New `HouseholdResult` / `EntityResult` with dot access, `.to_dict()`, `.write(path)`
Strict input validation with difflib close-match suggestions and paste-able `extra_variables=[...]` fix hints on both invalid inputs and dot-access failures
Removed (v4 breaking): `USHouseholdInput`, `UKHouseholdInput`, `USHouseholdOutput`, `UKHouseholdOutput`, `calculate_household_impact` (name was misleading — returned levels, not a delta)
Each country module now exposes `.model` alongside `us_latest` / `uk_latest` — either name works

Test plan

223 tests pass locally across `test_household_impact`, `test_us_reform_application`, `test_release_manifests`, `test_trace_tro`, `test_results`, `test_models`, `test_us_regions`, `test_uk_regions`, `test_region`, `test_manifest_version_mismatch`, `test_filtering`, `test_cache`, `test_scoping_strategy`, `test_constituency_impact`, `test_congressional_district_impact`, `test_local_authority_impact`
`ruff check .` + `ruff format --check .` clean
Manual: `import policyengine as pe; pe.us.calculate_household(...)` returns correct numbers (verified against known baselines)

Downstream migration

Single consumer on the 3.x surface — `policyengine-api-v2-alpha`. Migration: replace `calculate_household_impact(input, policy=p)` with `calculate_household(**input_as_kwargs, reform=reform_dict)`. Grep suggests ~25 LOC touched. Migration guide will accompany v4.0 tag.

Still queued

PR E1: Move `release_manifest.py` + `trace_tro.py` to `provenance/`; lazy h5py
PR F: Extract `MicrosimulationModelVersion` base (~600 LOC duplication)
v4.1: Unify `ProgramStatistics` / `ProgrammeStatistics`; collapse `Policy` / `Dynamic` classes; lazy `us_latest` / `uk_latest`

🤖 Generated with Claude Code

Removes three unambiguously dead code paths and moves plotly out of the core install so `import policyengine` doesn't pull the charting stack. Changes are behavior-preserving for every downstream repo surveyed (policyengine-api, policyengine-api-v2, policyengine-api-v2-alpha). 1. Delete `tax_benefit_models/{us,uk}.py` shim files. Python always resolves the `us/`/`uk/` package directory first, so the .py files were dead. Worse: both re-exported `general_policy_reform_analysis` which is not defined anywhere — `from policyengine.tax_benefit_models.us import general_policy_reform_analysis` raises ImportError at runtime. 2. Delete `_create_entity_output_model` + `PersonOutput` / `BenunitOutput` / `HouseholdEntityOutput` in uk/analysis.py. Built via pydantic.create_model at import time, referenced nowhere in the codebase. 3. Delete `policyengine.core.DatasetVersion`. One optional field on Dataset (never set by anything) and one core re-export. Nothing reads it downstream. 4. Move `plotly>=5.0.0` from base dependencies to a `[plotting]` optional extra. Only `policyengine.utils.plotting` uses plotly, and nothing in src/ imports that module — only `examples/` do. `plotting.py` now soft-imports with a clear install hint. Downstream impact: none. Surveyed policyengine-api (pinned to a pre-3.x API), policyengine-api-v2 (3.4.0), policyengine-api-v2-alpha (3.1.15); none of them import the deleted symbols. Tests: 216 passed locally across test_release_manifests, test_trace_tro, test_results, test_household_impact, test_models, test_us_regions, test_uk_regions, test_region, test_manifest_version_mismatch, test_filtering, test_cache, test_scoping_strategy. Deferred (bigger refactors, follow-up PRs): - filter_field/filter_value legacy path on Simulation (still wired through Region construction; needs migration) - calculate_household_impact → calculate_household rename (with deprecation shim) - Extract shared MicrosimulationModelVersion base (~600 LOC savings) - Move release_manifest + trace_tro to policyengine/provenance/ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

utils/__init__.py eagerly imported COLORS from plotting.py, which now raises ImportError when plotly isn't installed. Every smoke-import job on PR #288 failed because plotting.py blocked at module load. Move COLORS + FONT_* constants to a new plotly-free utils/design.py; plotting.py re-exports them for backward compatibility and adds them to __all__. utils/__init__.py now pulls COLORS from design rather than plotting. Confirmed locally that pip uninstall plotly still lets 'import policyengine' + 'from policyengine.utils import COLORS' + 'from policyengine.core.release_manifest import get_release_manifest' all work cleanly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Removes the two-way scoping contract in favour of the single ScopingStrategy path. The legacy fields were labeled "kept for backward compatibility" but became dead wiring the moment every caller started passing scoping_strategy explicitly. Changes: Simulation (core/simulation.py) - Drop filter_field, filter_value fields. - Drop _auto_construct_strategy model_validator that rewrote those fields into a RowFilterStrategy. Region (core/region.py) - Drop filter_field, filter_value, requires_filter fields. - Re-add requires_filter as a derived @Property: True iff scoping_strategy is not None. - Simplify get_dataset_regions / get_filter_regions to use dataset_path / scoping_strategy directly. Country models (tax_benefit_models/us/model.py, .../uk/model.py) - Delete the `elif simulation.filter_field and simulation.filter_value:` fallback branch in run() — unreachable because nobody sets those fields anymore. - Delete the _filter_dataset_by_household_variable private method — only called from the elif branch. The underlying utils.entity_utils.filter_dataset_by_household_variable helper stays (it's what RowFilterStrategy.apply uses). - Drop the now-unused import. Region factories (countries/{us,uk}/regions.py) - Stop setting requires_filter=True, filter_field=..., filter_value=... alongside scoping_strategy. The scoping_strategy is already the source of truth; the duplicate legacy triple was noise. Tests - test_filtering.py: drop TestSimulationFilterParameters (fields gone) and TestUSFilterDatasetByHouseholdVariable / TestUKFilterDatasetByHouseholdVariable (method gone; underlying behaviour still covered by test_scoping_strategy.py TestRowFilterStrategy). Keep the build_entity_relationships tests. - test_scoping_strategy.py: drop three legacy-auto-construct tests, replace one with a direct WeightReplacementStrategy round-trip check. - test_region.py, test_us_regions.py, test_uk_regions.py: assertions move from `region.filter_field == "X"` to `region.scoping_strategy.variable_name == "X"`. - fixtures/region_fixtures.py: factories use scoping_strategy=RowFilterStrategy(...) directly. 212 tests pass. Downstream impact: policyengine-api-v2-alpha uses the legacy fields in ~14 call sites (grep confirmed); they migrate to RowFilterStrategy in a paired PR. The v4 migration guide will call out this single search-and-replace. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Collapses the household-calculator journey into one obvious call: import policyengine as pe result = pe.us.calculate_household( people=[{"age": 35, "employment_income": 60000}], tax_unit={"filing_status": "SINGLE"}, year=2026, reform={"gov.irs.deductions.standard.amount.SINGLE": 5000}, extra_variables=["adjusted_gross_income"], ) print(result.tax_unit.income_tax, result.tax_unit.adjusted_gross_income) Design goal: a fresh coding session with no prior context and a 20-file browse budget reaches a correct number in two tool calls — one to `import policyengine as pe`, one for `pe.us.calculate_household(...)`. The old surface forced an agent to pick among three entry points (`calculate_household_impact`, `managed_microsimulation`, raw `Simulation`), build a pydantic `Input` wrapper, construct a `Policy` object with `ParameterValue`s, then dig into a `list[dict[str, Any]]` to get the number. Every one of those layers is gone. Changes: - Populate `policyengine/__init__.py` (previously empty) with `us`, `uk`, and `Simulation` accessors. - Add `tax_benefit_models/{us,uk}/household.py` with a kwargs-based `calculate_household` that builds a policyengine_us/uk Simulation with a situation dict and returns a dot-access HouseholdResult. - Add `tax_benefit_models/common/` with: - `compile_reform(dict) -> core reform dict` (scalar or `{effective_date: value}` shapes) - `dispatch_extra_variables(names)` — flat list, library looks up each name's entity via `variables_by_name` - `EntityResult(dict)` with `__getattr__` for dot access + paste-able-fix AttributeError on unknown names - `HouseholdResult(dict)` with `.to_dict()` / `.write(path)` - Add `utils/household_validation.py` that catches typo'd variable names in entity dicts with difflib close-match suggestions. - Remove `USHouseholdInput`, `UKHouseholdInput`, `USHouseholdOutput`, `UKHouseholdOutput`, and `calculate_household_impact` from both country modules (v4 breaking). - Each country __init__.py exposes `model` (the pinned `TaxBenefitModelVersion`) alongside the existing `us_latest` / `uk_latest` so agents can guess either name. - Rewrite `tests/test_household_impact.py` (19 tests) around the new API: kwargs inputs, dot-access results, flat `extra_variables`, error messages with paste-able fixes, JSON serialization. - Rewrite `tests/test_us_reform_application.py` around reform-dict inputs instead of `Policy(parameter_values=[...])`. - Update `tests/fixtures/us_reform_fixtures.py` to store household fixtures as plain kwargs dicts that splat into `calculate_household(**fixture)`. 223 tests pass locally. Downstream migration (policyengine-api-v2-alpha, the sole consumer of the 3.x surface): replace `calculate_household_impact(input, policy=p)` with `calculate_household(**input, reform=reform_dict)` — fixture script grep of call sites suggests ~25 LOC touched. The migration guide will show the before/after. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The review called out five ship-blockers. This commit fixes all five plus the three footguns: 1. Entity-aware validation. Placing `filing_status` on `people` instead of `tax_unit` now raises with the correct entity and the exact kwarg-swap to make: `tax_unit={'filing_status': <value>}`. 2. Realistic docstring examples. Top-of-module examples in us/household.py and uk/household.py are now lone-parent-with-child cases that exercise every grouping decision (state_code on household, is_tax_unit_dependent on person, would_claim_child_benefit on benunit), not single-adult-no-state cases that hide them. 3. Reform-path validation. `compile_reform` now takes `model_version` and raises with a difflib close-match suggestion on unknown parameter paths, matching the validator quality on variable names. 4. Scalar reform default date. Scalar reform values previously defaulted to `date.today().isoformat()` — a caller running a year=2026 sim mid-2026 got a mid-year effective date and a blended result. Now defaults to `{year}-01-01` (passed through from calculate_household). 5. Unexpected-kwargs catcher. UK `calculate_household(tax_unit=...)` and US `calculate_household(benunit=...)` now raise a TypeError that names the correct country-specific kwarg. Other unexpected kwargs get a difflib close-match from the allowed set. Also added: - `people=[]` check with an explicit error before the calc blows up inside policyengine_us. - Tests for all new error paths (`test__variable_on_wrong_entity`, `test__empty_people`, `test__unknown_reform_path`, `test__us_kwarg_on_uk`, `test__uk_kwarg_on_us`). 151 tests pass locally across the facade + reform + regression suites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

MaxGhenis · 2026-04-19T20:55:03Z

Superseded by #298 (consolidated v4 launch PR). All commits cherry-picked cleanly onto v4.

MaxGhenis and others added 5 commits April 18, 2026 19:39

MaxGhenis mentioned this pull request Apr 19, 2026

[v4] Provenance package + lazy h5py #292

Closed

3 tasks

MaxGhenis changed the base branch from v4-drop-filter-fields to main April 19, 2026 20:51

MaxGhenis mentioned this pull request Apr 19, 2026

Launch v4: agent-first surface, shared microsim base, dict reforms, refreshed docs #298

Merged

4 tasks

MaxGhenis closed this Apr 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[v4] Agent-first household-calculator facade#291

[v4] Agent-first household-calculator facade#291
MaxGhenis wants to merge 5 commits intomainfrom
v4-facade

MaxGhenis commented Apr 19, 2026

Uh oh!

MaxGhenis commented Apr 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

MaxGhenis commented Apr 19, 2026

One-line summary

Why

What changed

Test plan

Downstream migration

Still queued

Uh oh!

MaxGhenis commented Apr 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant