9 changes: 3 additions & 6 deletions .github/workflows/pr_docs_changes.yaml
@@ -18,10 +18,7 @@ jobs:
     steps:
       - name: Checkout repo
         uses: actions/checkout@v4
-      - uses: actions/setup-node@v4
-        with:
-          node-version: 18.x
-      - name: Install MyST
-        run: npm install -g mystmd
+      - name: Set up Quarto
+        uses: quarto-dev/quarto-actions/setup@v2
       - name: Test documentation builds
-        run: cd docs && myst build --html
+        run: quarto render docs
15 changes: 9 additions & 6 deletions Makefile
@@ -1,15 +1,18 @@
-.PHONY: docs docs-serve
-
-MYSTMD_VERSION ?= 1.8.3
-MYST_CMD = npx --yes mystmd@$(MYSTMD_VERSION)
+.PHONY: docs docs-serve docs-generate-reference

 all: build-package

 docs:
-	cd docs && $(MYST_CMD) build --html
+	quarto render docs

 docs-serve:
-	cd docs && $(MYST_CMD) start
+	quarto preview docs
+
+# Regenerate the auto-generated variable / program reference under docs/reference/.
+# Run once per country model release; commits the refreshed pages alongside code.
+docs-generate-reference:
+	python docs/_generator/build_reference.py --country us --out docs/reference/us
+	python docs/_generator/build_reference.py --country uk --out docs/reference/uk

 install:
 	uv pip install -e .[dev]
56 changes: 40 additions & 16 deletions README.md
@@ -4,26 +4,47 @@ A Python package for tax-benefit microsimulation analysis. Run policy simulation

## Quick start

### Household calculator

```python
from policyengine.core import Simulation
from policyengine.tax_benefit_models.uk import PolicyEngineUKDataset, uk_latest
from policyengine.outputs.aggregate import Aggregate, AggregateType
import policyengine as pe

# Load representative microdata
dataset = PolicyEngineUKDataset(
name="FRS 2023-24",
filepath="./data/frs_2023_24_year_2026.h5",
# UK: single adult earning £50,000
uk = pe.uk.calculate_household(
people=[{"age": 35, "employment_income": 50_000}],
year=2026,
)
print(uk.person[0].income_tax) # income tax
print(uk.household.hbai_household_net_income) # net income

# US: single filer in California, with a reform
us = pe.us.calculate_household(
people=[{"age": 35, "employment_income": 60_000}],
tax_unit={"filing_status": "SINGLE"},
household={"state_code": "CA"},
year=2026,
reform={"gov.irs.credits.ctc.amount.adult_dependent": 1000},
)
print(us.tax_unit.income_tax, us.household.household_net_income)
```

# Run simulation
simulation = Simulation(
dataset=dataset,
tax_benefit_model_version=uk_latest,
### Population analysis

```python
import policyengine as pe
from policyengine.core import Simulation
from policyengine.outputs.aggregate import Aggregate, AggregateType

datasets = pe.uk.ensure_datasets(
datasets=["hf://policyengine/policyengine-uk-data/enhanced_frs_2023_24.h5"],
years=[2026],
data_folder="./data",
)
dataset = datasets["enhanced_frs_2023_24_2026"]

simulation = Simulation(dataset=dataset, tax_benefit_model_version=pe.uk.model)
simulation.run()

# Calculate total universal credit spending
agg = Aggregate(
simulation=simulation,
variable="universal_credit",
@@ -34,6 +55,9 @@ agg.run()
print(f"Total UC spending: £{agg.result / 1e9:.1f}bn")
```

For baseline-vs-reform comparisons, see `pe.uk.economic_impact_analysis`
and its US counterpart.

## Documentation

**Core concepts:**
@@ -179,12 +203,12 @@ dataset.load()
Simulations apply tax-benefit models to datasets:

```python
import policyengine as pe
from policyengine.core import Simulation
from policyengine.tax_benefit_models.uk import uk_latest

simulation = Simulation(
dataset=dataset,
tax_benefit_model_version=uk_latest,
tax_benefit_model_version=pe.uk.model,
)
simulation.run()

@@ -223,7 +247,7 @@ import datetime

parameter = Parameter(
name="gov.hmrc.income_tax.allowances.personal_allowance.amount",
tax_benefit_model_version=uk_latest,
tax_benefit_model_version=pe.uk.model,
data_type=float,
)

@@ -242,7 +266,7 @@ policy = Policy(
# Run reform simulation
reform_sim = Simulation(
dataset=dataset,
tax_benefit_model_version=uk_latest,
tax_benefit_model_version=pe.uk.model,
policy=policy,
)
reform_sim.run()
1 change: 1 addition & 0 deletions changelog.d/v4-base-extraction.changed.md
@@ -0,0 +1 @@
Extracted shared `MicrosimulationModelVersion` base class in `policyengine.tax_benefit_models.common`. Country subclasses now declare class-level metadata (`country_code`, `package_name`, `group_entities`) and implement a handful of thin hooks; `run()` stays per-country. Byte-level snapshot tests verify zero output drift.
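
The pattern described above (class-level metadata plus thin hooks, with `run()` left to each country) could look roughly like the sketch below. This is an illustration only, not the actual `MicrosimulationModelVersion`: the class names, the `entity_names` helper, the metadata check, and the example values are all assumptions made for the sketch.

```python
from abc import ABC, abstractmethod

# Metadata names taken from the changelog entry above.
REQUIRED_METADATA = ("country_code", "package_name", "group_entities")


class ModelVersionBaseSketch(ABC):
    """Hypothetical stand-in for a shared base class."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Fail at class-definition time if a subclass forgets its metadata.
        missing = [name for name in REQUIRED_METADATA if not hasattr(cls, name)]
        if missing:
            raise TypeError(f"{cls.__name__} must define: {', '.join(missing)}")

    def entity_names(self):
        # Shared logic driven purely by the class-level metadata.
        return ["person", *self.group_entities]

    @abstractmethod
    def run(self, simulation):
        """Stays per-country: each subclass supplies its own implementation."""


class UKSketch(ModelVersionBaseSketch):
    # Illustrative values only.
    country_code = "uk"
    package_name = "policyengine-uk"
    group_entities = ("benunit", "household")

    def run(self, simulation):
        return f"ran {self.country_code} on {simulation}"


print(UKSketch().entity_names())
```

Checking the metadata in `__init_subclass__` is one possible design choice; it surfaces a missing attribute when the subclass is defined rather than when a simulation first runs.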
1 change: 1 addition & 0 deletions changelog.d/v4-dict-reforms.added.md
@@ -0,0 +1 @@
``Simulation(policy={...})`` and ``Simulation(dynamic={...})`` now accept the same flat ``{"param.path": value}`` / ``{"param.path": {date: value}}`` dict that ``pe.{uk,us}.calculate_household(reform=...)`` accepts. Dicts are compiled to full ``Policy`` / ``Dynamic`` objects on construction using the ``tax_benefit_model_version`` for parameter-path validation and ``dataset.year`` for scalar effective-date defaulting. Removes the last place where population microsim required building ``Parameter`` / ``ParameterValue`` by hand.
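
To make the two accepted dict shapes concrete, here is a minimal sketch of how a flat reform dict might be normalized into date-keyed values before compilation. ``normalize_reform`` is a hypothetical helper, not the library's ``compile_reform``, and defaulting scalars to 1 January of the dataset year is an assumption; the entry above only says ``dataset.year`` is used.

```python
import datetime
from typing import Any, Mapping


def _to_date(key):
    # Accept either date objects or ISO "YYYY-MM-DD" strings as keys.
    if isinstance(key, datetime.date):
        return key
    return datetime.date.fromisoformat(str(key))


def normalize_reform(reform: Mapping[str, Any], default_year: int):
    """Hypothetical helper: map every parameter path to {date: value}."""
    normalized = {}
    for path, value in reform.items():
        if isinstance(value, Mapping):
            # Shape 2: {"param.path": {date: value}}
            normalized[path] = {_to_date(k): v for k, v in value.items()}
        else:
            # Shape 1: {"param.path": value}; assumed to start on 1 January.
            normalized[path] = {datetime.date(default_year, 1, 1): value}
    return normalized


example = normalize_reform(
    {
        "gov.irs.credits.ctc.amount.adult_dependent": 1000,
        "gov.hmrc.income_tax.allowances.personal_allowance.amount": {
            "2027-04-06": 15_000
        },
    },
    default_year=2026,
)
print(example)
```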
1 change: 1 addition & 0 deletions changelog.d/v4-docs-refresh.changed.md
@@ -0,0 +1 @@
Documentation refreshed for the v4 agent-first surface. README, `core-concepts`, `economic-impact-analysis`, `country-models-{uk,us}`, `regions-and-scoping`, `examples`, and `dev` now lead with `pe.uk.*` / `pe.us.*` entry points and flat-kwarg `calculate_household` usage. Removed leftover docs for the dropped `filter_field`/`filter_value` simulation fields. `examples/household_impact_example.py` rewritten against the v4 API.
47 changes: 47 additions & 0 deletions changelog.d/v4-facade.added.md
@@ -0,0 +1,47 @@
**BREAKING (v4):** Collapse the household-calculator surface into a
single agent-friendly entry point, ``pe.us.calculate_household`` /
``pe.uk.calculate_household``.

New public API:

- ``policyengine/__init__.py`` populated with canonical accessors:
``pe.us``, ``pe.uk``, ``pe.Simulation`` (replacing the empty top-level
module). ``import policyengine as pe`` now gives you everything a
new coding session needs to reach in one line.
- ``pe.us.calculate_household(**kwargs)`` and ``pe.uk.calculate_household``
take flat keyword arguments (``people``, per-entity overrides,
``year``, ``reform``, ``extra_variables``) instead of a pydantic
input wrapper.
- ``reform=`` accepts a plain dict: ``{parameter_path: value}`` or
``{parameter_path: {effective_date: value}}``. Compiles internally.
- Returns :class:`HouseholdResult` (new) with dot-access:
``result.tax_unit.income_tax``, ``result.household.household_net_income``,
``result.person[0].age``. Singleton entities are
:class:`EntityResult`; ``person`` is a list of them. ``to_dict()``
and ``write(path)`` serialize to JSON.
- ``extra_variables=[...]`` is now a flat list; the library dispatches
each name to its entity by looking it up on the model.
- Unknown variable names (in ``people``, entity overrides, or
``extra_variables``) raise ``ValueError`` with a ``difflib`` close-match
suggestion and a paste-able fix hint.
- Unknown dot-access on a result raises ``AttributeError`` with the
list of available variables plus the ``extra_variables=[...]`` call
that would surface the requested one.
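
The close-match behaviour for unknown variable names can be sketched with the standard library's ``difflib.get_close_matches``. The function names, the message wording, and the sample variable list below are assumptions for illustration, not the library's actual implementation.

```python
import difflib

# Hypothetical sample of names the model knows about.
KNOWN_VARIABLES = ["age", "employment_income", "income_tax", "state_code"]


def unknown_variable_message(name, known_variables):
    """Build an error message with at most one close-match suggestion."""
    message = f"Unknown variable '{name}'."
    matches = difflib.get_close_matches(name, known_variables, n=1)
    if matches:
        message += f" Did you mean '{matches[0]}'?"
    return message


def validate_variable_names(names, known_variables):
    """Raise ValueError on the first name the model does not define."""
    known = set(known_variables)
    for name in names:
        if name not in known:
            raise ValueError(unknown_variable_message(name, known_variables))


print(unknown_variable_message("employment_incom", KNOWN_VARIABLES))
```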

Removed (v4 breaking):

- ``USHouseholdInput`` / ``UKHouseholdInput`` / ``USHouseholdOutput`` /
``UKHouseholdOutput`` pydantic wrappers.
- ``calculate_household_impact`` — the name was misleading (it
returned levels, not an impact vs. baseline). Reserved for a future
delta function.
- The bare ``us_model`` / ``uk_model`` label-only singletons; each
country module now exposes ``.model`` pointing at the real
``TaxBenefitModelVersion`` (kept ``us_latest`` / ``uk_latest``
aliases for compatibility with any in-flight downstream code).

New internal module:

- ``policyengine.tax_benefit_models.common`` — ``compile_reform``,
``dispatch_extra_variables``, ``EntityResult``, ``HouseholdResult``
shared by both country implementations.
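
As a rough picture of the dot-access behaviour described for ``EntityResult``, the sketch below wraps a dict and raises a helpful ``AttributeError`` for unknown names. ``EntityResultSketch`` is a hypothetical stand-in; the real class, its constructor, and its exact error text are not shown in this change and may differ.

```python
class EntityResultSketch:
    """Hypothetical stand-in: dot access over a dict of variable values."""

    def __init__(self, values):
        self._values = dict(values)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        values = self.__dict__.get("_values", {})
        if name in values:
            return values[name]
        available = ", ".join(sorted(values))
        raise AttributeError(
            f"'{name}' is not in this result. Available: {available}. "
            f"To compute it, pass extra_variables=['{name}']."
        )

    def to_dict(self):
        # Return a copy so callers cannot mutate the result in place.
        return dict(self._values)


# Made-up values, for illustration only.
person = EntityResultSketch({"age": 35, "employment_income": 50_000})
print(person.age)
```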
24 changes: 24 additions & 0 deletions changelog.d/v4-provenance-package.changed.md
@@ -0,0 +1,24 @@
**BREAKING (v4):** Separate the provenance layer from the core
value-object layer.

- ``policyengine/core/release_manifest.py`` → ``policyengine/provenance/manifest.py``
- ``policyengine/core/trace_tro.py`` → ``policyengine/provenance/trace.py``
- New ``policyengine.provenance`` package re-exports the public
surface (``get_release_manifest``, ``get_data_release_manifest``,
``build_trace_tro_from_release_bundle``, ``build_simulation_trace_tro``,
``serialize_trace_tro``, ``canonical_json_bytes``,
``compute_trace_composition_fingerprint``, etc.).
- ``policyengine.core`` no longer re-exports provenance types.
``policyengine.core`` shrinks to value objects only (Dataset,
Variable, Parameter, Policy, Dynamic, Simulation, Region,
TaxBenefitModel, TaxBenefitModelVersion, scoping strategies).
- ``import policyengine.core.scoping_strategy`` no longer imports
``h5py`` at module load; the weight-replacement code path
lazy-imports it. ``import policyengine.outputs.constituency_impact``
and ``import policyengine.outputs.local_authority_impact`` do the
same.
- Migration for downstream: replace
``from policyengine.core import DataReleaseManifest`` (et al.)
with ``from policyengine.provenance import DataReleaseManifest``.
The country-module imports in internal code (``tax_benefit_models/{us,uk}/model.py``
and ``datasets.py``) are already updated.
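
The lazy-import change follows a common pattern: move the heavy import from module level into the function that needs it. A minimal sketch, assuming a hypothetical ``read_weights`` helper and dataset name (neither is taken from the library):

```python
def read_weights(path, dataset_name="household_weight"):
    """Hypothetical helper: read one array from an HDF5 file."""
    # Deferred import: h5py is loaded only when this code path runs,
    # so importing the enclosing module does not require it up front.
    import h5py

    with h5py.File(path, "r") as handle:
        return handle[dataset_name][:]
```

Importing a module that defines ``read_weights`` this way stays cheap; the cost of loading ``h5py`` is paid on the first call instead.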
1 change: 1 addition & 0 deletions changelog.d/variable-graph.added.md
@@ -0,0 +1 @@
Added ``policyengine.graph`` — a static-analysis-based variable dependency graph for PolicyEngine source trees. ``extract_from_path(path)`` walks a directory of Variable subclasses, parses formula-method bodies for ``entity("<var>", period)`` and ``add(entity, period, [list])`` references, and returns a ``VariableGraph``. Queries include ``deps(var)`` (direct dependencies), ``impact(var)`` (transitive downstream), and ``path(src, dst)`` (shortest dependency chain). No runtime dependency on country models — indexes ``policyengine-us`` (4,577 variables) in under a second.
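
The static-analysis approach can be illustrated with Python's ``ast`` module. The sketch below is a simplified heuristic, not the ``policyengine.graph`` implementation: the function names, the plain-dict graph, and the sample source are assumptions, and it handles only the two call shapes named above.

```python
import ast

SAMPLE = '''
class income_tax(Variable):
    def formula(tax_unit, period, parameters):
        income = tax_unit("taxable_income", period)
        credits = add(tax_unit, period, ["ctc", "eitc"])
        return income - credits

class taxable_income(Variable):
    def formula(tax_unit, period, parameters):
        return tax_unit("agi", period) - tax_unit("deductions", period)
'''


def _is_str(node):
    return isinstance(node, ast.Constant) and isinstance(node.value, str)


def _call_references(call):
    """Variable names referenced by one call node (heuristic)."""
    if not isinstance(call.func, ast.Name):
        return set()
    args = call.args
    # add(entity, period, ["a", "b"]): names come from the list literal.
    if call.func.id == "add":
        if len(args) >= 3 and isinstance(args[2], ast.List):
            return {e.value for e in args[2].elts if _is_str(e)}
        return set()
    # entity("<var>", period): the first argument is a string literal.
    if len(args) >= 2 and _is_str(args[0]):
        return {args[0].value}
    return set()


def extract_references(source):
    """Map each class name to the variables its formula methods read."""
    graph = {}
    for cls in ast.parse(source).body:
        if not isinstance(cls, ast.ClassDef):
            continue
        refs = set()
        for fn in cls.body:
            if isinstance(fn, ast.FunctionDef) and fn.name.startswith("formula"):
                for node in ast.walk(fn):
                    if isinstance(node, ast.Call):
                        refs |= _call_references(node)
        graph[cls.name] = refs
    return graph


def impact(graph, var):
    """Transitive downstream: every variable that depends on `var`."""
    affected, frontier = set(), [var]
    while frontier:
        current = frontier.pop()
        for name, deps in graph.items():
            if current in deps and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


graph = extract_references(SAMPLE)
print(sorted(graph["income_tax"]))
print(sorted(impact(graph, "agi")))
```

Because the source is only parsed and never imported, a sketch like this runs without the country model installed, which matches the "no runtime dependency" point above.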
7 changes: 5 additions & 2 deletions docs/.gitignore
@@ -1,2 +1,5 @@
# MyST build outputs
_build
# Quarto build outputs
_site
_freeze
/.quarto/
**/*.quarto_ipynb
52 changes: 52 additions & 0 deletions docs/_generator/README.md
@@ -0,0 +1,52 @@
# Reference generator prototype

Auto-generates one Quarto page per variable in a country model, plus a program-coverage page, purely from metadata on the `Variable` classes and `programs.yaml`.

## Run

```bash
# Full US reference (takes a couple of minutes — 4,686 variables)
python docs/_generator/build_reference.py --country us --out docs/_generated/reference/us

# Preview a filtered subset
python docs/_generator/build_reference.py --country us --filter chip --out /tmp/ref-preview
```

Then render:

```bash
cd /tmp/ref-preview && quarto render
```

## What's generated from code alone

Per variable:

- Title and identifier
- Metadata table: entity, value type, unit, period, `defined_for` gate
- Documentation (docstring)
- Components (`adds` / `subtracts` lists)
- Statutory references (from `reference = ...`)
- Source file path and line number

Per program: a row in the generated program-coverage page pulled from `programs.yaml` (id, name, category, agency, status, coverage).

Per directory (`gov/hhs/chip/`, `gov/usda/snap/`, etc.): a listing page using Quarto's built-in directory listing so the nav auto-organizes.
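
As a rough illustration of the per-variable output, the sketch below renders a Quarto-style markdown page (YAML front matter plus a metadata table) from a small metadata record. It is not the code in `build_reference.py`; `VariableMeta`, `render_variable_page`, the page layout, and the sample values are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VariableMeta:
    """Hypothetical record of the metadata listed above."""
    name: str
    label: str
    entity: str
    value_type: str
    unit: Optional[str] = None
    period: str = "year"
    documentation: str = ""
    adds: List[str] = field(default_factory=list)
    references: List[str] = field(default_factory=list)


def render_variable_page(meta: VariableMeta) -> str:
    """Return the text of one .qmd page for a variable."""
    lines = [
        "---",
        f'title: "{meta.label}"',
        "---",
        "",
        f"Identifier: `{meta.name}`",
        "",
        "| Field | Value |",
        "|---|---|",
        f"| Entity | {meta.entity} |",
        f"| Value type | {meta.value_type} |",
        f"| Unit | {meta.unit or 'n/a'} |",
        f"| Period | {meta.period} |",
    ]
    # Optional sections are emitted only when the metadata is present.
    if meta.documentation:
        lines += ["", "## Documentation", "", meta.documentation]
    if meta.adds:
        lines += ["", "## Components", ""] + [f"- `{n}`" for n in meta.adds]
    if meta.references:
        lines += ["", "## References", ""] + [f"- <{url}>" for url in meta.references]
    return "\n".join(lines) + "\n"


# Made-up example values, for illustration only.
page = render_variable_page(
    VariableMeta(
        name="example_benefit",
        label="Example benefit",
        entity="person",
        value_type="float",
        unit="currency-USD",
        adds=["example_component_a", "example_component_b"],
    )
)
print(page)
```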

## What still requires hand-authored prose

- Methodology narrative (why the model is structured this way)
- Tutorials (how to use `policyengine.py`)
- Paper content (peer-reviewable argument)
- Per-country deep dives that read as essays rather than reference lookups

## Design

The generator reads directly from the imported country model — no web API calls, no intermediate JSON. This keeps the build offline-reproducible and version-pinned to whatever country model the `policyengine.py` package has installed. Re-running the generator on release produces a snapshot of the reference docs tied to the exact published model versions.

Extensions worth considering:

1. Walk `parameters/` YAML tree and emit a page per parameter with its time series, breakdowns, and references.
2. For each variable with a formula, surface the dependency graph (other variables / parameters it reads). `policyengine_core`'s `Variable.exhaustive_parameter_dependencies` gets partway there.
3. For each calibration target (in `policyengine-us-data/storage/calibration_targets/*.csv`), emit a page describing source, aggregation level, freshness.
4. Cross-link variables to the programs they contribute to via `programs.yaml`'s `variable:` field.