PolicyEngine · MaxGhenis · Apr 19, 2026
diff --git a/.github/workflows/pr_docs_changes.yaml b/.github/workflows/pr_docs_changes.yaml
@@ -18,10 +18,7 @@ jobs:
     steps:
       - name: Checkout repo
         uses: actions/checkout@v4
-      - uses: actions/setup-node@v4
-        with:
-          node-version: 18.x
-      - name: Install MyST
-        run: npm install -g mystmd
+      - name: Set up Quarto
+        uses: quarto-dev/quarto-actions/setup@v2
       - name: Test documentation builds
-        run: cd docs && myst build --html
+        run: quarto render docs
diff --git a/Makefile b/Makefile
@@ -1,15 +1,18 @@
-.PHONY: docs docs-serve
-
-MYSTMD_VERSION ?= 1.8.3
-MYST_CMD = npx --yes mystmd@$(MYSTMD_VERSION)
+.PHONY: docs docs-serve docs-generate-reference
 
 all: build-package
 
 docs:
-	cd docs && $(MYST_CMD) build --html
+	quarto render docs
 
 docs-serve:
-	cd docs && $(MYST_CMD) start
+	quarto preview docs
+
+# Regenerate the auto-generated variable / program reference under docs/reference/.
+# Run once per country model release and commit the refreshed pages alongside the code.
+docs-generate-reference:
+	python docs/_generator/build_reference.py --country us --out docs/reference/us
+	python docs/_generator/build_reference.py --country uk --out docs/reference/uk
 
 install:
 	uv pip install -e .[dev]

diff --git a/README.md b/README.md
@@ -4,26 +4,47 @@ A Python package for tax-benefit microsimulation analysis. Run policy simulation
 
 ## Quick start
 
+### Household calculator
+
 ```python
-from policyengine.core import Simulation
-from policyengine.tax_benefit_models.uk import PolicyEngineUKDataset, uk_latest
-from policyengine.outputs.aggregate import Aggregate, AggregateType
+import policyengine as pe
 
-# Load representative microdata
-dataset = PolicyEngineUKDataset(
-    name="FRS 2023-24",
-    filepath="./data/frs_2023_24_year_2026.h5",
+# UK: single adult earning £50,000
+uk = pe.uk.calculate_household(
+    people=[{"age": 35, "employment_income": 50_000}],
     year=2026,
 )
+print(uk.person[0].income_tax)                   # income tax
+print(uk.household.hbai_household_net_income)    # net income
+
+# US: single filer in California, with a reform
+us = pe.us.calculate_household(
+    people=[{"age": 35, "employment_income": 60_000}],
+    tax_unit={"filing_status": "SINGLE"},
+    household={"state_code": "CA"},
+    year=2026,
+    reform={"gov.irs.credits.ctc.amount.adult_dependent": 1000},
+)
+print(us.tax_unit.income_tax, us.household.household_net_income)
+```
 
-# Run simulation
-simulation = Simulation(
-    dataset=dataset,
-    tax_benefit_model_version=uk_latest,
+### Population analysis
+
+```python
+import policyengine as pe
+from policyengine.core import Simulation
+from policyengine.outputs.aggregate import Aggregate, AggregateType
+
+datasets = pe.uk.ensure_datasets(
+    datasets=["hf://policyengine/policyengine-uk-data/enhanced_frs_2023_24.h5"],
+    years=[2026],
+    data_folder="./data",
 )
+dataset = datasets["enhanced_frs_2023_24_2026"]
+
+simulation = Simulation(dataset=dataset, tax_benefit_model_version=pe.uk.model)
 simulation.run()
 
-# Calculate total universal credit spending
 agg = Aggregate(
     simulation=simulation,
     variable="universal_credit",
@@ -34,6 +55,9 @@ agg.run()
 print(f"Total UC spending: £{agg.result / 1e9:.1f}bn")
 ```
 
+For baseline-vs-reform comparisons, see `pe.uk.economic_impact_analysis`
+and its US counterpart.
+
 ## Documentation
 
 **Core concepts:**
@@ -179,12 +203,12 @@ dataset.load()
 Simulations apply tax-benefit models to datasets:
 
 ```python
+import policyengine as pe
 from policyengine.core import Simulation
-from policyengine.tax_benefit_models.uk import uk_latest
 
 simulation = Simulation(
     dataset=dataset,
-    tax_benefit_model_version=uk_latest,
+    tax_benefit_model_version=pe.uk.model,
 )
 simulation.run()
 
@@ -223,7 +247,7 @@ import datetime
 
 parameter = Parameter(
     name="gov.hmrc.income_tax.allowances.personal_allowance.amount",
-    tax_benefit_model_version=uk_latest,
+    tax_benefit_model_version=pe.uk.model,
     data_type=float,
 )
 
@@ -242,7 +266,7 @@ policy = Policy(
 # Run reform simulation
 reform_sim = Simulation(
     dataset=dataset,
-    tax_benefit_model_version=uk_latest,
+    tax_benefit_model_version=pe.uk.model,
     policy=policy,
 )
 reform_sim.run()

diff --git a/changelog.d/v4-base-extraction.changed.md b/changelog.d/v4-base-extraction.changed.md
@@ -0,0 +1 @@
+Extracted shared `MicrosimulationModelVersion` base class in `policyengine.tax_benefit_models.common`. Country subclasses now declare class-level metadata (`country_code`, `package_name`, `group_entities`) and implement a handful of thin hooks; `run()` stays per-country. Byte-level snapshot tests verify zero output drift.
diff --git a/changelog.d/v4-dict-reforms.added.md b/changelog.d/v4-dict-reforms.added.md
@@ -0,0 +1 @@
+``Simulation(policy={...})`` and ``Simulation(dynamic={...})`` now accept the same flat ``{"param.path": value}`` / ``{"param.path": {date: value}}`` dict that ``pe.{uk,us}.calculate_household(reform=...)`` accepts. Dicts are compiled to full ``Policy`` / ``Dynamic`` objects on construction using the ``tax_benefit_model_version`` for parameter-path validation and ``dataset.year`` for scalar effective-date defaulting. Removes the last place where population microsim required building ``Parameter`` / ``ParameterValue`` by hand.
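Reviewer sketch (not the library's code): the changelog entry above describes compiling a flat reform dict into dated values. A minimal illustration of that normalization step might look like the following. The names `normalize_reform` and `KNOWN_PARAMETERS` are hypothetical, the parameter registry is a hard-coded stand-in for validation against the model, and defaulting a scalar to 1 January of the dataset year is an assumption about what "scalar effective-date defaulting" means.

```python
import datetime
import difflib

# Hypothetical registry; the real library validates paths against the model version.
KNOWN_PARAMETERS = {
    "gov.irs.credits.ctc.amount.adult_dependent",
    "gov.hmrc.income_tax.allowances.personal_allowance.amount",
}


def normalize_reform(reform, year, known=KNOWN_PARAMETERS):
    """Expand {path: value} or {path: {date: value}} into {path: {date: value}}."""
    normalized = {}
    for path, spec in reform.items():
        if path not in known:
            # Offer the closest known path, if any, in the error message.
            hint = difflib.get_close_matches(path, sorted(known), n=1)
            suffix = f" Did you mean {hint[0]!r}?" if hint else ""
            raise ValueError(f"Unknown parameter {path!r}.{suffix}")
        if isinstance(spec, dict):
            # Dated form: accept ISO strings or date objects as keys.
            normalized[path] = {
                datetime.date.fromisoformat(str(d)): v for d, v in spec.items()
            }
        else:
            # Scalar form: assume the value takes effect on 1 January of `year`.
            normalized[path] = {datetime.date(year, 1, 1): spec}
    return normalized


scalar = normalize_reform(
    {"gov.irs.credits.ctc.amount.adult_dependent": 1000}, year=2026
)
dated = normalize_reform(
    {"gov.irs.credits.ctc.amount.adult_dependent": {"2027-01-01": 500}}, year=2026
)
```

Normalizing both shapes to one dated form up front means downstream code only handles a single structure, which is presumably why the same dict can be shared between the household calculator and `Simulation`.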
diff --git a/changelog.d/v4-docs-refresh.changed.md b/changelog.d/v4-docs-refresh.changed.md
@@ -0,0 +1 @@
+Documentation refreshed for the v4 agent-first surface. README, `core-concepts`, `economic-impact-analysis`, `country-models-{uk,us}`, `regions-and-scoping`, `examples`, and `dev` now lead with `pe.uk.*` / `pe.us.*` entry points and flat-kwarg `calculate_household` usage. Removed leftover docs for the dropped `filter_field`/`filter_value` simulation fields. `examples/household_impact_example.py` rewritten against the v4 API.
diff --git a/changelog.d/v4-facade.added.md b/changelog.d/v4-facade.added.md
@@ -0,0 +1,47 @@
+**BREAKING (v4):** Collapse the household-calculator surface into a
+single agent-friendly entry point, ``pe.us.calculate_household`` /
+``pe.uk.calculate_household``.
+
+New public API:
+
+- ``policyengine/__init__.py`` populated with canonical accessors:
+  ``pe.us``, ``pe.uk``, ``pe.Simulation`` (replacing the empty top-level
+  module). ``import policyengine as pe`` now gives you everything a
+  new coding session needs to reach in one line.
+- ``pe.us.calculate_household(**kwargs)`` and ``pe.uk.calculate_household``
+  take flat keyword arguments (``people``, per-entity overrides,
+  ``year``, ``reform``, ``extra_variables``) instead of a pydantic
+  input wrapper.
+- ``reform=`` accepts a plain dict: ``{parameter_path: value}`` or
+  ``{parameter_path: {effective_date: value}}``. Compiles internally.
+- Returns :class:`HouseholdResult` (new) with dot-access:
+  ``result.tax_unit.income_tax``, ``result.household.household_net_income``,
+  ``result.person[0].age``. Singleton entities are
+  :class:`EntityResult`; ``person`` is a list of them. ``to_dict()``
+  and ``write(path)`` serialize to JSON.
+- ``extra_variables=[...]`` is now a flat list; the library dispatches
+  each name to its entity by looking it up on the model.
+- Unknown variable names (in ``people``, entity overrides, or
+  ``extra_variables``) raise ``ValueError`` with a ``difflib`` close-match
+  suggestion and a paste-able fix hint.
+- Unknown dot-access on a result raises ``AttributeError`` with the
+  list of available variables plus the ``extra_variables=[...]`` call
+  that would surface the requested one.
+
+Removed (v4 breaking):
+
+- ``USHouseholdInput`` / ``UKHouseholdInput`` / ``USHouseholdOutput`` /
+  ``UKHouseholdOutput`` pydantic wrappers.
+- ``calculate_household_impact`` — the name was misleading (it
+  returned levels, not an impact vs. baseline). Reserved for a future
+  delta function.
+- The bare ``us_model`` / ``uk_model`` label-only singletons; each
+  country module now exposes ``.model`` pointing at the real
+  ``TaxBenefitModelVersion`` (kept ``us_latest`` / ``uk_latest``
+  aliases for compatibility with any in-flight downstream code).
+
+New internal module:
+
+- ``policyengine.tax_benefit_models.common`` — ``compile_reform``,
+  ``dispatch_extra_variables``, ``EntityResult``, ``HouseholdResult``
+  shared by both country implementations.
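Reviewer sketch (not the library's code): the dot-access behaviour described for `EntityResult` can be illustrated with `__getattr__`. The class below is a hypothetical stand-in that only borrows the name from the changelog; the error wording and the numeric values are placeholders, not real model output.

```python
class EntityResult:
    """Hypothetical stand-in: dot-access over a dict of computed variables."""

    def __init__(self, values):
        self._values = dict(values)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        values = self.__dict__.get("_values", {})
        if name in values:
            return values[name]
        available = ", ".join(sorted(values))
        raise AttributeError(
            f"{name!r} was not computed. Available: {available}. "
            f"To include it, pass extra_variables=[{name!r}]."
        )

    def to_dict(self):
        return dict(self._values)


# Placeholder values for illustration only.
tax_unit = EntityResult({"income_tax": 1.0, "eitc": 0.0})
print(tax_unit.income_tax)

try:
    tax_unit.state_income_tax
except AttributeError as exc:
    message = str(exc)
    print(message)
```

Reading `_values` through `self.__dict__` avoids infinite recursion if the attribute is requested before `__init__` has run, for example during copying or unpickling.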
diff --git a/changelog.d/v4-provenance-package.changed.md b/changelog.d/v4-provenance-package.changed.md
@@ -0,0 +1,24 @@
+**BREAKING (v4):** Separate the provenance layer from the core
+value-object layer.
+
+- ``policyengine/core/release_manifest.py`` → ``policyengine/provenance/manifest.py``
+- ``policyengine/core/trace_tro.py`` → ``policyengine/provenance/trace.py``
+- New ``policyengine.provenance`` package re-exports the public
+  surface (``get_release_manifest``, ``get_data_release_manifest``,
+  ``build_trace_tro_from_release_bundle``, ``build_simulation_trace_tro``,
+  ``serialize_trace_tro``, ``canonical_json_bytes``,
+  ``compute_trace_composition_fingerprint``, etc.).
+- ``policyengine.core`` no longer re-exports provenance types.
+  ``policyengine.core`` shrinks to value objects only (Dataset,
+  Variable, Parameter, Policy, Dynamic, Simulation, Region,
+  TaxBenefitModel, TaxBenefitModelVersion, scoping strategies).
+- ``import policyengine.core.scoping_strategy`` no longer imports
+  ``h5py`` at module load; the weight-replacement code path
+  lazy-imports it. ``import policyengine.outputs.constituency_impact``
+  and ``import policyengine.outputs.local_authority_impact`` do the
+  same.
+- Migration for downstream: replace
+  ``from policyengine.core import DataReleaseManifest`` (et al.)
+  with ``from policyengine.provenance import DataReleaseManifest``.
+  The country-module imports in internal code (``tax_benefit_models/{us,uk}/model.py``
+  and ``datasets.py``) are already updated.
diff --git a/changelog.d/variable-graph.added.md b/changelog.d/variable-graph.added.md
@@ -0,0 +1 @@
+Added ``policyengine.graph`` — a static-analysis-based variable dependency graph for PolicyEngine source trees. ``extract_from_path(path)`` walks a directory of Variable subclasses, parses formula-method bodies for ``entity("<var>", period)`` and ``add(entity, period, [list])`` references, and returns a ``VariableGraph``. Queries include ``deps(var)`` (direct dependencies), ``impact(var)`` (transitive downstream), and ``path(src, dst)`` (shortest dependency chain). No runtime dependency on country models — indexes ``policyengine-us`` (4,577 variables) in under a second.
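Reviewer sketch (not the `policyengine.graph` implementation): the static-analysis approach described above can be approximated with the standard-library `ast` module. The function names `extract_deps` and `impact` here are hypothetical and work on a plain dict rather than a `VariableGraph`. The matching rule is a heuristic: any two-argument call whose first argument is a string literal is treated as `entity("<var>", period)`.

```python
import ast
from collections import deque


def extract_deps(source):
    """Map each class with formula methods to the variable names it reads."""
    graph = {}
    for cls in ast.walk(ast.parse(source)):
        if not isinstance(cls, ast.ClassDef):
            continue
        deps = set()
        for fn in cls.body:
            if not (isinstance(fn, ast.FunctionDef) and fn.name.startswith("formula")):
                continue
            for call in ast.walk(fn):
                if not isinstance(call, ast.Call):
                    continue
                args = call.args
                # entity("<var>", period): two positional args, first a string literal.
                if (len(args) == 2 and isinstance(args[0], ast.Constant)
                        and isinstance(args[0].value, str)):
                    deps.add(args[0].value)
                # add(entity, period, ["a", "b"]): third arg is a list of literals.
                if (isinstance(call.func, ast.Name) and call.func.id == "add"
                        and len(args) >= 3 and isinstance(args[2], ast.List)):
                    deps.update(
                        e.value for e in args[2].elts
                        if isinstance(e, ast.Constant) and isinstance(e.value, str)
                    )
        graph[cls.name] = deps
    return graph


def impact(graph, var):
    """Return every variable that transitively depends on `var`."""
    reverse = {}
    for name, deps in graph.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(name)
    seen, queue = set(), deque([var])
    while queue:
        for child in reverse.get(queue.popleft(), ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


SAMPLE = '''
class net_income(Variable):
    def formula(household, period, parameters):
        return household("gross_income", period) - household("income_tax", period)

class gross_income(Variable):
    def formula(household, period, parameters):
        return add(household, period, ["employment_income", "pension_income"])

class income_tax(Variable):
    def formula(household, period, parameters):
        return household("gross_income", period) * 0.2
'''

graph = extract_deps(SAMPLE)
downstream = impact(graph, "employment_income")
```

Because the source is only parsed and never imported, the sample runs without `Variable` or `add` being defined, which is the property that lets this kind of indexer avoid a runtime dependency on the country models.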
diff --git a/docs/.gitignore b/docs/.gitignore
@@ -1,2 +1,5 @@
-# MyST build outputs
-_build
+# Quarto build outputs
+_site
+_freeze
+/.quarto/
+**/*.quarto_ipynb
diff --git a/docs/_generator/README.md b/docs/_generator/README.md
@@ -0,0 +1,52 @@
+# Reference generator prototype
+
+Auto-generates one Quarto page per variable in a country model, plus a program-coverage page, purely from metadata on the `Variable` classes and `programs.yaml`.
+
+## Run
+
+```bash
+# Full US reference (takes a couple of minutes — 4,686 variables)
+python docs/_generator/build_reference.py --country us --out docs/_generated/reference/us
+
+# Preview a filtered subset
+python docs/_generator/build_reference.py --country us --filter chip --out /tmp/ref-preview
+```
+
+Then render:
+
+```bash
+cd /tmp/ref-preview && quarto render
+```
+
+## What's generated from code alone
+
+Per variable:
+
+- Title and identifier
+- Metadata table: entity, value type, unit, period, `defined_for` gate
+- Documentation (docstring)
+- Components (`adds` / `subtracts` lists)
+- Statutory references (from `reference = ...`)
+- Source file path and line number
+
+Per program: a row in the generated program-coverage page pulled from `programs.yaml` (id, name, category, agency, status, coverage).
+
+Per directory (`gov/hhs/chip/`, `gov/usda/snap/`, etc.): a listing page using Quarto's built-in directory listing so the nav auto-organizes.
+
+## What still requires hand-authored prose
+
+- Methodology narrative (why the model is structured this way)
+- Tutorials (how to use `policyengine.py`)
+- Paper content (peer-reviewable argument)
+- Per-country deep dives that read as essays rather than reference lookups
+
+## Design
+
+The generator reads directly from the imported country model — no web API calls, no intermediate JSON. This keeps the build offline-reproducible and version-pinned to whatever country model the `policyengine.py` package has installed. Re-running the generator on release produces a snapshot of the reference docs tied to the exact published model versions.
+
+Extensions worth considering:
+
+1. Walk `parameters/` YAML tree and emit a page per parameter with its time series, breakdowns, and references.
+2. For each variable with a formula, surface the dependency graph (other variables / parameters it reads). `policyengine_core`'s `Variable.exhaustive_parameter_dependencies` gets partway there.
+3. For each calibration target (in `policyengine-us-data/storage/calibration_targets/*.csv`), emit a page describing source, aggregation level, freshness.
+4. Cross-link variables to the programs they contribute to via `programs.yaml`'s `variable:` field.
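Reviewer sketch (not the contents of `build_reference.py`): the per-variable page described in the generator README could be rendered from metadata roughly as below. `VariableMeta` and `render_variable_page` are hypothetical names, the field set is a simplified guess at what the generator reads, and the example variable is invented.

```python
from dataclasses import dataclass, field


@dataclass
class VariableMeta:
    """Hypothetical stand-in for metadata read off a Variable class."""
    name: str
    label: str
    entity: str
    value_type: str
    unit: str = ""
    definition_period: str = "year"
    documentation: str = ""
    adds: list = field(default_factory=list)
    reference: list = field(default_factory=list)


def render_variable_page(meta):
    """Return one .qmd page: front matter, metadata table, optional sections."""
    lines = [
        "---",
        f'title: "{meta.label}"',
        "---",
        "",
        f"Identifier: `{meta.name}`",
        "",
        "| Field | Value |",
        "|---|---|",
        f"| Entity | {meta.entity} |",
        f"| Value type | {meta.value_type} |",
        f"| Unit | {meta.unit or 'n/a'} |",
        f"| Period | {meta.definition_period} |",
        "",
    ]
    if meta.documentation:
        lines += ["## Documentation", "", meta.documentation, ""]
    if meta.adds:
        lines += ["## Components", ""] + [f"- `{n}`" for n in meta.adds] + [""]
    if meta.reference:
        lines += ["## References", ""] + [f"- <{u}>" for u in meta.reference] + [""]
    return "\n".join(lines)


# Invented example variable, for illustration only.
page = render_variable_page(
    VariableMeta(
        name="example_credit",
        label="Example credit",
        entity="tax_unit",
        value_type="float",
        unit="currency-USD",
        adds=["example_credit_base", "example_credit_supplement"],
    )
)
print(page)
```

Sections with no data are omitted rather than rendered empty, so a variable without `adds` or `reference` metadata produces a shorter page instead of blank headings.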