Add policyengine.graph and reference-generator prototype #306
Merged
Two related additions behind one new optional extra.
### policyengine.graph
New subpackage for querying PolicyEngine Variable dependency structure
by AST-walking source trees. No runtime dependency on country models —
the extractor is pure static analysis, so it works on any
`policyengine-us` / `policyengine-uk` checkout (or fork) regardless of
whether the jurisdiction is installed. Particularly useful in agent
sessions where the country packages may not be importable in the
sandbox.
Recognized reference patterns in v1:
- `<entity>("<var>", <period>)` calls on entity Names
(`person`, `tax_unit`, `spm_unit`, `household`, `family`,
`marital_unit`, `benunit`).
- `add(<entity>, <period>, ["v1", "v2", ...])` sum-helper list.
Limitations noted in module docstrings:
- Parameter references not yet captured (v2).
- Dynamic variable names skipped (low prevalence).
- `entity.sum("var")` method calls not yet recognized (v2).
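The two recognized patterns can be illustrated with a minimal sketch built on the standard-library `ast` module. This is not the code in `policyengine.graph`: the function name `extract_edges`, the rule that formula methods have names starting with `formula`, and the sample variable names are assumptions made for illustration.

```python
import ast

# Entity names copied from the recognized-patterns list above.
ENTITY_NAMES = {
    "person", "tax_unit", "spm_unit", "household",
    "family", "marital_unit", "benunit",
}


def _is_variable_class(node: ast.ClassDef) -> bool:
    """True if the class lists a base written literally as `Variable`."""
    return any(isinstance(b, ast.Name) and b.id == "Variable" for b in node.bases)


def _string_constant(node: ast.AST):
    """Return the value of a string literal node, or None for anything else."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    return None


def extract_edges(source: str) -> dict[str, set[str]]:
    """Map each Variable class to the variable names its formulas reference."""
    edges: dict[str, set[str]] = {}
    for cls in ast.walk(ast.parse(source)):
        if not (isinstance(cls, ast.ClassDef) and _is_variable_class(cls)):
            continue
        deps = edges.setdefault(cls.name, set())
        for method in cls.body:
            # Assumption: formula methods are named `formula` or `formula_<year>`.
            if not (isinstance(method, ast.FunctionDef)
                    and method.name.startswith("formula")):
                continue
            for call in ast.walk(method):
                # Only bare-name calls; `entity.sum(...)` is an Attribute call
                # and is skipped, matching the v1 limitation.
                if not (isinstance(call, ast.Call)
                        and isinstance(call.func, ast.Name)):
                    continue
                if call.func.id in ENTITY_NAMES and call.args:
                    # Pattern 1: <entity>("<var>", <period>)
                    name = _string_constant(call.args[0])
                    if name is not None:  # dynamic names are skipped
                        deps.add(name)
                elif (call.func.id == "add" and len(call.args) >= 3
                      and isinstance(call.args[2], (ast.List, ast.Tuple))):
                    # Pattern 2: add(<entity>, <period>, ["v1", "v2", ...])
                    for element in call.args[2].elts:
                        name = _string_constant(element)
                        if name is not None:
                            deps.add(name)
    return edges


# Illustrative input only; the source is parsed, never imported or executed.
SAMPLE = '''
class net_income(Variable):
    def formula(person, period, parameters):
        wages = person("employment_income", period)
        benefits = add(person, period, ["snap", "ssi"])
        name = "dynamic" + "_name"
        skipped = person(name, period)
        also_skipped = person.sum("household_income")
        return wages + benefits
'''

edges = extract_edges(SAMPLE)
```

In this sketch the dynamic name and the `person.sum(...)` call produce no edge, which mirrors the limitations listed above.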
### Reference generator prototype
`docs/_generator/build_reference.py` walks a country model's
`TaxBenefitSystem` and writes one `.qmd` page per variable grouped by
its parameter-tree path. Also emits a program-coverage page from
`programs.yaml`. The generator reads everything from the imported
country model — no web API calls, no cached JSON — which keeps the
build offline-reproducible and pinned to whatever country model
version the `policyengine` package has installed.
Run against a CHIP subset of `policyengine-us`, the generator emits
34 variable pages + 1 programs page + 56 directory indices in under
a second; Quarto compiles all of them cleanly.
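As a rough sketch of the page-writing step, the code below renders one `.qmd` file per variable into a directory derived from a grouping path. It does not reproduce `build_reference.py`: `VariableMeta`, its fields, `render_page`, `write_pages`, and the sample variable names and paths are hypothetical stand-ins for whatever the generator reads from the imported country model.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class VariableMeta:
    """Hypothetical stand-in for metadata read from a country model."""
    name: str
    label: str
    entity: str
    value_type: str
    tree_path: str  # hypothetical grouping key, e.g. "gov/hhs/chip"
    references: list[str] = field(default_factory=list)


def render_page(meta: VariableMeta) -> str:
    """Build the text of one Quarto page: YAML front matter plus a table."""
    lines = [
        "---",
        f'title: "{meta.label}"',  # assumes the label contains no double quotes
        "---",
        "",
        f"`{meta.name}`",
        "",
        "| Field | Value |",
        "|---|---|",
        f"| Entity | `{meta.entity}` |",
        f"| Value type | `{meta.value_type}` |",
    ]
    if meta.references:
        lines += ["", "## References", ""]
        lines += [f"- <{url}>" for url in meta.references]
    return "\n".join(lines) + "\n"


def write_pages(variables: list[VariableMeta], out_dir: Path) -> list[Path]:
    """Write one .qmd file per variable under out_dir/<tree_path>/."""
    written = []
    for meta in sorted(variables, key=lambda m: (m.tree_path, m.name)):
        page = out_dir / meta.tree_path / f"{meta.name}.qmd"
        page.parent.mkdir(parents=True, exist_ok=True)
        page.write_text(render_page(meta), encoding="utf-8")
        written.append(page)
    return written


# Usage with made-up metadata; nothing here touches a real country model.
with tempfile.TemporaryDirectory() as tmp:
    pages = write_pages(
        [
            VariableMeta("example_premium", "Example premium", "tax_unit",
                         "float", "gov/example/program"),
            VariableMeta("example_eligible", "Example eligibility", "person",
                         "bool", "gov/example/program",
                         references=["https://example.org/statute"]),
        ],
        Path(tmp),
    )
    relative_paths = [p.relative_to(tmp).as_posix() for p in pages]
    first_page = pages[0].read_text(encoding="utf-8")
```

Sorting before writing keeps the output order deterministic, which fits the stated goal of an offline-reproducible build.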
### Optional extra
`pip install policyengine[graph]` pulls in networkx; base install
stays lean. `policyengine.graph.graph` raises an informative
`ImportError` when networkx is missing, pointing at the extra.
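One common way to get this behaviour is to wrap the optional import and re-raise with the install hint. The helper below is a generic sketch, not the code in `policyengine.graph.graph`; the name `require_optional` and the message wording are assumptions.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional dependency, or fail with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature. "
            f"Install it with: pip install 'policyengine[{extra}]'"
        ) from exc


# Succeeds for a module that is present (stdlib `json` used as a stand-in).
json_module = require_optional("json", "graph")

# For a missing module, the error message carries the install command.
try:
    require_optional("module_that_does_not_exist_xyz", "graph")
    missing_message = None
except ImportError as error:
    missing_message = str(error)
```

Calling such a helper inside the graph module, rather than at package import time, is what keeps a plain `import policyengine` working without the extra.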
### Testing
9/9 graph extractor tests pass (`tests/test_graph/`). Tests use
synthetic source-tree fixtures; no dependency on a live country model.
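A synthetic source-tree fixture can be as small as a few files written into a temporary directory. The sketch below shows the idea with hypothetical helpers (`make_source_tree`, `find_variable_classes`); it is not taken from `tests/test_graph/`, and the real tests may be organised differently.

```python
import ast
import tempfile
import textwrap
from pathlib import Path


def make_source_tree(root: Path, files: dict[str, str]) -> None:
    """Write {relative path: source text} under root, creating directories."""
    for relative, source in files.items():
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(textwrap.dedent(source), encoding="utf-8")


def find_variable_classes(root: Path) -> dict[str, str]:
    """Map each `class X(Variable)` name to the file that defines it."""
    found = {}
    for path in sorted(root.rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.ClassDef) and any(
                isinstance(b, ast.Name) and b.id == "Variable" for b in node.bases
            ):
                found[node.name] = path.relative_to(root).as_posix()
    return found


def test_finds_variables_without_importing():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        make_source_tree(root, {
            # The import target does not exist; the file is parsed, never run.
            "variables/income/wages.py": """
                from not_installed_country_model import Variable

                class wages(Variable):
                    pass
            """,
            "variables/helpers.py": "class NotAVariable:\n    pass\n",
        })
        assert find_variable_classes(root) == {
            "wages": "variables/income/wages.py"
        }


test_finds_variables_without_importing()
```

Because the fixture imports a package that is not installed, the test only passes if the code under test parses files rather than importing them.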
Part 2 of 3 (A: toolchain swap → B: graph + generator → C: docs rewrite) per PR #301's review feedback.
### Scope
- `policyengine.graph` subpackage: static AST-based variable dependency extractor for PolicyEngine source trees.
- `docs/_generator/build_reference.py`: reference page generator using country-model metadata.
- `[graph]` optional extra so networkx doesn't bloat base install.

### policyengine.graph
Pure static analysis: walks a directory of `.py` files, picks out `class Foo(Variable):` definitions, and extracts edges from formula-method bodies. Recognized patterns:
- `<entity>("<var>", <period>)`: direct calls on `person`, `tax_unit`, `spm_unit`, `household`, `family`, `marital_unit`, `benunit`.
- `add(<entity>, <period>, ["v1", "v2", ...])`: sum-helper list.

Because it never imports user code, it works on any PolicyEngine source tree regardless of whether the jurisdiction is installed. Useful for refactor-impact analysis, CI pre-merge checks, docs generation, and agent-session introspection (where the country packages may not be importable in the sandbox).

### Limitations (v2 targets)
- `entity.sum("var")` method calls not recognized.

### Reference generator
`docs/_generator/build_reference.py` introspects a country model's `TaxBenefitSystem` and writes one `.qmd` page per variable: metadata (entity, value type, unit, period, `defined_for`), documentation, `adds`/`subtracts` decomposition, statutory references, source file path.

Also emits a program-coverage page from `programs.yaml`. Quarto's built-in directory listings handle the per-subtree index pages automatically.

Against a CHIP subset of `policyengine-us`: 34 variable pages + 1 programs page + 56 directory indices, under a second to generate; Quarto compiles all of them cleanly.

### Optional extra
`networkx` is only imported when the user explicitly imports `policyengine.graph`. Missing `networkx` raises a clear `ImportError` pointing at the install command.

### Testing
- Tests in `tests/test_graph/test_extractor.py` pass locally
- … `policyengine-us`

### Dependency on PR A
The generator writes `.qmd` files, so it assumes Quarto is the docs toolchain (PR #304). No technical blocker (the `.qmd` extension is just a label), but reviewing order is cleaner A → B.

### Test plan
- `pip install policyengine[graph]` pulls networkx; base install without the extra imports `policyengine` successfully (graph only errors at `from policyengine.graph import ...`)