diff --git a/changelog.d/correct-sipp-licensing-language.fixed.md b/changelog.d/correct-sipp-licensing-language.fixed.md
new file mode 100644
index 000000000..5859e0020
--- /dev/null
+++ b/changelog.d/correct-sipp-licensing-language.fixed.md
@@ -0,0 +1 @@
+Clarified SIPP licensing language in `policyengine_us_data/datasets/sipp/README.md`: SIPP public-use data is unrestricted (no per-user license, agreement, or registration). Of the six upstream microdata sources the Enhanced CPS pipeline ingests (CPS, ACS, SCF, ORG, SIPP, IRS-PUF), only IRS-PUF has a genuine access restriction. Fixes #808.
diff --git a/policyengine_us_data/datasets/sipp/README.md b/policyengine_us_data/datasets/sipp/README.md
index 39ba48825..c30316ae7 100644
--- a/policyengine_us_data/datasets/sipp/README.md
+++ b/policyengine_us_data/datasets/sipp/README.md
@@ -39,3 +39,22 @@
 The raw SIPP CSVs (`pu2023.csv` and the slim variant `pu2023_slim.csv`) are mirrored
 on the `PolicyEngine/policyengine-us-data` HuggingFace model repo and downloaded on
 demand when a training run is needed. They are not vendored in this Git repository.
+
+## Licensing
+
+SIPP public-use files are, as the name implies, **public-use data** — no
+per-user license, data-use agreement, or registration is required to
+download or redistribute them. We mirror them on our HuggingFace model
+repo purely as a caching convenience (Census's own hosting is slow and
+occasionally unavailable), not to work around any access restriction.
+
+This matters because PolicyEngine's enhanced CPS pipeline ingests several
+different upstream microdata sources, and only **one** of them —
+**IRS Public Use File (PUF)** — has any genuine access restriction. PUF
+requires agreeing to IRS's terms of use before download, even though the
+file is itself intended for public release. CPS, ACS, SCF, ORG, and SIPP
+are all unrestricted public-use. If you are writing about the pipeline's
+licensing posture (for a paper, replication packet, or TRACE TRO), only
+IRS-PUF should appear in the restricted column.
+
+See issue #808 for the background on this correction.