SpilloverDiD: Gardner GMM first-stage correction (Wave D) #462
Closes the documented Wave B/C "SEs biased downward by a few percent"
caveat across all three vcov_type paths (HC1, Conley, cluster=<col> for
CR1) for both event_study=False AND event_study=True. Point estimates
byte-identical to Wave B/C; SE values shift upward by one to a few percent.
Documented synthesis of Butts (2021) Section 3.1 (IF construction for
spillover-aware DiD) + Gardner (2022) Section 4 (two-stage GMM sandwich)
+ Conley (1999) (spatial kernel). No reference software combines all
three -- did2s implements GMM without Conley; conleyreg/acreg implement
Conley without two-stage correction. Wave D is the synthesis.
Unified IF outer-product formula:
psi_i = gamma_hat' * X_{10,i} * eps_{10,i} - X_{2,i} * eps_{2,i}
meat = Psi' @ K @ Psi
vcov = (X_2' X_2)^{-1} @ meat @ (X_2' X_2)^{-1}
Kernel K is path-dependent: identity (HC1), block-indicator (cluster),
spatial kernel (Conley). Finite-sample multipliers: n/(n-p) for HC1,
G/(G-1) * (n-1)/(n-p) for cluster CR1, none for Conley.
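A minimal NumPy sketch of the unified formula above, for the HC1 and cluster kernels only. The function names (kernel_matrix, finite_sample_multiplier, if_sandwich) and the dense n x n kernel are illustrative assumptions, not the helpers added in this PR; gamma_hat is taken as a given p_10 x p_2 matrix and the data are random.

```python
import numpy as np

def kernel_matrix(mode, n, cluster_ids=None):
    """Dense n x n kernel K (hypothetical helper; Conley path omitted)."""
    if mode == "hc1":
        return np.eye(n)
    if mode == "cluster":
        ids = np.asarray(cluster_ids)
        # K[i, j] = 1 when observations i and j share a cluster.
        return (ids[:, None] == ids[None, :]).astype(float)
    raise ValueError(f"unsupported mode: {mode!r}")

def finite_sample_multiplier(mode, n, p, n_clusters=None):
    """Multipliers as stated in the commit message; 1.0 means 'none'."""
    if mode == "hc1":
        return n / (n - p)
    if mode == "cluster":
        g = n_clusters
        return g / (g - 1) * (n - 1) / (n - p)
    return 1.0

def if_sandwich(X10, eps10, X2, eps2, gamma_hat, K, multiplier=1.0):
    # Row i of Psi is gamma_hat' X_{10,i} eps_{10,i} - X_{2,i} eps_{2,i}.
    Psi = (X10 * eps10[:, None]) @ gamma_hat - X2 * eps2[:, None]
    meat = multiplier * (Psi.T @ K @ Psi)
    bread = np.linalg.inv(X2.T @ X2)
    return bread @ meat @ bread

# Toy inputs (random, for shape checking only).
rng = np.random.default_rng(0)
n, p10, p2 = 12, 3, 2
X10, X2 = rng.normal(size=(n, p10)), rng.normal(size=(n, p2))
eps10, eps2 = rng.normal(size=n), rng.normal(size=n)
gamma_hat = rng.normal(size=(p10, p2))
clusters = np.repeat(np.arange(4), 3)

vcov_hc1 = if_sandwich(X10, eps10, X2, eps2, gamma_hat,
                       kernel_matrix("hc1", n),
                       finite_sample_multiplier("hc1", n, p2))
vcov_cr1 = if_sandwich(X10, eps10, X2, eps2, gamma_hat,
                       kernel_matrix("cluster", n, clusters),
                       finite_sample_multiplier("cluster", n, p2, n_clusters=4))
```

A dense K is only practical on small panels; it is used here so that the three paths differ in nothing but the kernel and the multiplier.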
Public surface:
- No new kwarg -- correction is unconditional.
- Wave D variance mode dispatch derives from the public contract:
vcov_type="conley" -> "conley"
cluster=<col> -> "cluster" (CR1)
otherwise -> "hc1"
- vcov_type="classical" now raises NotImplementedError upfront; the
Wave D synthesis has not been derived for the homoskedastic meat
structure (sigma_hat^2 * (X_10' X_10)). REGISTRY restrictions block
updated to list "classical" alongside "hc2"/"hc2_bm".
- Single-cluster sample raises ValueError ("at least 2 clusters") per
the standard CR1 rejection (mirrors linalg.py:1942).
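The public contract above can be sketched as a small dispatch function. This is an illustration under assumptions: resolve_variance_mode and check_cluster_count are hypothetical names, the precedence follows the order the mapping is listed in, and the error messages are placeholders rather than the library's actual wording.

```python
def resolve_variance_mode(vcov_type, cluster):
    """Map the public arguments to a variance mode (hypothetical helper)."""
    if vcov_type in ("classical", "hc2", "hc2_bm"):
        # Restricted types are rejected upfront, before any fitting work.
        raise NotImplementedError(
            f"vcov_type={vcov_type!r} is not supported; "
            "use 'hc1', 'conley', or cluster=<col>."
        )
    if vcov_type == "conley":
        return "conley"
    if cluster is not None:
        return "cluster"  # CR1
    return "hc1"

def check_cluster_count(cluster_ids):
    """Reject single-cluster samples, since G/(G-1) is undefined at G=1."""
    n_clusters = len(set(cluster_ids))
    if n_clusters < 2:
        raise ValueError(f"CR1 requires at least 2 clusters; got {n_clusters}.")
    return n_clusters
```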
Implementation:
- New module-level helper _compute_gmm_corrected_meat in two_stage.py
(existing _compute_gmm_variance method untouched).
- New module-level helper _build_butts_fe_design_csr in spillover.py.
- _compute_conley_meat factored out of _compute_conley_vcov in conley.py
so the same kernel-application code path handles both standard
sandwich (X * residuals) and Wave D IF outer product (Psi).
- SpilloverDiD.fit() bypasses solve_ols's vcov computation, builds the
fit-sample FE designs + eps_10/eps_2, calls the GMM helper, and
wraps with the bread sandwich. Rank-deficient column drops handled
via kept_col_mask + vcov re-inflation with NaN at dropped positions.
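The re-inflation step for rank-deficient drops might look like the sketch below. The function name reinflate_vcov is an assumption for illustration; only the kept_col_mask name and the NaN-at-dropped-positions behaviour come from the description above.

```python
import numpy as np

def reinflate_vcov(vcov_kept, kept_col_mask):
    """Expand a kept-column vcov to full size, NaN at dropped positions."""
    mask = np.asarray(kept_col_mask, dtype=bool)
    p_full = mask.size
    full = np.full((p_full, p_full), np.nan)
    # Scatter the kept block back into the rows/columns that survived.
    full[np.ix_(mask, mask)] = vcov_kept
    return full

# Column 1 of 3 was dropped as rank-deficient.
vcov_kept = np.array([[1.0, 0.5],
                      [0.5, 2.0]])
full = reinflate_vcov(vcov_kept, [True, False, True])
se = np.sqrt(np.diag(full))  # NaN propagates to the dropped coefficient's SE
```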
Wave B/C SE goldens re-pinned (_WAVE_B_GOLDEN_* -> _WAVE_D_GOLDEN_*);
pre-Wave-D references retained as _WAVE_B_UNCORRECTED_* commented
baselines for the directional inflation invariant.
New test classes:
- TestSpilloverDiDWaveDGmmCorrectedHc1Hand (hand-derived Psi on a
4-unit x 3-period over-identified panel; atol=1e-12)
- TestSpilloverDiDWaveDGmmCorrectedEventStudy (vcov shape on
event-study path)
- TestSpilloverDiDWaveDGmmCorrectedNanInferenceContract
(rank-deficient column propagation)
- TestSpilloverDiDWaveDGmmCorrectedValidatorWiring (Conley validator
fires from the new helper)
- TestSpilloverDiDWaveDGmmCorrectedFitIdempotence (clone + repeat-fit
bit-identity)
- TestSpilloverDiDWaveDPublicVarianceContract (end-to-end public
cluster=<col> CR1 routing, single-cluster rejection, classical
NotImplementedError)
All 223 existing spillover tests pass; full regression set across
spillover/two_stage/conley_vcov/estimators_vcov_type clean. Closes
the Gardner-GMM follow-up row in TODO.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Overall Assessment
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
When the kept stage-2 design saturates the sample (n_obs == effective_rank after rank-deficient drops), the HC1 multiplier n/(n-p) and the CR1 multiplier (n-1)/(n-p) are mathematically undefined. The original Wave D helper used `max(n - p_2, 1)` to clamp the denominator, which silently fabricated finite multipliers on underdetermined fits: `result.se` and per-coefficient SEs could stay finite even when only `t_stat`/`p_value`/CI were NaN-gated via `df_resid=0`. That violates the no-silent-failures contract.
Fix: when n - p_2 <= 0, return NaN meat with an explicit UserWarning so the SE surface NaN-propagates consistently with the inference fields. The Conley path is unaffected (no finite-sample multiplier on that branch by convention).
Tests: new `test_saturated_design_yields_nan_se_not_finite` in TestSpilloverDiDWaveDPublicVarianceContract exercises both the HC1 and CR1 paths on a synthetic n=p_2=4 Psi fixture; asserts NaN meat AND the saturation warning fires.
Docs: replaced "Wave B MVP limitations" section heading at docs/api/spillover.rst with "Restrictions and follow-ups" (the section now describes the shipped Wave D variance + remaining limitations); updated the SpilloverDiD vs TwoStageDiD comparison table to label the Conley and cluster rows "(Wave D GMM-corrected sandwich)" instead of "(via solve_ols at stage 2)".
All 224 spillover tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
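The saturation guard described in this commit can be sketched as follows. The function name scale_meat, its signature, and the warning text are assumptions for illustration; the behaviour (NaN meat plus a UserWarning when n - p_2 <= 0, Conley left unscaled) is taken from the description.

```python
import warnings
import numpy as np

def scale_meat(meat, mode, n, p, n_clusters=None):
    """Apply the finite-sample multiplier, NaN-ing out saturated designs."""
    if mode == "conley":
        return meat  # no finite-sample multiplier on this branch
    if n - p <= 0:
        # The old clamp max(n - p, 1) would have produced a finite multiplier here.
        warnings.warn(
            "Stage-2 design saturates the sample (n - p <= 0); "
            "finite-sample multiplier is undefined, returning NaN meat.",
            UserWarning,
            stacklevel=2,
        )
        return np.full_like(meat, np.nan, dtype=float)
    if mode == "hc1":
        return meat * (n / (n - p))
    if mode == "cluster":
        g = n_clusters
        return meat * (g / (g - 1) * (n - 1) / (n - p))
    raise ValueError(f"unsupported mode: {mode!r}")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    saturated = scale_meat(np.eye(4), "hc1", n=4, p=4)
```

On the n = p_2 = 4 case, the clamped denominator would have given a multiplier of 4/1 = 4 and a finite meat; the guard returns all-NaN instead, so the SEs match the already NaN-gated inference fields.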
|
🔁 AI review rerun (requested by @igerber) Head SHA:
Overall Assessment
✅ Looks good: this re-review resolves the prior P1/P0 issues, and I did not find any unmitigated diff-scoped P0 or P1 findings.
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…lassical restriction to api/spillover.rst
Pre-this-commit, the [Unreleased] block contained internally inconsistent SpilloverDiD bullets: the Wave B and Wave C entries (added first) each said the Gardner GMM correction was "NOT applied" / "planned follow-up", while the Wave D entry (added later in the same release) shipped that exact correction. Update the Wave B and Wave C variance subsections to note explicitly that they are "Superseded by the Wave D Gardner GMM first-stage correction in this same release", cross-referencing the Wave D bullet.
Also: docs/api/spillover.rst documented the HC2/HC2_BM NotImplementedError but omitted the new vcov_type="classical" restriction. Add a classical bullet alongside the HC2/HC2_BM one with the documented synthesis pointer to hc1 / conley / cluster.
Docs-only change; no source / test edits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
🔁 AI review rerun (requested by @igerber) Head SHA:
Overall Assessment
✅ Looks good: no unmitigated P0/P1 findings.
Executive Summary
Methodology
No findings. The Wave D variance path matches the documented Butts/Gardner/Conley synthesis, including the explicitly documented deviations for Conley finite-sample adjustment and …
Code Quality
No findings.
Performance
Maintainability
No findings.
Tech Debt
No findings. The prior Wave D follow-up row is correctly removed from …
Security
No findings.
Documentation/Tests
|
…ollow-up lists
P2 Performance: when callers pass pre-mask integer codes that have had interior values dropped via finite_mask (a supported warn-and-drop fit), the code arrays are sparse, e.g. unit_codes = [0, 1, 3, 4] with code 2 dropped. Building X_10 on the raw codes materialized an all-zero FE column at index 2, forcing sparse_factorized onto the dense lstsq/XtX_10.toarray() fallback unnecessarily (large-memory path on big panels). Fix: re-factorize via pd.factorize at the top of _build_butts_fe_design_csr to compact the code space to 0..n_unique-1 (no-op when codes are already contiguous). Mirrors the column-space convention of TwoStageDiD._build_fe_design.
P3 Docs: reconcile two stale follow-up lists where shipped Wave C/D items still appeared as "planned":
- CHANGELOG Wave B bullet listed event_study=True and Gardner GMM correction under "Deferred features (planned follow-ups)" alongside genuinely-pending items (covariates, survey_design, ring_method, d_bar selection, sparse path). Add "as of Wave B ship-time" qualifier + "Shipped in same release: ..." note pointing at the Wave C/D bullets above.
- llms-full.txt heading "Restrictions (Wave B MVP — planned follow-ups)" was misleading since items below included a mix of shipped (event_study, GMM correction) and pending features. Retitle to "Restrictions and Wave C/D status" and add the vcov_type=classical Wave D restriction bullet alongside the existing covariates / survey_design restrictions.
All 224 spillover tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
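The code-compaction step from the P2 fix can be illustrated with the example codes from this commit message. This is a small sketch, not the body of _build_butts_fe_design_csr: the repeated panel and the np.bincount column-occupancy check are assumptions added to make the all-zero column visible.

```python
import numpy as np
import pandas as pd

# Unit codes after a warn-and-drop fit removed every row for code 2.
unit_codes = np.array([0, 1, 3, 4, 0, 1, 3, 4])

# One FE column per raw code value: column 2 has no observations.
raw_counts = np.bincount(unit_codes, minlength=unit_codes.max() + 1)

# pd.factorize relabels to 0..n_unique-1 in order of first appearance.
compact_codes, uniques = pd.factorize(unit_codes)
compact_counts = np.bincount(compact_codes, minlength=len(uniques))

print(raw_counts)      # [2 2 0 2 2]
print(compact_counts)  # [2 2 2 2]
```

When the input codes are already contiguous and appear in increasing order, the relabelling returns them unchanged, which is the no-op case noted above.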
|
🔁 AI review rerun (requested by @igerber) Head SHA:
Overall Assessment
✅ Looks good: no unmitigated P0/P1 findings.
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
Summary
- … SpilloverDiD's stage-2 variance across all three vcov paths (HC1, Conley, CR1 via cluster=<col>) for both event_study=False AND event_study=True. Point estimates byte-identical to Wave B/C; SE values shift upward by one to a few percent.
- … TODO.md.
Methodology references (required if estimator / math changes)
- vcov_type="classical" raises NotImplementedError upfront: the Wave D synthesis has not been derived for the homoskedastic meat structure (sigma_hat^2 * (X_10' X_10)) and the heteroskedasticity-robust IF outer-product form is the canonical Gardner formulation. Users get a clear remediation pointer to "hc1" / "conley" / cluster=<col>. Documented in REGISTRY.md SpilloverDiD restrictions block and in the fit() docstring.
- … conleyreg / Wave B convention. HC1 uses n/(n-p) and CR1 uses G/(G-1) * (n-1)/(n-p).
Validation
- tests/test_spillover.py: new test classes TestSpilloverDiDWaveDGmmCorrectedHc1Hand (hand-derived Psi on a 4-unit × 3-period over-identified panel; atol=1e-12), TestSpilloverDiDWaveDGmmCorrectedEventStudy (vcov shape on event-study path), TestSpilloverDiDWaveDGmmCorrectedNanInferenceContract (rank-deficient column propagation), TestSpilloverDiDWaveDGmmCorrectedValidatorWiring (Conley validator fires from the new helper), TestSpilloverDiDWaveDGmmCorrectedFitIdempotence (clone + repeat-fit bit-identity), TestSpilloverDiDWaveDPublicVarianceContract (end-to-end public cluster=<col> CR1 routing, single-cluster rejection, classical NotImplementedError). Wave B/C SE goldens re-pinned at TestSpilloverDiDEventStudyBackwardCompat (_WAVE_B_GOLDEN_* → _WAVE_D_GOLDEN_*; pre-Wave-D references retained as _WAVE_B_UNCORRECTED_* for the directional inflation invariant test_wave_d_se_inflates_relative_to_wave_b_uncorrected).
- /tmp/wave_d_phase1_handderivation.py (developer-side, not committed) pins the closed-form Psi values that the test fixtures assert against.
Security / privacy
🤖 Generated with Claude Code