Skip to content

SpilloverDiD: Gardner GMM first-stage correction (Wave D)#462

Merged
igerber merged 4 commits into
mainfrom
spillover-conley-wave-d-gmm-correction
May 17, 2026
Merged

SpilloverDiD: Gardner GMM first-stage correction (Wave D)#462
igerber merged 4 commits into
mainfrom
spillover-conley-wave-d-gmm-correction

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented May 17, 2026

Summary

  • Adds the Gardner (2022) GMM first-stage uncertainty correction to SpilloverDiD's stage-2 variance across all three vcov paths (HC1, Conley, CR1 via cluster=<col>) for both event_study=False AND event_study=True. Point estimates byte-identical to Wave B/C; SE values shift upward by 1-few percent.
  • Documented synthesis of Butts (2021) §3.1 (IF construction for spillover-aware DiD) + Gardner (2022) §4 (two-stage GMM sandwich) + Conley (1999) (spatial kernel). No reference software combines all three.
  • Closes the documented Wave B/C "SEs biased downward by a few percent" caveat. Closes the Gardner-GMM follow-up row in TODO.md.

Methodology references (required if estimator / math changes)

  • Method name(s): SpilloverDiD stage-2 variance — Gardner GMM first-stage correction (Wave D synthesis)
  • Paper / source link(s):
    • Butts, K. (2023). Difference-in-Differences with Spatial Spillovers. arXiv:2105.03737v3 (§3.1)
    • Gardner, J. (2022). Two-stage differences in differences. arXiv:2207.05943 (§4)
    • Conley, T. G. (1999). GMM Estimation with Cross-Sectional Dependence. JoE 92(1), 1-45
  • Any intentional deviations from the source (and why):
    • vcov_type="classical" raises NotImplementedError upfront — the Wave D synthesis has not been derived for the homoskedastic meat structure (sigma_hat^2 * (X_10' X_10)) and the heteroskedasticity-robust IF outer-product form is the canonical Gardner formulation. Users get a clear remediation pointer to "hc1" / "conley" / cluster=<col>. Documented in REGISTRY.md SpilloverDiD restrictions block and in the fit() docstring.
    • No finite-sample multiplier on the Conley path — preserves conleyreg / Wave B convention. HC1 uses n/(n-p) and CR1 uses G/(G-1) * (n-1)/(n-p).

Validation

  • Tests added/updated: tests/test_spillover.py — new test classes TestSpilloverDiDWaveDGmmCorrectedHc1Hand (hand-derived Psi on a 4-unit × 3-period over-identified panel; atol=1e-12), TestSpilloverDiDWaveDGmmCorrectedEventStudy (vcov shape on event-study path), TestSpilloverDiDWaveDGmmCorrectedNanInferenceContract (rank-deficient column propagation), TestSpilloverDiDWaveDGmmCorrectedValidatorWiring (Conley validator fires from the new helper), TestSpilloverDiDWaveDGmmCorrectedFitIdempotence (clone + repeat-fit bit-identity), TestSpilloverDiDWaveDPublicVarianceContract (end-to-end public cluster=<col> CR1 routing, single-cluster rejection, classical NotImplementedError). Wave B/C SE goldens re-pinned at TestSpilloverDiDEventStudyBackwardCompat (_WAVE_B_GOLDEN_*_WAVE_D_GOLDEN_*; pre-Wave-D references retained as _WAVE_B_UNCORRECTED_* for the directional inflation invariant test_wave_d_se_inflates_relative_to_wave_b_uncorrected).
  • Backtest / simulation / notebook evidence (if applicable): N/A — no tutorial changes. Hand-derivation worksheet at /tmp/wave_d_phase1_handderivation.py (developer-side, not committed) pins the closed-form Psi values that the test fixtures assert against.

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

🤖 Generated with Claude Code

Closes the documented Wave B/C "SEs biased downward by a few percent"
caveat across all three vcov_type paths (HC1, Conley, cluster=<col> for
CR1) for both event_study=False AND event_study=True. Point estimates
byte-identical to Wave B/C; SE values shift upward by 1-few percent.

Documented synthesis of Butts (2021) Section 3.1 (IF construction for
spillover-aware DiD) + Gardner (2022) Section 4 (two-stage GMM sandwich)
+ Conley (1999) (spatial kernel). No reference software combines all
three -- did2s implements GMM without Conley; conleyreg/acreg implement
Conley without two-stage correction. Wave D is the synthesis.

Unified IF outer-product formula:
  psi_i  = gamma_hat' * X_{10,i} * eps_{10,i} - X_{2,i} * eps_{2,i}
  meat   = Psi' @ K @ Psi
  vcov   = (X_2' X_2)^{-1} @ meat @ (X_2' X_2)^{-1}
Kernel K is path-dependent: identity (HC1), block-indicator (cluster),
spatial kernel (Conley). Finite-sample multipliers: n/(n-p) for HC1,
G/(G-1) * (n-1)/(n-p) for cluster CR1, none for Conley.

Public surface:
- No new kwarg -- correction is unconditional.
- Wave D variance mode dispatch derives from the public contract:
    vcov_type="conley"  -> "conley"
    cluster=<col>        -> "cluster" (CR1)
    otherwise            -> "hc1"
- vcov_type="classical" now raises NotImplementedError upfront; the
  Wave D synthesis has not been derived for the homoskedastic meat
  structure (sigma_hat^2 * (X_10' X_10)). REGISTRY restrictions block
  updated to list "classical" alongside "hc2"/"hc2_bm".
- Single-cluster sample raises ValueError ("at least 2 clusters") per
  the standard CR1 rejection (mirrors linalg.py:1942).

Implementation:
- New module-level helper _compute_gmm_corrected_meat in two_stage.py
  (existing _compute_gmm_variance method untouched).
- New module-level helper _build_butts_fe_design_csr in spillover.py.
- _compute_conley_meat factored out of _compute_conley_vcov in conley.py
  so the same kernel-application code path handles both standard
  sandwich (X * residuals) and Wave D IF outer product (Psi).
- SpilloverDiD.fit() bypasses solve_ols's vcov computation, builds the
  fit-sample FE designs + eps_10/eps_2, calls the GMM helper, and
  wraps with the bread sandwich. Rank-deficient column drops handled
  via kept_col_mask + vcov re-inflation with NaN at dropped positions.

Wave B/C SE goldens re-pinned (_WAVE_B_GOLDEN_* -> _WAVE_D_GOLDEN_*);
pre-Wave-D references retained as _WAVE_B_UNCORRECTED_* commented
baselines for the directional inflation invariant.

New test classes:
- TestSpilloverDiDWaveDGmmCorrectedHc1Hand (hand-derived Psi on a
  4-unit x 3-period over-identified panel; atol=1e-12)
- TestSpilloverDiDWaveDGmmCorrectedEventStudy (vcov shape on
  event-study path)
- TestSpilloverDiDWaveDGmmCorrectedNanInferenceContract
  (rank-deficient column propagation)
- TestSpilloverDiDWaveDGmmCorrectedValidatorWiring (Conley validator
  fires from the new helper)
- TestSpilloverDiDWaveDGmmCorrectedFitIdempotence (clone + repeat-fit
  bit-identity)
- TestSpilloverDiDWaveDPublicVarianceContract (end-to-end public
  cluster=<col> CR1 routing, single-cluster rejection, classical
  NotImplementedError)

All 223 existing spillover tests pass; full regression set across
spillover/two_stage/conley_vcov/estimators_vcov_type clean. Closes
the Gardner-GMM follow-up row in TODO.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

Overall Assessment

⚠️ Needs changes — one unmitigated P1 in the new Wave D HC1/CR1 variance path can emit finite standard errors on zero-df stage-2 fits.

Executive Summary

  • The core Wave D design is directionally aligned with the intended source-material split: Gardner provides the two-stage untreated-first-stage DiD framework, Butts provides spillover-aware DiD identification, and Conley provides spatial dependence-robust covariance estimation. (arxiv.org)
  • P1: _compute_gmm_corrected_meat() clamps the HC1/CR1 denominator with max(n - p_2, 1), so saturated fits (n_obs == effective_rank) can return finite SEs even though the documented HC1/CR1 adjustment is undefined.
  • The rest of the Wave D methodology changes look consistent with docs/methodology/REGISTRY.md; the documented deviations (vcov_type="classical" rejection and no Conley finite-sample multiplier) are properly disclosed and are not defects.
  • API docs still have stale Wave B wording and stale solve_ols variance descriptions for SpilloverDiD’s Conley/cluster rows.
  • I did not find security issues in the changed files.

Methodology

  • Severity: P1. diff_diff/two_stage.py:L208-L236, consumed from diff_diff/spillover.py:L2789-L2856, compared with the existing HC1/CR1 path at diff_diff/linalg.py:L1921-L1958. Impact: on any Wave D fit where the kept stage-2 design saturates the sample (n_obs == p_2 after rank-deficient drops), the new helper uses n / max(n - p_2, 1) and G/(G-1) * (n-1)/max(n-p_2,1). That turns an undefined HC1/CR1 small-sample correction into a finite multiplier, so result.se and per-coefficient SEs can stay finite while only t_stat/p_value/CI are NaN-gated via df_resid=0. This is incorrect variance output and is not documented in REGISTRY.md:L3048-L3069. Concrete fix: when n - p_2 <= 0, fail closed instead of clamping the denominator to 1 — e.g. return NaN meat/vcov (or raise with a clear warning/error) so the SE surface NaN-propagates consistently.
  • Severity: P3-informational. docs/methodology/REGISTRY.md:L3048-L3097. Impact: none. The PR’s explicit methodology deviations (vcov_type="classical" raising NotImplementedError, and no finite-sample multiplier on the Conley path) are documented in the registry, so I am not treating them as defects. Concrete fix: none required.

Code Quality

  • No additional diff-scoped findings beyond the variance bug above.

Performance

  • No diff-scoped performance defects identified.

Maintainability

  • No separate maintainability blocker identified in the changed code.

Tech Debt

  • No new deferrable tech-debt blocker. The main issue above is correctness, so it should not be deferred to TODO.md.

Security

  • No findings.

Documentation/Tests

  • Severity: P3. docs/api/spillover.rst:L162-L168 and docs/api/spillover.rst:L171-L189. Impact: the API page still says SpilloverDiD Conley/cluster SEs are “via solve_ols at stage 2” and still frames the section as “Wave B MVP limitations,” which no longer matches the Wave D implementation. Concrete fix: rename the section and update the comparison table / limitation text to describe the Wave D GMM-corrected sandwich path.
  • Severity: P3. tests/test_spillover.py:L4108-L4468. Impact: the new Wave D tests cover the main HC1/Conley/cluster routes, but they do not pin the n_obs == effective_rank saturation case that exposes the denominator-clamp bug. Concrete fix: add a minimal unclustered HC1 fixture and a clustered CR1 fixture where the kept stage-2 rank equals n_obs, and assert that SEs are NaN/non-finite rather than finite.

Path to Approval

  1. In diff_diff/two_stage.py:_compute_gmm_corrected_meat, replace the max(n - p_2, 1) guards with fail-closed handling when n - p_2 <= 0, so HC1/CR1 SEs cannot remain finite on saturated designs.
  2. Add regression tests for saturated Wave D fits on both the HC1 and clustered CR1 paths, asserting NaN/non-finite SEs for the top-level ATT and affected coefficient surfaces.

When the kept stage-2 design saturates the sample (n_obs == effective_rank
after rank-deficient drops), the HC1 multiplier n/(n-p) and the CR1
multiplier (n-1)/(n-p) are mathematically undefined. The original Wave D
helper used `max(n - p_2, 1)` to clamp the denominator, which silently
fabricated finite multipliers on underdetermined fits — `result.se` and
per-coefficient SEs could stay finite even when only `t_stat`/`p_value`/CI
were NaN-gated via `df_resid=0`. That violates the no-silent-failures
contract.

Fix: when n - p_2 <= 0, return NaN meat with an explicit UserWarning
so the SE surface NaN-propagates consistently with the inference fields.
The Conley path is unaffected (no finite-sample multiplier on that
branch by convention).

Tests: new `test_saturated_design_yields_nan_se_not_finite` in
TestSpilloverDiDWaveDPublicVarianceContract exercises both the HC1 and
CR1 paths on a synthetic n=p_2=4 Psi fixture; asserts NaN meat AND
the saturation warning fires.

Docs: replaced "Wave B MVP limitations" section heading at
docs/api/spillover.rst with "Restrictions and follow-ups" (the
section now describes the shipped Wave D variance + remaining
limitations); updated the SpilloverDiD vs TwoStageDiD comparison table
to label the Conley and cluster rows "(Wave D GMM-corrected sandwich)"
instead of "(via solve_ols at stage 2)".

All 224 spillover tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 4be416a8c6eccdabf6eeca950c8026ce9b91da7b


Overall Assessment

✅ Looks good — this re-review resolves the prior P1/P0 issues, and I did not find any unmitigated diff-scoped P0 or P1 findings.

Executive Summary

  • Affected method: SpilloverDiD Wave D stage-2 variance, i.e. the Gardner GMM first-stage correction across HC1 / Conley / cluster.
  • The prior saturated-design issue is fixed: the new helper now fails closed to NaN when n - p_2 <= 0 on HC1/CR1, instead of fabricating finite SEs on undefined small-sample corrections (diff_diff/two_stage.py:L209-L264, tests/test_spillover.py:L4461-L4507).
  • The earlier public-surface regressions also look fixed: cluster=<col> is routed back to CR1, and vcov_type="classical" now rejects up front with a clear message (diff_diff/spillover.py:L2205-L2226, diff_diff/spillover.py:L2771-L2807, tests/test_spillover.py:L4409-L4526).
  • Against the project’s source material, the Wave D implementation matches the documented Butts/Gardner/Conley synthesis in docs/methodology/REGISTRY.md:L3048-L3097 and the in-code variance docstrings in diff_diff/two_stage.py:L73-L143 and diff_diff/conley.py:L781-L830.
  • P3 only: the user-facing docs are updated incompletely. Some CHANGELOG.md Spillover bullets still describe the old pre-Wave-D variance caveat, and the API page still omits the new vcov_type="classical" restriction (CHANGELOG.md:L16-L17, docs/api/spillover.rst:L221-L226).
  • I could not execute the tests in this review environment because the available python lacks numpy and pytest; this review is by code inspection.

Methodology

  • No findings. The affected method is clearly identified and scoped: SpilloverDiD Wave D variance in diff_diff/two_stage.py:L56-L290 and diff_diff/spillover.py:L2692-L2836. The implementation aligns with the registry’s documented synthesis and documented deviations in docs/methodology/REGISTRY.md:L3048-L3097, including HC1/CR1 finite-sample multipliers, no Conley multiplier, and the explicit vcov_type="classical" rejection.

Code Quality

  • No findings. The refactor cleanly separates kernel application into diff_diff/conley.py:L768-L1050 and keeps the public fit-path logic readable despite the new variance routing.

Performance

  • No findings. The Conley refactor preserves the existing sparse/dense dispatch and reuses the same kernel code path rather than duplicating it.

Maintainability

  • No findings. The new helper boundaries are reasonable: _compute_gmm_corrected_meat owns the IF meat, _compute_conley_meat owns kernel application, and SpilloverDiD.fit() owns public-path dispatch.

Tech Debt

  • No findings. The main Wave D follow-up is properly closed out of TODO.md, and I did not see new untracked correctness debt introduced by this diff.

Security

  • No findings.

Documentation/Tests

  • Severity: P3
    Impact: The documentation is internally inconsistent after Wave D. In the same Unreleased changelog, older Spillover bullets still say event-study/base Spillover SEs omit the Gardner correction, while the code and registry now say Wave D ships that correction. Separately, the API page still documents the HC2/HC2_BM restriction but not the new vcov_type="classical" hard rejection (CHANGELOG.md:L16-L17, docs/api/spillover.rst:L221-L226, diff_diff/spillover.py:L2205-L2226, docs/methodology/REGISTRY.md:L3048-L3092).
    Concrete fix: Update the stale Spillover CHANGELOG.md bullets to say their variance caveat is superseded by Wave D, and add a vcov_type="classical" restriction note to docs/api/spillover.rst alongside the HC2/HC2_BM note.

…lassical restriction to api/spillover.rst

Pre-this-commit, the [Unreleased] block contained internally inconsistent
SpilloverDiD bullets: the Wave B and Wave C entries (added first) each
said the Gardner GMM correction was "NOT applied" / "planned follow-up",
while the Wave D entry (added later in the same release) shipped that
exact correction. Update the Wave B and Wave C variance subsections to
note explicitly that they are "Superseded by the Wave D Gardner GMM
first-stage correction in this same release", cross-referencing the
Wave D bullet.

Also: docs/api/spillover.rst documented the HC2/HC2_BM
NotImplementedError but omitted the new vcov_type="classical"
restriction. Add a classical bullet alongside the HC2/HC2_BM one with
the documented synthesis pointer to hc1 / conley / cluster.

Docs-only change; no source / test edits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 82502a4e2cff9900d6f3790e001ea9560bdd0588


Overall Assessment

✅ Looks good — no unmitigated P0/P1 findings.

Executive Summary

  • Affected method: SpilloverDiD Wave D stage-2 variance, i.e. the Gardner GMM first-stage correction across HC1 / Conley / cluster. Cross-checking the implementation against the Methodology Registry and the new helper docstrings did not reveal an unmitigated methodology mismatch; the IF formula, HC1/CR1 small-sample multipliers, Conley no-multiplier choice, and vcov_type="classical" rejection all match the documented contract (docs/methodology/REGISTRY.md:L3048-L3097, diff_diff/two_stage.py:L56-L290, diff_diff/spillover.py:L2098-L2226, diff_diff/conley.py:L768-L1056).
  • The prior re-review docs/API issue is mostly resolved: docs/api/spillover.rst now documents both the shipped Wave D correction and the new vcov_type="classical" restriction (docs/api/spillover.rst:L177-L235).
  • Severity: P2. The new FE-design builder preserves pre-mask unit/time codes after finite_mask, so supported warn-and-drop fits can create all-zero FE columns and force the GMM stage onto the dense XtX_10.toarray() fallback; that is not a correctness problem on small panels, but it can become a large-memory path on bigger ones (diff_diff/spillover.py:L1500-L1564, diff_diff/spillover.py:L2719-L2751, diff_diff/two_stage.py:L165-L185).
  • Severity: P3. CHANGELOG.md still contradicts the new Wave C/Wave D status by listing event_study=True and the Gardner correction as deferred in the older Spillover bullet, and diff_diff/guides/llms-full.txt still frames shipped Wave C/D functionality under a “Wave B MVP — planned follow-ups” heading (CHANGELOG.md:L16-L17, diff_diff/guides/llms-full.txt:L501-L506).
  • I could not execute the test suite here because the available python lacks numpy and pytest; this review is by code inspection.

Methodology

No findings. The Wave D variance path matches the documented Butts/Gardner/Conley synthesis, including the explicitly documented deviations for Conley finite-sample adjustment and vcov_type="classical" rejection (docs/methodology/REGISTRY.md:L3048-L3097, diff_diff/two_stage.py:L56-L290, diff_diff/spillover.py:L2098-L2226).

Code Quality

No findings.

Performance

  • Severity: P2. Impact: after finite_mask, unit_codes_fit / time_codes_fit are sliced but not re-factorized before _build_butts_fe_design_csr(). If a supported warn-and-drop fit removes an interior unit code, the helper materializes an all-zero FE column; that makes X_10'X_10 exactly singular and deterministically sends _compute_gmm_corrected_meat() into the dense np.linalg.lstsq(XtX_10.toarray(), ...) fallback. On larger panels this can turn a supported fit into a large dense allocation or OOM even though the remaining estimation sample is identified. Concrete fix: compact the post-mask unit/time codes before building X_1 / X_10 (or have _build_butts_fe_design_csr() drop unused codes/columns internally). Refs: diff_diff/spillover.py:L1500-L1564, diff_diff/spillover.py:L2719-L2751, diff_diff/two_stage.py:L165-L185.

Maintainability

No findings.

Tech Debt

No findings. The prior Wave D follow-up row is correctly removed from TODO.md, and I did not see new untracked correctness debt in the diff.

Security

No findings.

Documentation/Tests

  • Severity: P3. Impact: the API doc is updated, but the release notes are still internally inconsistent because the older SpilloverDiD changelog bullet continues to list event_study=True and the Gardner GMM correction as deferred features immediately after the new Wave C/Wave D bullets say they shipped. The internal LLM guide also still labels shipped Wave C/D behavior as “Wave B MVP — planned follow-ups.” Concrete fix: remove or annotate the superseded deferred-feature text in CHANGELOG.md, and retitle/split the corresponding section in diff_diff/guides/llms-full.txt so only genuinely pending items remain under follow-ups. Refs: CHANGELOG.md:L16-L17, diff_diff/guides/llms-full.txt:L501-L506.
  • Test execution: not run in this environment because the available python lacks numpy and pytest.

…ollow-up lists

P2 Performance: when callers pass pre-mask integer codes that have had
interior values dropped via finite_mask (a supported warn-and-drop fit),
the code arrays are sparse — e.g. unit_codes = [0, 1, 3, 4] with code 2
dropped. Building X_10 on the raw codes materialized an all-zero FE
column at index 2, forcing sparse_factorized onto the dense
lstsq/XtX_10.toarray() fallback unnecessarily (large-memory path on big
panels). Fix: re-factorize via pd.factorize at the top of
_build_butts_fe_design_csr to compact the code space to 0..n_unique-1
(no-op when codes are already contiguous). Mirrors the column-space
convention of TwoStageDiD._build_fe_design.

P3 Docs: reconcile two stale follow-up lists where shipped Wave C/D
items still appeared as "planned":

- CHANGELOG Wave B bullet listed event_study=True and Gardner GMM
  correction under "Deferred features (planned follow-ups)" alongside
  genuinely-pending items (covariates, survey_design, ring_method,
  d_bar selection, sparse path). Add "as of Wave B ship-time" qualifier
  + "Shipped in same release: ..." note pointing at the Wave C/D
  bullets above.
- llms-full.txt heading "Restrictions (Wave B MVP — planned
  follow-ups)" was misleading since items below included a mix of
  shipped (event_study, GMM correction) and pending features. Retitle
  to "Restrictions and Wave C/D status" and add the vcov_type=classical
  Wave D restriction bullet alongside the existing covariates /
  survey_design restrictions.

All 224 spillover tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 4bec418a16f47927da861a9df37a5bd2eb61de9e


Overall Assessment

✅ Looks good — no unmitigated P0/P1 findings.

Executive Summary

  • Affected method: SpilloverDiD Wave D stage-2 variance on both event_study=False and event_study=True.
  • Cross-checking the implementation against the Methodology Registry and the existing TwoStageDiD GMM reference path, I did not find an undocumented methodology deviation. The IF formula, HC1/CR1 multipliers, Conley reuse, and the vcov_type="classical" restriction all match the documented contract. Refs: docs/methodology/REGISTRY.md:L3048-L3097, diff_diff/two_stage.py:L56-L290, diff_diff/spillover.py:L2221-L2242, diff_diff/spillover.py:L2787-L2823, diff_diff/conley.py:L768-L992
  • The prior P2 performance concern appears addressed: post-finite_mask unit/time codes are compacted before FE-design construction, which closes the unnecessary dense fallback path noted in the last review. Refs: diff_diff/spillover.py:L1537-L1551, diff_diff/spillover.py:L2760-L2767
  • The prior P3 documentation inconsistency is addressed in the changelog, API docs, and bundled LLM guide. Refs: CHANGELOG.md:L12-L16, docs/api/spillover.rst:L171-L235, diff_diff/guides/llms-full.txt:L501-L507
  • The new tests cover the public variance contract in the places that mattered most from the earlier reround: Conley wiring, CR1 routing, single-cluster rejection, vcov_type="classical" rejection, NaN propagation, and hand-derived Psi invariants. Refs: tests/test_spillover.py:L872-L923, tests/test_spillover.py:L4112-L4515
  • Test execution was not run here because the available python lacks numpy and pytest.

Methodology

  • Severity: none. Impact: no unmitigated methodology defect found. The affected method is SpilloverDiD Wave D variance; the implementation matches the registry’s documented Butts §3.1 + Gardner §4 + Conley synthesis, and the only deviations I saw are explicitly documented (vcov_type="classical" rejection and no Conley finite-sample multiplier). Concrete fix: none. Refs: docs/methodology/REGISTRY.md:L3048-L3097, diff_diff/two_stage.py:L56-L290, diff_diff/spillover.py:L2221-L2242, diff_diff/spillover.py:L2787-L2823

Code Quality

  • Severity: none. Impact: the new path preserves the project’s NaN/inference contract. Rank-deficient stage-2 drops are handled by subsetting X_2 before Psi construction, re-inflating the vcov with NaN slots, and routing inference through safe_inference() rather than inline t-stat logic. Concrete fix: none. Refs: diff_diff/spillover.py:L2746-L2852, diff_diff/spillover.py:L2905-L2929, diff_diff/utils.py:L177-L211

Performance

  • Severity: none. Impact: the earlier dense XtX_10.toarray() fallback risk on warn-and-drop fits appears resolved by compacting post-mask unit/time codes before building the sparse FE design. Concrete fix: none. Refs: diff_diff/spillover.py:L1537-L1551, diff_diff/spillover.py:L2760-L2767

Maintainability

  • Severity: none. Impact: factoring _compute_conley_meat() out of _compute_conley_vcov() and isolating _compute_gmm_corrected_meat() reduces duplication and keeps the Conley kernel application centralized across the standard and Wave D sandwich paths. Concrete fix: none. Refs: diff_diff/conley.py:L768-L1040, diff_diff/two_stage.py:L56-L290

Tech Debt

  • Severity: none. Impact: I did not see new untracked correctness debt in scope. The old Gardner follow-up row is gone, and the remaining spillover follow-ups are still explicitly tracked in TODO.md. Concrete fix: none. Refs: TODO.md:L128-L136

Security

  • Severity: none. Impact: no new secret exposure, unsafe command execution, or obvious data-handling/security issue stood out in the changed files. Concrete fix: none.

Documentation/Tests

  • Severity: none. Impact: the prior stale-status docs issue is addressed, and the new tests cover the public Wave D variance contract well enough for this reround. Concrete fix: none. Refs: CHANGELOG.md:L12-L16, docs/api/spillover.rst:L171-L235, diff_diff/guides/llms-full.txt:L501-L507, tests/test_spillover.py:L872-L923, tests/test_spillover.py:L4112-L4515
  • Residual risk: I could not execute the test suite here because python in this environment does not have numpy or pytest.

@igerber igerber added the ready-for-ci Triggers CI test workflows label May 17, 2026
@igerber igerber merged commit 1dc6a59 into main May 17, 2026
33 of 34 checks passed
@igerber igerber deleted the spillover-conley-wave-d-gmm-correction branch May 17, 2026 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant