Lift MultiPeriodDiD-absorb HC2/HC2-BM gate via auto-route#459
Merged
Conversation
…ixed_effects= Mirrors PR #458 (DiD-absorb auto-route) on MultiPeriodDiD: when absorb= is paired with vcov_type in {hc2, hc2_bm}, the fit promotes the absorb columns to fixed_effects= internally so the existing full-dummy-design code path computes the algebraically correct vcov on the event-study design (treated + period_X dummies + treated:period_X interactions + factor(unit)). Verified at ~1e-15 vs lm() + sandwich::vcovHC(type="HC2") and lm() + clubSandwich::vcovCR(cluster=1:n, type="CR2") on a new 5-cohort x 5-period mpd_absorbed_fe_did fixture. Includes three-guard reorder so the auto-route sits BETWEEN the absorb + fixed_effects mutual-exclusion check (above) and the multi-absorb + survey-weights reject (below), matching the DiD ordering. The survey-replicate absorb-refit branch at estimators.py:1689 is short- circuited under the auto-route (the standard compute_replicate_vcov path applies on the fixed full-dummy design; no per-replicate refit needed). Tests: new TestMPDAbsorbedFERParity class (7 tests) mirrors PR #458's TestDiDAbsorbedFERParity, pinning parity targets on per-period interaction coefficients (treated:period_4) to avoid the treated x unit collinearity baked into MPD's time-invariant ever-treated indicator. Existing test_multi_period_absorb_rejects_hc2_and_hc2_bm deleted. REGISTRY.md per-estimator status block updated (MPD moves REJECT -> SUPPORTED; TWFE remains the only REJECT case). TODO row 99 narrowed to TWFE-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex local review surfaced two findings on the MPD-absorb gate lift:
P3 (methodology accuracy): REGISTRY/CHANGELOG/test-class docstring
claimed the `treated` main-effect coefficient becomes NaN under
rank deficiency. Empirically false — in the shipped parity fixture
solve_ols drops a never-treated unit dummy (`unit_25`) and keeps
`treated` finite. The pivot order is data-dependent. Rewrite to
say one column from the collinear set is dropped under R-style
rank-deficiency handling; per-period interactions and avg_att are
identified and invariant to that choice.
P2 (test coverage): the new MPD test class missed two surfaces that
the DiD analogue covers:
1. Survey-weighted multi-absorb auto-route bypass of the
`len(absorb) > 1 + survey_weights` reject — exercised by new
`test_absorb_hc2_bm_survey_multi_absorb_auto_routes` with
parity assertion against the explicit `fixed_effects=` path
on both `period_effects[target]` and `avg_att`.
2. The MPD-specific `avg_att` (post-period-average) contrast did
not have a direct HC2-vs-HC2-BM inference pin. Added
`test_absorb_hc2_bm_avg_att_df_sensitive_inference` asserting
same avg_se but different avg_p_value / wider avg_conf_int
under HC2-BM (the contract guard the per-period
df_sensitive test cannot reach).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex R2 returned ✅ with two P3s. P3 (test coverage upgraded to actionable per feedback_test_coverage_gap_treat_as_actionable.md): the survey test added in R1 used aweight (generic survey-vcov path), but the CHANGELOG/REGISTRY claim specifically that the auto-route short-circuits the absorb-refit replicate-variance branch at estimators.py:1693. Added test_absorb_hc2_bm_replicate_weights_auto_routes using SurveyDesign(replicate_method="JK1", replicate_weights=[...]) that exercises the replicate path and pins SE parity vs the explicit fixed_effects= form on both period_effects[target_period] and avg_att. Passing at atol=1e-12 confirms the documented short-circuit works as claimed. P3 (doc accuracy): REGISTRY/CHANGELOG/test-class docstring described the `treated` alias as "exactly collinear with the sum of treated-cohort unit dummies". Under pd.get_dummies(drop_first=True) the exact alias depends on the omitted FE reference category (and the intercept), not just on the cohort-dummy sum. Rewrite to say `treated` lies in the span of the intercept plus the post-auto-route unit FE dummies; which specific nuisance column gets dropped is pivot-order and dummy-coding dependent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Overall assessmentExecutive summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
CI Codex review on PR #459 surfaced a P1 newly exposed by the auto-route: when MPD(absorb=["unit","period"]) auto-routes to fixed_effects=["unit", "period"], the existing fixed_effects= expansion loop adds `period_X` dummies via `pd.get_dummies(prefix="period")` that collide on name with the event-study period dummies MPD already builds for non-reference periods. The duplicate `var_names` entries silently collapse in `coef_dict = {name: coef for name, coef in zip(var_names, coefficients)}`, overwriting the real event-study coefficients with the rank-deficient NaN drops on the redundant FE block. Result: `len(coefficients) < vcov.shape[0]` and `coefficients["period_X"] = NaN` even though `period_effects[X]` (read by position) was correct. Bug was pre-existing on MPD's `fixed_effects=[<time_col>]` path; the auto-route just made it newly reachable via `absorb=`. Fix: in MPD's fixed_effects expansion at estimators.py:1604, skip entries where `fe == time` — MPD's design already absorbs the time dimension via non-reference period dummies, so the FE-block dummies would be perfectly redundant anyway (NaN'd by solve_ols, dropping nothing useful while corrupting the result surface). Empirical evidence: - Pre-fix: `MPD(absorb=["unit","period"])` -> len(coefs)=34, vcov.shape=(38,38), coefs["period_2"]=NaN. - Post-fix: same call -> len(coefs)=34, vcov.shape=(34,34), coefs["period_2"]=0.345 (finite, matches MPD's event-study fit). Tests: new `test_absorb_hc2_result_surface_invariants_multi_absorb` asserts `len(coefficients) == vcov.shape[0]`, no duplicate names, and finite event-study `period_X` on BOTH the auto-route and the explicit `fixed_effects=` paths (Codex P2: regression coverage for the result- surface contract on the newly reachable path). 11/11 MPD tests pass; 249/249 in the broader sweep (test_estimators.py / test_linalg_hc2_bm.py unchanged). REGISTRY/CHANGELOG: documented the time-FE skip rule for both auto-route and pre-existing `fixed_effects=[<time_col>]` invocations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment✅ Looks good Executive summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
MultiPeriodDiD(absorb=..., vcov_type in {"hc2","hc2_bm"})NotImplementedErrorgate atestimators.py:1476via 5-LoC auto-route tofixed_effects=internally — mirrors PR DiD-absorb HC2/HC2-BM: auto-route to fixed_effects internally #458's DiD-absorb pattern; same algebraic justification (FWL preserves coefficients/residuals but not the hat matrix; HC2/CR2 leverage corrections need the full FE-dummy hat).estimators.py:1693is correctly short-circuited under the auto-route (standardcompute_replicate_vcovpath applies to the fixed full-dummy design; no per-replicate refit needed) — verified by new JK1 replicate-weights regression test.TestMPDAbsorbedFERParityclass (10 tests): bit-equality auto-route invariants (single-absorb, multi-absorb, HC2-BM), R-parity vssandwich::vcovHCandclubSandwich::vcovCR(1e-10), df-sensitive inference on bothperiod_effectsandavg_att, survey-weighted multi-absorb auto-route, JK1 replicate-weight regression, and mutual-exclusion preservation.fixed_effects=equivalent path).Methodology references (required if estimator / math changes)
R/CR-adjustments.R(unweighted CR2 algebra reference); PT2018 §3.3 singleton-cluster CR2 trick for one-way HC2-BM; sandwichvcovHCfor HC2 anchor. Seedocs/methodology/REGISTRY.md§ HC2 + Bell-McCaffrey scope-limitation block (now-updated MultiPeriodDiD status).lm(y ~ treated + period_X dummies + treated:period_X + factor(unit), data=d); HC2/HC2-BM SEs match R at ~1e-10 (smoke test ≤ 1e-15).Validation
tests/test_estimators_vcov_type.py::TestMPDAbsorbedFERParity(10 new tests); existingtest_multi_period_absorb_rejects_hc2_and_hc2_bmremoved (its gate is now lifted, and the new class covers the same single-absorb shape with the new contract).mpd_absorbed_fe_didscenario inbenchmarks/R/generate_clubsandwich_golden.R+ regeneratedbenchmarks/data/clubsandwich_cr2_golden.json— 5-cohort × 5-period event-study fixture (4 ever-treated + 1 never-treated cohort) with HC2 (sandwich) + HC2-BM singleton-cluster CR2 (clubSandwich) targets pinned ontreated_period_4.test_estimators_vcov_type.py(67),test_estimators.py(180),test_linalg_hc2_bm.py.Security / privacy
Generated with Claude Code