From fb82c5d58dc415c74d5a72ad5ca35d582d912bf6 Mon Sep 17 00:00:00 2001 From: igerber Date: Tue, 19 May 2026 09:01:43 -0400 Subject: [PATCH 1/8] twfe: lift HC2/HC2-BM via inline full-dummy auto-route MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replaces the unconditional NotImplementedError at twfe.py for `vcov_type in {"hc2","hc2_bm"}` with an inline full-dummy branch. TWFE has no absorb=/fixed_effects= parameter to swap (unit + time FE are baked into the estimator's identity), so the auto-route trick used for DiD-absorb / MPD-absorb doesn't apply directly. Instead, `TwoWayFixedEffects.fit()` bypasses the within-transform on hc2/hc2_bm and stacks [intercept, treated×post, covariates, factor(unit), factor(time)] explicitly so the leverage correction and BM DOF compute on the full FE projection (FWL preserves coefficients and residuals but NOT the hat matrix). **Auto-cluster default:** preserved on hc2_bm (routes to CR2-BM at unit) and on hc2 + wild_bootstrap; dropped on explicit hc2 + analytical to match the one-way contract (the linalg validator rejects hc2 + cluster_ids). **Surface change disclosure** (matches DiD-absorb / MPD-absorb): under vcov_type in {"hc2","hc2_bm"}, result.coefficients, result.vcov, result.residuals, result.fitted_values, and result.r_squared reflect the full-dummy fit. FE-dummy entries are included alongside the "ATT" key (len(coefficients) == vcov.shape[0] invariant upheld). result.att, its SE, and analytical inference are unchanged (FWL-equivalent). **Rejected combos:** vcov_type in {"hc2","hc2_bm"} + replicate- weight survey designs raises NotImplementedError because the replicate path re-demeans per replicate, which doesn't compose with the full-dummy build. Survey variance precedence: any resolved SurveyDesign drives variance via TSL/replicate (matches existing DiD/MPD contract), not the analytical small-sample sandwich. Verified at atol=1e-10 vs `lm() + sandwich::vcovHC(type="HC2")` and `lm() + clubSandwich::vcovCR(cluster=seq_len(n), type="CR2") + coef_test()$df_Satt` on a new `twfe_two_period` scenario in benchmarks/data/clubsandwich_cr2_golden.json. New tests: - TestFitBehavior (10 behavioral tests including refactor regression vs DiD(fixed_effects=[unit, time]) at atol=1e-12, auto-cluster distinguishability check vs one-way HC2-BM at 1% gap, replicate- weight rejection, coefficients-vs-vcov alignment invariant) - TestTWFEHC2RParity (3 R-parity tests at atol=1e-10) Lifts Gate 1 of the six HC2/HC2-BM NotImplementedError gates — the last absorbed-FE gate. Remaining gates: weighted one-way HC2-BM, weighted CR2-BM (both blocked on the clubSandwich WLS algebra derivation). Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + TODO.md | 4 +- benchmarks/R/generate_clubsandwich_golden.R | 51 ++ benchmarks/data/clubsandwich_cr2_golden.json | 18 +- diff_diff/twfe.py | 237 +++++---- docs/methodology/REGISTRY.md | 4 +- tests/test_estimators_vcov_type.py | 301 ++++++++++- tests/test_methodology_twfe.py | 506 +++++++++++++------ 8 files changed, 863 insertions(+), 259 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 8dc9fb16..d2be9da0 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] ### Added +- **`TwoWayFixedEffects(vcov_type in {"hc2","hc2_bm"})` now supported** (`diff_diff/twfe.py:155`). Lifts Gate 1 of the six HC2/HC2-BM `NotImplementedError` gates — the last absorbed-FE gate (DiD-absorb shipped earlier, MPD-absorb shipped earlier, MPD cluster+contrast-DOF shipped earlier in this release). Unlike DiD / MPD, TWFE has no `absorb=` / `fixed_effects=` parameter to swap (unit + time FEs are baked into the estimator's identity), so the same auto-route trick isn't applicable. Instead, `TwoWayFixedEffects.fit()` bypasses the within-transform when `vcov_type in {"hc2","hc2_bm"}` and stacks the full-dummy design `[intercept, treated×post, covariates, factor(unit), factor(time)]` explicitly, then runs OLS through the standard `solve_ols` path so the leverage correction `h_ii = x_i' (X'X)^{-1} x_i` and CR2 Bell-McCaffrey adjustment `A_g = (I - H_gg)^{-1/2}` compute on the full FE projection (FWL preserves coefficients and residuals but NOT the hat matrix). Verified at `atol=1e-10` vs `lm(y ~ treat_post + factor(unit) + factor(post)) + sandwich::vcovHC(type="HC2")` for HC2, vs `clubSandwich::vcovCR(cluster=seq_len(n), type="CR2") + coef_test()$df_Satt` for the singleton-cluster one-way HC2-BM Satterthwaite DOF, and vs `vcovCR(cluster=unit, type="CR2")` for the auto-cluster CR2-BM path (new `twfe_two_period` scenario in `benchmarks/data/clubsandwich_cr2_golden.json`). **Auto-cluster default:** TWFE's unit auto-cluster is preserved on `hc2_bm` (routes to CR2-BM at unit) and on `hc2 + wild_bootstrap` (the bootstrap consumes the cluster structure for resampling regardless of the analytical sandwich choice); dropped on explicit `hc2 + analytical` to match the one-way contract (the linalg validator rejects `hc2 + cluster_ids`). **User-visible surface change** (matches the DiD-absorb / MPD-absorb disclosures above): under `vcov_type in {"hc2","hc2_bm"}`, `result.coefficients`, `result.vcov`, `result.residuals`, `result.fitted_values`, and `result.r_squared` reflect the full-dummy fit rather than the within-transformed reduced fit (FE-dummy entries are included alongside the `"ATT"` key; `r_squared` is computed on the un-demeaned outcome; residuals / fitted are on the original scale; `len(result.coefficients) == result.vcov.shape[0]` invariant upheld). `result.att`, its SE, and analytical inference are unchanged (FWL-equivalent). HC1 / CR1 / Conley / classical paths remain on the within-transform. **Survey-design scope** (mirrors DiD-absorb): when `survey_design=` is supplied, the existing survey variance path (Taylor-series linearization or replicate-weight variance) takes precedence over the analytical HC2/HC2-BM sandwich; the full-dummy build only changes FE handling. **Rejected combos:** `vcov_type in {"hc2","hc2_bm"}` + replicate-weight survey designs (BRR / Fay / JK1 / JKn / SDR) raises `NotImplementedError` at `twfe.py:~233` because the replicate path re-demeans per replicate, which doesn't compose with the full-dummy build (would require per-replicate full-dummy refit); workaround: use `vcov_type="hc1"` for replicate-weight CR1. `hc2_bm + weights` remains blocked at the linalg validator (same gate as Gates 4-5 — weighted CR2 variants). New tests: `tests/test_estimators_vcov_type.py::TestFitBehavior` (9 tests: rejection flip → behavioral; refactor regression vs `DifferenceInDifferences(fixed_effects=[unit, time])` at `atol=1e-12`; auto-cluster default coverage on `hc2_bm`; explicit `hc2 + analytical` no-auto-cluster; `hc2 + wild_bootstrap` auto-cluster preserved; `hc2 / hc2_bm + replicate` rejection; always-treated unit finite ATT; coefficients-vs-vcov alignment invariant); `tests/test_methodology_twfe.py::TestTWFEHC2RParity` (3 R-parity tests at `atol=1e-10`). - **Agent-discoverability contract test (`tests/test_agent_discoverability.py`).** New static-snapshot test pinning the agent-facing surface introduced by PR #464: `__all__` membership of `agent_workflow` / `profile_panel` / `get_llm_guide` / `practitioner_next_steps` / `BusinessReport`; `dir(diff_diff)` head-first ordering against `_AGENT_FACING_ORDER` (catches drift in the `_OrderedName` `__lt__` ordering trick); `_OrderedName` `isinstance(_, str)` + str-method compatibility; `dir()` full-namespace + `inspect.getmembers` parity; top-level `__doc__` first-paragraph mention of `agent_workflow` + named references to the 5-step workflow primitives; `agent_workflow()` script content references each downstream helper by name; canonical estimator class names (CallawaySantAnna, ContinuousDiD, HeterogeneousAdoptionDiD, etc.) remain importable. No live API calls; runs in the default pytest suite. Closes [issue #461](https://github.com/igerber/diff-diff/issues/461) (snapshot variant — live-agent regression test deferred to a separate follow-up that depends on causal-llm-eval packaging its harness). Also closes the `__dir__()` contract-test row from `TODO.md` that PR #464 deferred here. - **`diff_diff.agent_workflow(df, unit=..., time=..., treatment=..., outcome=...)` — stateless orchestrator for LLM-agent discoverability** (`diff_diff/agent_workflow.py`). Prints (and returns as dict) a copy-pasteable 5-step workflow with the caller's column names templated in: `profile_panel` → `get_llm_guide("autonomous")` → `(...).fit(df, ...)` → `practitioner_next_steps(result)` → `BusinessReport(result).full_report()`. The function calls nothing internally and does not inspect `df`; it is a guided tour, not a router. Surfaces the canonical workflow primitives (`profile_panel`, `get_llm_guide`, `practitioner_next_steps`, `BusinessReport`) that cold-start agent dry-passes at [igerber/causal-llm-eval](https://github.com/igerber/causal-llm-eval) showed agents practically never reach for on their own. Output structure: `{"profile_call", "guide_call", "fit_candidates", "validation_calls", "reporting_call", "script"}`; `fit_candidates` is a flat list of estimator/diagnostic class names referenced in the workflow patterns (each must remain importable on `diff_diff`, locked by `tests/test_agent_workflow.py::test_fit_candidates_all_importable`). Closes [issue #460](https://github.com/igerber/diff-diff/issues/460). - **Top-level `__doc__` rewritten to lead with the agent workflow** (`diff_diff/__init__.py`). `help(diff_diff)` now opens with the `agent_workflow(df, ...)` recommendation as the first non-blank paragraph; `get_llm_guide("full")` and `get_llm_guide("practitioner")` pointers preserved for the existing `tests/test_guides.py::test_module_docstring_mentions_helper` guard. diff --git a/TODO.md b/TODO.md index 83f35525..e11b2a73 100644 --- a/TODO.md +++ b/TODO.md @@ -99,8 +99,8 @@ Deferred items from PR reviews that were not addressed before merge. | Thread `vcov_type` (classical / hc1 / hc2 / hc2_bm) through the 8 standalone estimators that expose `cluster=`: `CallawaySantAnna`, `SunAbraham`, `ImputationDiD`, `TwoStageDiD`, `TripleDifference`, `StackedDiD`, `WooldridgeDiD`, `EfficientDiD`. Phase 1a added `vcov_type` to the `DifferenceInDifferences` inheritance chain only. | multiple | Phase 1a | Medium | | Weighted one-way Bell-McCaffrey (`vcov_type="hc2_bm"` + `weights`, no cluster) currently raises `NotImplementedError`. `_compute_bm_dof_from_contrasts` builds its hat matrix from the unscaled design via `X (X'WX)^{-1} X' W`, but `solve_ols` solves the WLS problem by transforming to `X* = sqrt(w) X`, so the correct symmetric idempotent residual-maker is `M* = I - sqrt(W) X (X'WX)^{-1} X' sqrt(W)`. Rederive the Satterthwaite `(tr G)^2 / tr(G^2)` ratio on the transformed design and add weighted parity tests before lifting the guard. | `linalg.py::_compute_bm_dof_from_contrasts`, `linalg.py::_validate_vcov_args` | Phase 1a | Medium | -| HC2 / HC2 + Bell-McCaffrey on absorbed-FE fits — REMAINING sub-gate: `TwoWayFixedEffects` (`twfe.py:154` rejects unconditionally). The DiD sub-gate and the MultiPeriodDiD sub-gate were both lifted via auto-route to `fixed_effects=` internally (DiD: PR #458, ~1e-10 vs clubSandwich; MPD: this release, ~1e-10 vs sandwich::vcovHC and clubSandwich::vcovCR). TWFE has no equivalent `fixed_effects=` code path (always within-transforms), so the same auto-route surgery is not directly applicable — lifting requires either building the full-dummy design inline or refactoring TWFE to delegate to DiD. Within-transformation preserves coefficients and residuals under FWL but not the hat matrix; HC1/CR1 are unaffected (no leverage term). | `twfe.py::fit` | follow-up | Medium | | Weighted CR2 Bell-McCaffrey cluster-robust (`vcov_type="hc2_bm"` + `cluster_ids` + `weights`) currently raises `NotImplementedError`. Weighted hat matrix and residual rebalancing need threading per clubSandwich WLS handling. | `linalg.py::_compute_cr2_bm` | Phase 1a | Medium | +| `TwoWayFixedEffects(vcov_type in {"hc2","hc2_bm"})` with replicate-weight survey designs raises `NotImplementedError` (`twfe.py:~233`). The replicate path re-demeans per replicate (re-demeaning depends on the per-replicate weight vector), which doesn't compose with the full-dummy HC2/HC2-BM build — a correct implementation would need per-replicate full-dummy refit. Workaround: use `vcov_type="hc1"` for replicate-weight CR1. | `twfe.py::fit` | follow-up | Low | | Unify Rust local-method `estimate_model` solver path to `solve_wls_svd` (the same SVD helper used by the global-method since PR #348) for sub-1e-14 bootstrap SE parity. Current local-method bootstrap parity test (`tests/test_rust_backend.py::TestTROPRustEdgeCaseParity::test_bootstrap_seed_reproducibility_local`) passes at `atol=1e-5` — the residual ~1e-7 gap is roundoff between Rust's `estimate_model` matrix factorization and numpy's `lstsq`, which accumulates differently across per-replicate bootstrap fits. Main-fit ATT parity is regime-dependent (`atol=1e-14` for `lambda_nn=inf`, `atol=1e-10` for finite `lambda_nn` — see `test_local_method_main_fit_parity`); the bootstrap gap is a same-solver-path roundoff concern and not a user-visible correctness bug. | `rust/src/trop.rs::estimate_model`, `rust/src/linalg.rs::solve_wls_svd` | follow-up | Low | | Rust multiplier-bootstrap weight RNG (`generate_bootstrap_weights_batch` in `rust/src/bootstrap.rs:9-10, 57-75`) uses `Xoshiro256PlusPlus::seed_from_u64(seed + i)` per row for Rademacher/Mammen/Webb generation. If any Python caller (SDID / efficient-DiD multiplier bootstrap) has a numpy-canonical equivalent, the two backends likely diverge under the same seed. Audit Python callers (`diff_diff/sdid.py`, `diff_diff/efficient_did_bootstrap.py`, `diff_diff/bootstrap_utils.py::generate_bootstrap_weights_batch_numpy`) for parity-test gaps. Same fix shape as TROP RNG parity (PR #354): pre-generate weights in Python via numpy and pass them to Rust through PyO3. | `rust/src/bootstrap.rs`, `diff_diff/bootstrap_utils.py` | follow-up | Medium | | `bias_corrected_local_linear`: extend golden parity to `kernel="triangular"` and `kernel="uniform"` (currently epa-only; all three kernels share `kernel_W` and the `lprobust` math, so parity is expected but not separately asserted). | `benchmarks/R/generate_nprobust_lprobust_golden.R`, `tests/test_bias_corrected_lprobust.py` | Phase 1c | Low | @@ -193,7 +193,7 @@ Ordered paydown view across the tables above. Tier A → D is by effort × risk, #### Tier C — Heavy / derivation required - HonestDiD Δ^RM ARP conditional/hybrid confidence sets (`honest_did.py`) -- Weighted one-way Bell-McCaffrey + weighted CR2 Bell-McCaffrey + HC2/CR2 on absorbed-FE (linalg derivations + R parity harness) (`linalg.py`, `estimators.py::DifferenceInDifferences.fit`, `estimators.py::MultiPeriodDiD.fit`, `twfe.py::fit`) +- Weighted one-way Bell-McCaffrey + weighted CR2 Bell-McCaffrey (linalg derivations + R parity harness) (`linalg.py::_compute_bm_dof_from_contrasts`, `linalg.py::_compute_cr2_bm`) - Multi-absorb weighted demeaning: alternating-projection iteration for N>1 absorb + weights (`estimators.py`) - ImputationDiD dense `(A0'A0).toarray()` OOM: alternative dense fallback or richer sparse strategy (`imputation.py:1531`) - HAD mass-point `vcov_type ∈ {hc2, hc2_bm}`: 2SLS-specific leverage derivation (`had.py::_fit_mass_point_2sls`) diff --git a/benchmarks/R/generate_clubsandwich_golden.R b/benchmarks/R/generate_clubsandwich_golden.R index 74aaf88d..a2dc1338 100644 --- a/benchmarks/R/generate_clubsandwich_golden.R +++ b/benchmarks/R/generate_clubsandwich_golden.R @@ -232,6 +232,57 @@ output$mpd_clustered_avg_att_dof <- list( n_post_periods = length(post_names) ) +# --- TwoWayFixedEffects HC2 / HC2-BM scenario (Gate 1 lift PR) --------------- +# Mirrors TwoWayFixedEffects(vcov_type in {"hc2","hc2_bm"}) on a 2-period +# panel (binary post indicator). TWFE's `time` parameter is the post +# indicator, so the FE design is factor(unit) + factor(post), NOT +# factor(period). HC2 SE pinned via sandwich::vcovHC; one-way HC2-BM DOF +# via the singleton-cluster CR2 trick (Pustejovsky-Tipton 2018 Section 3.3 +# — CR2 with cluster=seq_len(n) reduces to Imbens-Kolesar BM). CR2-BM +# clustered at unit pinned separately for the auto-cluster path. + +set.seed(20260518) +n_twfe_units <- 8 +n_twfe_periods <- 4 +twfe_treated_units <- c(1, 3, 5, 7) +twfe_post_start <- 3 +d_twfe <- expand.grid(unit = seq_len(n_twfe_units), + period = seq_len(n_twfe_periods)) +d_twfe$treated <- as.integer(d_twfe$unit %in% twfe_treated_units) +d_twfe$post <- as.integer(d_twfe$period >= twfe_post_start) +d_twfe$treat_post <- d_twfe$treated * d_twfe$post +twfe_alpha_unit <- rnorm(n_twfe_units, mean = 0, sd = 1) +twfe_gamma_time <- rnorm(n_twfe_periods, mean = 0, sd = 0.5) +d_twfe$y <- 1.0 + 0.7 * d_twfe$treat_post + + twfe_alpha_unit[d_twfe$unit] + + twfe_gamma_time[d_twfe$period] + + rnorm(nrow(d_twfe), sd = 0.4) +fit_twfe <- lm(y ~ treat_post + factor(unit) + factor(post), data = d_twfe) +vcov_twfe_hc2 <- sandwich::vcovHC(fit_twfe, type = "HC2") +# Singleton-cluster CR2 trick for one-way HC2-BM DOF. +vcov_twfe_cr2_one_way <- vcovCR(fit_twfe, cluster = seq_len(nrow(d_twfe)), + type = "CR2") +ct_twfe_one_way <- coef_test(fit_twfe, vcov = vcov_twfe_cr2_one_way) +# CR2-BM clustered at unit (the TWFE auto-cluster default). +vcov_twfe_cr2_unit <- vcovCR(fit_twfe, cluster = d_twfe$unit, type = "CR2") +ct_twfe_unit <- coef_test(fit_twfe, vcov = vcov_twfe_cr2_unit) +output$twfe_two_period <- list( + unit = d_twfe$unit, + period = d_twfe$period, + treated = d_twfe$treated, + post = d_twfe$post, + treat_post = d_twfe$treat_post, + y = d_twfe$y, + coef = as.numeric(coef(fit_twfe)), + coef_names = names(coef(fit_twfe)), + vcov_hc2 = as.numeric(vcov_twfe_hc2), + vcov_hc2_shape = dim(vcov_twfe_hc2), + vcov_cr2_one_way = as.numeric(vcov_twfe_cr2_one_way), + dof_bm_one_way = as.numeric(ct_twfe_one_way$df_Satt), + vcov_cr2_unit = as.numeric(vcov_twfe_cr2_unit), + dof_bm_unit = as.numeric(ct_twfe_unit$df_Satt) +) + output$meta <- list( source = "clubSandwich", clubSandwich_version = as.character(packageVersion("clubSandwich")), diff --git a/benchmarks/data/clubsandwich_cr2_golden.json b/benchmarks/data/clubsandwich_cr2_golden.json index 5e406154..539f5efb 100644 --- a/benchmarks/data/clubsandwich_cr2_golden.json +++ b/benchmarks/data/clubsandwich_cr2_golden.json @@ -82,11 +82,27 @@ "reference_period": 1, "n_post_periods": 3 }, + "twfe_two_period": { + "unit": [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8], + "period": [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4], + "treated": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], + "post": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], + "treat_post": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], + "y": [2.650364154679115, 0.5250992296125498, 2.538325280436974, 1.803468688851707, 0.390388546319389, 0.9019986445013289, 0.8232587072837394, 0.8653454679955752, 2.08429332526299, 0.0638119452881384, 2.722749318503621, 0.8169670582027474, 0.2459517838623851, 0.6804813712532132, 0.8556684840526531, 1.398839980876758, 2.620848972130513, 0.09909203941612038, 2.832338679868128, 0.7704342845402335, 0.8560980445011976, 0.5425351511582146, 0.9299248903311188, 1.744787275814005, 3.087638383603313, -0.7232315532492211, 2.27084735211901, 0.7045197493403264, 0.9856491992396943, 0.2839259051193889, 0.6881762347785356, 0.1525997107657043], + "coef": [2.488253574158318, 0.6802339974809865, -2.279476294911592, -0.0197210511870492, -1.246821764944735, -1.991264315438316, -1.668433942170453, -1.78652912980747, -1.230276101315478, -0.4351687279596561], + "coef_names": ["(Intercept)", "treat_post", "factor(unit)2", "factor(unit)3", "factor(unit)4", "factor(unit)5", "factor(unit)6", "factor(unit)7", "factor(unit)8", "factor(post)1"], + "vcov_hc2": [0.03700174102669025, -0.01209920750084357, -0.03700174102669036, -0.034148276882185, -0.0370017410266902, -0.03090579095064369, -0.03700174102669018, -0.03160431930844195, -0.03700174102669018, -7.600244650009306e-18, -0.01209920750084357, 0.07671968417768542, 0.03015788065425394, 0.008722820785758987, 0.06270949142009662, 0.002237848922676195, 0.04028563730391647, 0.003634905638272913, -0.01045609013314296, -0.05718235232384992, -0.03700174102669031, 0.03015788065425393, 0.08384338199488832, 0.03414827688218509, 0.0570406314820595, 0.03090579095064373, 0.04582870442396941, 0.03160431930844203, 0.02045784070543967, -0.01805867315341034, -0.034148276882185, 0.008722820785758943, 0.03414827688218509, 0.05520728093577265, 0.03414827688218493, 0.02978686648930553, 0.03414827688218493, 0.02978686648930548, 0.03414827688218493, 3.26088479668859e-17, -0.03700174102669024, 0.06270949142009664, 0.05704063148205959, 0.034148276882185, 0.1194701811546974, 0.0309057909506437, 0.06210450980689071, 0.03160431930844199, 0.03673364608836099, -0.05061028391925308, -0.03090579095064366, 0.002237848922676207, 0.03090579095064374, 0.02978686648930551, 0.03090579095064364, 0.04312575858084388, 0.03090579095064361, 0.02978686648930553, 0.03090579095064363, 9.288215089373668e-18, -0.03700174102669024, 0.04028563730391648, 0.04582870442396951, 0.034148276882185, 0.06210450980689072, 0.03090579095064367, 0.0564599920123676, 0.03160431930844197, 0.02552171903027089, -0.02818642980307293, -0.03160431930844198, 0.003634905638272877, 0.03160431930844206, 0.0297868664893055, 0.03160431930844194, 0.02978686648930557, 0.03160431930844192, 0.03939002087733653, 0.03160431930844193, 1.566395541000208e-17, -0.03700174102669019, -0.01045609013314297, 0.02045784070543973, 0.034148276882185, 0.03673364608836095, 0.03090579095064368, 0.02552171903027083, 0.03160431930844194, 0.1340805551581075, 0.02255529763398656, 4.385790535446187e-19, -0.05718235232384991, -0.01805867315341033, -1.35757095501444e-17, -0.05061028391925306, 1.208677012866869e-17, -0.02818642980307294, -4.970499585804175e-17, 0.02255529763398653, 0.05718235232384995], + "vcov_hc2_shape": [10, 10], + "vcov_cr2_one_way": [0.03700174102669018, -0.01209920750084357, -0.03700174102669025, -0.0341482768821849, -0.03700174102669003, -0.03090579095064359, -0.03700174102669005, -0.03160431930844183, -0.03700174102669001, 4.444145962929492e-17, -0.01209920750084359, 0.07671968417768549, 0.03015788065425398, 0.00872282078575895, 0.06270949142009666, 0.002237848922676209, 0.04028563730391651, 0.003634905638272878, -0.01045609013314298, -0.05718235232384999, -0.03700174102669025, 0.03015788065425397, 0.08384338199488831, 0.03414827688218498, 0.05704063148205939, 0.03090579095064365, 0.04582870442396931, 0.03160431930844191, 0.02045784070543955, -0.01805867315341042, -0.03414827688218493, 0.008722820785758943, 0.03414827688218502, 0.05520728093577257, 0.0341482768821848, 0.02978686648930545, 0.03414827688218481, 0.02978686648930537, 0.03414827688218482, -9.024515456557491e-18, -0.0370017410266902, 0.06270949142009669, 0.05704063148205952, 0.03414827688218489, 0.1194701811546973, 0.03090579095064362, 0.06210450980689061, 0.03160431930844186, 0.03673364608836086, -0.05061028391925316, -0.0309057909506436, 0.002237848922676207, 0.03090579095064367, 0.02978686648930542, 0.03090579095064351, 0.0431257585808438, 0.0309057909506435, 0.02978686648930542, 0.03090579095064351, -3.928404223797695e-17, -0.0370017410266902, 0.0402856373039165, 0.04582870442396944, 0.03414827688218489, 0.06210450980689061, 0.0309057909506436, 0.05645999201236749, 0.03160431930844185, 0.02552171903027075, -0.02818642980307301, -0.03160431930844192, 0.00363490563827287, 0.03160431930844198, 0.02978686648930541, 0.03160431930844181, 0.02978686648930549, 0.03160431930844181, 0.03939002087733642, 0.03160431930844181, -3.290830191734854e-17, -0.03700174102669015, -0.01045609013314295, 0.02045784070543967, 0.03414827688218488, 0.03673364608836084, 0.0309057909506436, 0.02552171903027073, 0.03160431930844181, 0.1340805551581074, 0.0225552976339865, 2.819415466917353e-17, -0.05718235232384999, -0.01805867315341039, 4.638886947612098e-18, -0.05061028391925316, -2.260769939086745e-17, -0.02818642980307301, -4.623554890608812e-17, 0.0225552976339865, 0.05718235232385], + "dof_bm_one_way": [3.425821064552667, 21.999999999999837, 6.851642129105291, 5.761904761904771, 6.851642129105294, 5.761904761904765, 6.851642129105291, 5.761904761904764, 6.85164212910529, 10.999999999999979], + "vcov_cr2_unit": [0.007651392098640002, -0.01530278419727998, -0.007651392098640009, -1.340972185906478e-17, -0.007651392098640004, -1.786362460978703e-17, -0.007651392098640002, -1.652260459528918e-17, -0.007651392098640006, 8.815980097977497e-19, -0.01530278419727999, 0.04018425503992974, 0.02009212751996491, 2.723300676932653e-17, 0.0200921275199649, 3.747012207819093e-17, 0.02009212751996489, 3.337015158794735e-17, 0.0200921275199649, -0.009578686645369807, -0.00765139209864001, 0.02009212751996491, 0.01004606375998247, 1.361650338466344e-17, 0.01004606375998247, 1.873506103909563e-17, 0.01004606375998246, 1.668507579397384e-17, 0.01004606375998247, -0.00478934332268491, -1.340972185906478e-17, 2.723300676932653e-17, 1.361650338466343e-17, 1.668199508743466e-31, 1.361650338466343e-17, 1.656912369884294e-31, 1.361650338466342e-17, 1.631551769724017e-31, 1.361650338466343e-17, -4.135630511972947e-19, -0.007651392098640005, 0.0200921275199649, 0.01004606375998247, 1.361650338466343e-17, 0.01004606375998247, 1.873506103909562e-17, 0.01004606375998246, 1.668507579397383e-17, 0.01004606375998247, -0.004789343322684908, -1.786362460978703e-17, 3.747012207819092e-17, 1.873506103909563e-17, 1.656912369884295e-31, 1.873506103909562e-17, 1.697016749718632e-31, 1.873506103909561e-17, 1.615308081491078e-31, 1.873506103909562e-17, -1.742872858617179e-18, -0.007651392098640003, 0.02009212751996489, 0.01004606375998247, 1.361650338466342e-17, 0.01004606375998246, 1.873506103909562e-17, 0.01004606375998246, 1.668507579397383e-17, 0.01004606375998246, -0.004789343322684904, -1.652260459528918e-17, 3.337015158794737e-17, 1.668507579397385e-17, 1.631551769724017e-31, 1.668507579397384e-17, 1.615308081491077e-31, 1.668507579397383e-17, 1.665183639542917e-31, 1.668507579397384e-17, -3.249423973693037e-19, -0.007651392098640007, 0.0200921275199649, 0.01004606375998247, 1.361650338466343e-17, 0.01004606375998247, 1.873506103909563e-17, 0.01004606375998246, 1.668507579397384e-17, 0.01004606375998246, -0.004789343322684903, 3.612539513184754e-18, -0.009578686645369814, -0.004789343322684914, -4.135630511972954e-19, -0.004789343322684916, -1.742872858617175e-18, -0.004789343322684907, -3.249423973692947e-19, -0.004789343322684909, 0.009578686645369807], + "dof_bm_unit": [3, 6.000000000000002, 5.999999999999998, 1.027080069278844, 6.000000000000001, 1.038147635656014, 6.000000000000003, 1.078234257225623, 5.999999999999999, 2.999999999999998] + }, "meta": { "source": "clubSandwich", "clubSandwich_version": "0.7.0", "R_version": "R version 4.5.2 (2025-10-31)", - "generated_at": "2026-05-18 01:50:55 UTC", + "generated_at": "2026-05-19 01:30:25 UTC", "note": "CR2 Bell-McCaffrey cluster-robust parity target for diff_diff._compute_cr2_bm" } } diff --git a/diff_diff/twfe.py b/diff_diff/twfe.py index 9f52d3dc..72a4c270 100644 --- a/diff_diff/twfe.py +++ b/diff_diff/twfe.py @@ -36,14 +36,16 @@ class TwoWayFixedEffects(DifferenceInDifferences): parameter passed to `fit()`). This differs from DifferenceInDifferences where cluster=None means no clustering. - **Exception:** when ``vcov_type="classical"`` and + **Exception (one-way analytical):** when + ``vcov_type in {"classical", "hc2"}`` is explicit AND ``inference="analytical"``, the unit auto-cluster is dropped - because the classical family is by construction one-way only and - the validator rejects ``cluster_ids + classical``. The user's - explicit choice of the classical family wins over the TWFE default - in that narrow analytical-inference case. Under - ``inference="wild_bootstrap"`` the auto-cluster is preserved (the - bootstrap uses the cluster structure to resample residuals). + because these families are by construction one-way only and the + validator rejects ``cluster_ids + classical`` / ``cluster_ids + + hc2``. The user's explicit one-way choice wins over the TWFE + default. Under ``inference="wild_bootstrap"`` the auto-cluster + is preserved regardless of ``vcov_type`` (the bootstrap uses the + cluster structure to resample residuals). On ``hc2_bm`` the + auto-cluster is also preserved (routes to CR2-BM at unit). alpha : float, default=0.05 Significance level for confidence intervals. @@ -55,17 +57,22 @@ class TwoWayFixedEffects(DifferenceInDifferences): where α_i are unit fixed effects and γ_t are time fixed effects. - **HC2 / Bell-McCaffrey are not available on TWFE.** Because TWFE uses - within-transformation (demeaning) to absorb the fixed effects, the - reduced design's hat matrix is not the full FE projection; HC2 leverage - and CR2 Bell-McCaffrey corrections on the demeaned design would produce - silently-wrong small-sample SEs (FWL preserves coefficients, not the - hat matrix). ``vcov_type in {"hc2","hc2_bm"}`` therefore raises - ``NotImplementedError`` with workarounds: use ``vcov_type="hc1"`` (HC1/ - CR1 survive FWL), or switch to ``DifferenceInDifferences(fixed_effects= - [...])`` where the dummies appear in the full design. Tracked in - ``TODO.md`` under Methodology/Correctness; also documented in - ``docs/methodology/REGISTRY.md``. + **HC2 / Bell-McCaffrey are supported via an internal full-dummy build.** + Because TWFE's within-transformation preserves coefficients but not the + hat matrix, HC2 leverage and CR2 Bell-McCaffrey corrections on the + demeaned design would produce wrong small-sample SEs. When + ``vcov_type in {"hc2","hc2_bm"}``, TWFE bypasses the within-transform + and builds the full-dummy design ``[intercept, treated×post, + covariates, unit_dummies, time_dummies]`` directly, so the leverage + correction and BM DOF compute on the full FE projection. Under this + path, ``result.coefficients``, ``result.vcov``, ``result.residuals``, + ``result.fitted_values``, and ``result.r_squared`` reflect the + full-dummy fit rather than the within-transformed reduced fit; the + ATT coefficient, its SE, and analytical inference are unchanged. + Auto-cluster-at-unit is preserved on ``hc2_bm`` (routes to CR2-BM at + unit) and on ``hc2`` + ``wild_bootstrap``; dropped on explicit ``hc2`` + + ``analytical`` to match the one-way contract. Documented in + ``docs/methodology/REGISTRY.md`` under the scope-limitation note. **Conley spatial-HAC (``vcov_type="conley"``) is supported via the block-decomposed panel sandwich (matches R ``conleyreg`` with @@ -137,34 +144,15 @@ def fit( # type: ignore[override] if unit not in data.columns: raise ValueError(f"Unit column '{unit}' not found in data") - # Reject HC2 / HC2 + Bell-McCaffrey on TWFE (and any absorbed-FE fit). - # TWFE demeans outcomes and regressors via within-transformation before - # solving OLS, and passes only the reduced (already-residualized) - # regressor matrix into ``LinearRegression``. The HC2 leverage - # correction ``h_ii = x_i' (X'X)^{-1} x_i`` and the CR2 Bell-McCaffrey - # adjustment matrix ``A_g = (I - H_gg)^{-1/2}`` both depend on the - # FULL fixed-effects hat matrix, not the residualized one: FWL - # preserves coefficients but NOT the hat matrix, so applying HC2 or - # CR2 to the demeaned design produces the wrong leverage and the - # wrong Bell-McCaffrey DOF. The correct fix (compute leverage from - # the full absorbed projection) is deferred to a follow-up PR; until - # then, reject fast rather than ship silently-wrong small-sample SEs. - # HC1 and CR1 are unaffected (no leverage term, meat uses only the - # residuals which FWL preserves). Tracked in TODO.md. - if self.vcov_type in ("hc2", "hc2_bm"): - raise NotImplementedError( - f"TwoWayFixedEffects(vcov_type={self.vcov_type!r}) is not " - "yet supported: TWFE uses within-transformation (demeaning) " - "before OLS, and the HC2 leverage correction / CR2 Bell-" - "McCaffrey DOF depend on the full FE hat matrix, not the " - "residualized one (FWL preserves coefficients but not " - "leverage). Applying HC2/CR2-BM to the demeaned design " - "would produce silently-wrong small-sample inference. Use " - "vcov_type='hc1' (HC1/CR1 preserve correctly under FWL), or " - "switch to fixed_effects= dummies on DifferenceInDifferences " - "for a full-dummy design where HC2/CR2-BM are computed on " - "the full projection." - ) + # HC2 / HC2 Bell-McCaffrey are now SUPPORTED via the inline + # full-dummy build below (see "use_full_dummy" branch around the + # design-construction block). FWL preserves coefficients and + # residuals but NOT the hat matrix, so HC2 leverage and CR2-BM + # DOF must compute on the full FE projection; building the design + # with explicit unit + time dummies routes through ``solve_ols``'s + # full-design hat matrix. HC1/CR1 paths remain on the demeaned + # design (no leverage term). + use_full_dummy = self.vcov_type in ("hc2", "hc2_bm") # Phase 2 panel block-decomposed Conley (matches R conleyreg). # FWL composability: the within-transformed scores S = X_demeaned * @@ -230,69 +218,127 @@ def fit( # type: ignore[override] "survey designs. Replicate weights provide their own variance " "estimation." ) + # Replicate weights + HC2 / HC2-BM is incompatible with the + # full-dummy auto-route: the replicate path re-demeans per + # replicate (re-demeaning depends on the per-replicate weight + # vector), which doesn't compose with the full-dummy design + # build. A correct implementation would need to re-build the + # full-dummy X per replicate and recompute the HC2 leverage, + # which is deferred. Mirrors the + # ``linalg.py::_validate_vcov_args`` ``hc2_bm + weights`` gate. + if _uses_replicate_twfe and self.vcov_type in ("hc2", "hc2_bm"): + raise NotImplementedError( + f"TwoWayFixedEffects(vcov_type={self.vcov_type!r}) with " + "replicate-weight survey designs is not yet supported: the " + "replicate path re-demeans per replicate, which does not " + "compose with the full-dummy HC2/HC2-BM build (would need " + "per-replicate full-dummy refit). Use vcov_type='hc1' for " + "replicate-weight CR1, or drop to analytical inference." + ) # Unit-level clustering is the TWFE default when `cluster` is not - # explicitly provided. But the one-way ``classical`` family is by - # construction not cluster-robust and the validator in - # ``compute_robust_vcov`` rejects ``cluster_ids + vcov_type=="classical"``. - # When the user EXPLICITLY asks for ``classical`` analytical inference - # (via ``vcov_type="classical"``) and does NOT set ``cluster=``, - # honor that choice by disabling the auto-cluster. + # explicitly provided. But the one-way ``classical`` and ``hc2`` + # families are by construction not cluster-robust and the validator + # in ``compute_robust_vcov`` rejects + # ``cluster_ids + vcov_type in ("classical","hc2")``. When the user + # EXPLICITLY asks for one of these analytical-one-way families AND + # does NOT set ``cluster=``, honor that choice by disabling the + # auto-cluster. # # When ``"classical"`` is IMPLICIT (from the legacy alias # ``robust=False``), keep the unit auto-cluster so # ``_resolve_effective_vcov_type`` below can remap it to ``"hc1"`` # and preserve the historical CR1-at-unit behavior. Wild-bootstrap - # inference also keeps the unit auto-cluster regardless (bootstrap - # consumes cluster structure for resampling). ``hc2``/``hc2_bm`` - # don't reach this block — they are rejected above. + # inference also keeps the unit auto-cluster regardless of + # ``vcov_type`` (bootstrap consumes cluster structure for + # resampling, independent of the analytical sandwich). ``hc2_bm`` + # also keeps the auto-cluster (routes to CR2-BM at unit). if self.cluster is not None: cluster_var: Optional[str] = self.cluster elif ( - self.vcov_type == "classical" + self.vcov_type in ("classical", "hc2") and self._vcov_type_explicit and self.inference == "analytical" ): - # Explicit classical + analytical inference: drop the auto-cluster - # so the validator doesn't reject ``cluster_ids + classical``. + # Explicit one-way analytical vcov: drop the auto-cluster so + # the validator doesn't reject ``cluster_ids`` with these + # families. Wild-bootstrap is exempt because the bootstrap + # uses the cluster structure for resampling regardless of + # the analytical sandwich choice. cluster_var = None else: cluster_var = unit - # Create treatment × post interaction from raw data before demeaning. - # This must be within-transformed alongside the outcome and covariates - # so that the regression uses demeaned regressors (FWL theorem). + # Create treatment × post interaction from raw data. data = data.copy() data["_treatment_post"] = data[treatment] * data[time] - # Demean outcome, covariates, AND interaction in a single pass - all_vars = [outcome] + (covariates or []) + ["_treatment_post"] - data_demeaned = _within_transform_util( - data, - all_vars, - unit, - time, - suffix="_demeaned", - weights=survey_weights, - ) - - # Extract variables for regression - y = data_demeaned[f"{outcome}_demeaned"].values - X_list = [data_demeaned["_treatment_post_demeaned"].values] - - if covariates: - for cov in covariates: - X_list.append(data_demeaned[f"{cov}_demeaned"].values) - - X = np.column_stack([np.ones(len(y))] + X_list) - - # ATT is the coefficient on treatment_post (index 1) - att_idx = 1 - - # Degrees of freedom adjustment for fixed effects n_units = data[unit].nunique() n_times = data[time].nunique() - df_adjustment = n_units + n_times - 2 + + if use_full_dummy: + # HC2 / HC2-BM full-dummy build: bypass the within-transform + # and stack [intercept, treated×post, covariates, unit_dummies, + # time_dummies] explicitly. FWL preserves the ATT coefficient + # and residuals, but NOT the hat matrix — so the leverage + # correction `h_ii = x_i' (X'X)^{-1} x_i` and the CR2 Bell- + # McCaffrey adjustment matrix `A_g = (I - H_gg)^{-1/2}` must + # be computed on the full FE projection. Pivoted-QR rank + # detection in `solve_ols` cleanly drops any collinear FE + # dummies (e.g. an always-treated unit × treatment_post + # collinearity) without poisoning the ATT. + y = data[outcome].values.astype(np.float64) + cov_arrs = [data[c].values.astype(np.float64) for c in (covariates or [])] + unit_dummies_df = pd.get_dummies(data[unit], prefix=f"_fe_{unit}", drop_first=True) + time_dummies_df = pd.get_dummies(data[time], prefix=f"_fe_{time}", drop_first=True) + unit_dummies = unit_dummies_df.values.astype(np.float64) + time_dummies = time_dummies_df.values.astype(np.float64) + X = np.column_stack( + [np.ones(len(data)), data["_treatment_post"].values] + + cov_arrs + + [unit_dummies, time_dummies] + ) + # FEs are now in X explicitly; solve_ols's n - k accounting + # already subtracts them, so the extra unit + time DOF + # adjustment used on the within-transform path would + # double-count. + df_adjustment = 0 + # var_names parallels the X columns so the downstream + # `coefficients` dict can mirror the full-dummy vcov shape + # (matching the MPD invariant + # ``len(result.coefficients) == result.vcov.shape[0]``). + _twfe_var_names: Optional[List[str]] = ( + ["const", "ATT"] + + list(covariates or []) + + list(unit_dummies_df.columns) + + list(time_dummies_df.columns) + ) + else: + # Default within-transform path (HC1 / classical / Conley): + # demean outcome, covariates, AND interaction in a single pass + # so the regression uses demeaned regressors (FWL theorem). + all_vars = [outcome] + (covariates or []) + ["_treatment_post"] + data_demeaned = _within_transform_util( + data, + all_vars, + unit, + time, + suffix="_demeaned", + weights=survey_weights, + ) + y = data_demeaned[f"{outcome}_demeaned"].values + X_list = [data_demeaned["_treatment_post_demeaned"].values] + if covariates: + for cov in covariates: + X_list.append(data_demeaned[f"{cov}_demeaned"].values) + X = np.column_stack([np.ones(len(y))] + X_list) + df_adjustment = n_units + n_times - 2 + # Within-transform path: FE dummies are NOT in X (they're absorbed + # by demeaning). var_names cover the visible columns only. + _twfe_var_names = ["const", "ATT"] + list(covariates or []) + + # ATT is the coefficient on treatment_post (index 1) on both branches. + att_idx = 1 # Always use LinearRegression for initial fit (unified code path) # For wild bootstrap, we don't need cluster SEs from the initial fit. @@ -571,6 +617,21 @@ def _refit_twfe(w_r): else: _twfe_cluster_label = unit + # Build the coefficients dict mirroring the actual X columns. On the + # full-dummy path this surfaces the FE-dummy entries alongside the ATT; + # on the within-transform path it only carries the visible + # [const, ATT, covariates] columns. The "ATT" name at index 1 is + # preserved as the ATT key on both paths, so existing + # `result.coefficients["ATT"]` consumers continue to work. The + # invariant ``len(coefficients) == vcov.shape[0]`` is now upheld on the + # full-dummy path (matches the MPD absorb auto-route invariant + # checked at tests/test_estimators_vcov_type.py:1611). + coef_array = np.asarray(reg.coefficients_) + _coefficients_dict: dict = ( + {name: float(c) for name, c in zip(_twfe_var_names, coef_array)} + if _twfe_var_names is not None + else {"ATT": float(att)} + ) self.results_ = DiDResults( att=att, se=se, @@ -581,7 +642,7 @@ def _refit_twfe(w_r): n_treated=n_treated, n_control=n_control, alpha=self.alpha, - coefficients={"ATT": float(att)}, + coefficients=_coefficients_dict, vcov=vcov, residuals=residuals, fitted_values=fitted, diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md index 3efdcdbc..97a52af2 100644 --- a/docs/methodology/REGISTRY.md +++ b/docs/methodology/REGISTRY.md @@ -2559,9 +2559,9 @@ Shipped in `diff_diff/had_pretests.py` as `stute_joint_pretest()` (residuals-in - [x] Phase 1a: HC2 + Bell-McCaffrey DOF correction in `diff_diff/linalg.py` via `vcov_type="hc2_bm"` enum (both one-way and CR2 cluster-robust with Imbens-Kolesar / Pustejovsky-Tipton Satterthwaite DOF). Weighted cluster CR2 raises `NotImplementedError` and is tracked as Phase 2+ in `TODO.md`. - **Note (scope limitation on absorbed FE):** HC2 and HC2 + Bell-McCaffrey on within-transformed designs still depend on the FULL FE hat matrix because FWL preserves coefficients and residuals but NOT the hat matrix: `h_ii = x_i' (X'X)^{-1} x_i` on the reduced design is not the diagonal of the full FE projection, and CR2's block adjustment `A_g = (I - H_gg)^{-1/2}` likewise depends on the full cluster-block hat matrix. The status across the three estimators that previously rejected this combination: - **`DifferenceInDifferences(absorb=..., vcov_type in {"hc2","hc2_bm"})` — SUPPORTED (auto-route).** When the user pairs `absorb=` with HC2 / HC2-BM, `DiD.fit()` internally promotes the absorb columns to `fixed_effects=` so the existing full-dummy code path computes the algebraically correct vcov from the full FE projection. Verified at ~1e-10 vs `lm() + sandwich::vcovHC(type="HC2")` and `lm() + clubSandwich::vcovCR(cluster=..., type="CR2")` (singleton-cluster CR2 trick for one-way HC2-BM Satterthwaite DOF; PT2018 §3.3 unweighted CR2 algebra). **User-visible surface change**: under the auto-route, the entire `DiDResults` (coefficients, vcov, residuals, fitted_values, r_squared) reflect the full-dummy fit rather than the within-transformed fit — the FE-dummy entries are included in `result.coefficients` / `result.vcov`, `r_squared` is computed on the un-demeaned outcome, and `residuals` / `fitted_values` are on the original scale. `result.att` is unaffected (FWL-equivalent). HC1/CR1 paths on `absorb=` are unchanged (no leverage term). **Survey-design scope**: when `survey_design=` is supplied, the existing survey variance path (Taylor-series linearization / replicate weights) takes precedence over the analytical HC2/HC2-BM sandwich; the auto-route only changes the FE handling (removing the prior reject) and does not redirect to the analytical small-sample sandwich on survey fits. - - **`TwoWayFixedEffects(vcov_type in {"hc2","hc2_bm"})` — still rejects.** TWFE is a standalone class with no `fixed_effects=` equivalent path, so the same auto-route surgery used for DiD-absorb and MPD-absorb is not directly applicable; lifting requires building the full-dummy design inline or refactoring TWFE to delegate to DiD. Tracked as a follow-up in `TODO.md`. + - **`TwoWayFixedEffects(vcov_type in {"hc2","hc2_bm"})` — SUPPORTED (inline full-dummy build).** TWFE has no `absorb=` / `fixed_effects=` parameter (the unit + time FE are baked into the estimator's identity), so the same parameter-swap auto-route used for DiD-absorb / MPD-absorb is not directly applicable. Instead, `TwoWayFixedEffects.fit()` bypasses the within-transform when `vcov_type in {"hc2","hc2_bm"}` and builds the full-dummy design `[intercept, treated×post, covariates, factor(unit), factor(time)]` explicitly, then runs OLS through the standard `solve_ols` path so the leverage correction and BM DOF compute on the full FE projection. Verified at atol=1e-10 vs `lm(y ~ treat_post + factor(unit) + factor(post)) + sandwich::vcovHC(type="HC2")` for HC2 and vs `clubSandwich::vcovCR(cluster=seq_len(n), type="CR2")` for the singleton-cluster one-way HC2-BM Satterthwaite DOF; vs `vcovCR(cluster=unit, type="CR2")` for the auto-cluster CR2-BM path. **Auto-cluster default:** TWFE's unit auto-cluster is preserved on `hc2_bm` (routes to CR2-BM at unit) and on `hc2 + wild_bootstrap` (the bootstrap consumes the cluster structure for resampling regardless of the analytical sandwich choice); dropped on explicit `hc2 + analytical` to match the one-way contract (the linalg validator rejects `hc2 + cluster_ids`). `hc2_bm + analytical` with no explicit cluster yields the auto-cluster CR2-BM path. **User-visible surface change** (matches the DiD-absorb / MPD-absorb disclosure above): under HC2 / HC2-BM, `result.coefficients`, `result.vcov`, `result.residuals`, `result.fitted_values`, and `result.r_squared` reflect the full-dummy fit rather than the within-transformed reduced fit (FE-dummy entries are included, `r_squared` is computed on the un-demeaned outcome, residuals/fitted are on the original scale). `result.att`, its SE, and analytical inference are unchanged (FWL-equivalent). HC1 / CR1 / Conley / classical paths remain on the within-transform (no leverage term in those vcov families). **Survey-design scope** (mirrors the DiD-absorb auto-route contract above): when `survey_design=` is supplied, the existing survey variance path (Taylor-series linearization for analytical-weight designs, or replicate-weight variance for BRR/Fay/JK1/JKn/SDR) takes precedence over the analytical HC2/HC2-BM sandwich; the full-dummy build only changes the FE handling (removing the prior reject) and does not redirect to the analytical small-sample sandwich on survey fits. **Replicate-weight survey designs** are blocked at the estimator level: `vcov_type in {"hc2","hc2_bm"}` + replicate weights raises `NotImplementedError` because the replicate refit path re-demeans per replicate, which doesn't compose with the full-dummy build (would require per-replicate full-dummy refit); workaround: use `vcov_type="hc1"` for replicate-weight CR1. `hc2_bm + weights` remains rejected upstream by the linalg validator (same gate as Gates 4-5 — weighted CR2 variants). - **`MultiPeriodDiD(absorb=..., vcov_type in {"hc2","hc2_bm"})` — SUPPORTED (auto-route).** Same auto-route pattern as `DifferenceInDifferences`: `MultiPeriodDiD.fit()` internally promotes the absorb columns to `fixed_effects=` for HC2 / HC2-BM callers, so the existing full-dummy code path computes the algebraically correct vcov from the full FE projection on the event-study design (`treated + period_X dummies + treated:period_X interactions + factor(unit)`). Verified at ~1e-10 vs `lm() + sandwich::vcovHC(type="HC2")` and `lm() + clubSandwich::vcovCR(cluster=1:n, type="CR2")` on a 5-cohort × 5-period event-study fixture; the parity target is a per-period interaction `treated:period_X` because MPD requires the `treated` column to be a time-invariant ever-treated indicator, which lies in the span of the intercept and the post-auto-route unit FE dummies (under `pd.get_dummies(..., drop_first=True)` the dropped reference unit is implicit in the intercept, so the exact alias relation depends on the omitted FE category — it is NOT simply "the sum of treated-cohort unit dummies"). `solve_ols` drops one column from the collinear set under R-style rank-deficiency handling; in the shipped parity fixture (4 ever-treated cohorts of 5 units + 1 never-treated cohort of 5 units) it drops a unit dummy from the never-treated cohort (`unit_25`) and the `treated` main effect remains finite, but the specific column that gets NaN'd is pivot-order and dummy-coding dependent. Either way, the slope coefficients (`treated:period_X`) and the post-period-average `avg_att` are identified and invariant to which column was dropped. Same `MultiPeriodDiDResults` surface change as DiD: `vcov`, `residuals`, `fitted_values`, `r_squared`, and `coefficients` reflect the full-dummy fit, with `period_effects[t].effect` / `.se` / `.p_value` / `.conf_int` invariant by FWL. HC1/CR1 paths on `absorb=` are unchanged (no leverage term). Same survey-design scope as DiD: replicate-weight variance routes through the standard `compute_replicate_vcov` path on the fixed full-dummy design rather than the per-replicate refit branch (which targets the demeaning path); since the auto-routed design does not depend on replicate weights, no refit is needed. **Redundant time-FE skip:** when the routed (or directly-supplied) `fixed_effects` list contains the `time` column, MPD silently skips emitting `