Merged
5 changes: 3 additions & 2 deletions CHANGELOG.md

Large diffs are not rendered by default.

5 changes: 3 additions & 2 deletions TODO.md
@@ -99,8 +99,9 @@ Deferred items from PR reviews that were not addressed before merge.

| Thread `vcov_type` (classical / hc1 / hc2 / hc2_bm) through the 8 standalone estimators that expose `cluster=`: `CallawaySantAnna`, `SunAbraham`, `ImputationDiD`, `TwoStageDiD`, `TripleDifference`, `StackedDiD`, `WooldridgeDiD`, `EfficientDiD`. Phase 1a added `vcov_type` to the `DifferenceInDifferences` inheritance chain only. | multiple | Phase 1a | Medium |
| Weighted one-way Bell-McCaffrey (`vcov_type="hc2_bm"` + `weights`, no cluster) currently raises `NotImplementedError`. `_compute_bm_dof_from_contrasts` builds its hat matrix from the unscaled design via `X (X'WX)^{-1} X' W`, but `solve_ols` solves the WLS problem by transforming to `X* = sqrt(w) X`, so the correct symmetric idempotent residual-maker is `M* = I - sqrt(W) X (X'WX)^{-1} X' sqrt(W)`. Rederive the Satterthwaite `(tr G)^2 / tr(G^2)` ratio on the transformed design and add weighted parity tests before lifting the guard. | `linalg.py::_compute_bm_dof_from_contrasts`, `linalg.py::_validate_vcov_args` | Phase 1a | Medium |
-| HC2 / HC2 + Bell-McCaffrey on absorbed-FE fits — REMAINING sub-gate: `TwoWayFixedEffects` (`twfe.py:154` rejects unconditionally). The DiD sub-gate and the MultiPeriodDiD sub-gate were both lifted via auto-route to `fixed_effects=` internally (DiD: PR #458, ~1e-10 vs clubSandwich; MPD: this release, ~1e-10 vs sandwich::vcovHC and clubSandwich::vcovCR). TWFE has no equivalent `fixed_effects=` code path (always within-transforms), so the same auto-route surgery is not directly applicable — lifting requires either building the full-dummy design inline or refactoring TWFE to delegate to DiD. Within-transformation preserves coefficients and residuals under FWL but not the hat matrix; HC1/CR1 are unaffected (no leverage term). | `twfe.py::fit` | follow-up | Medium |
| Weighted CR2 Bell-McCaffrey cluster-robust (`vcov_type="hc2_bm"` + `cluster_ids` + `weights`) currently raises `NotImplementedError`. Weighted hat matrix and residual rebalancing need threading per clubSandwich WLS handling. | `linalg.py::_compute_cr2_bm` | Phase 1a | Medium |
+| `TwoWayFixedEffects(vcov_type in {"hc2","hc2_bm"})` with replicate-weight survey designs raises `NotImplementedError` (`twfe.py:~233`). The replicate path re-demeans per replicate (re-demeaning depends on the per-replicate weight vector), which doesn't compose with the full-dummy HC2/HC2-BM build — a correct implementation would need per-replicate full-dummy refit. Workaround: use `vcov_type="hc1"` for replicate-weight CR1. | `twfe.py::fit` | follow-up | Low |
+| TWFE's HC2/HC2-BM inline full-dummy build (`twfe.py:280-315`) duplicates the dummy-construction logic in `DifferenceInDifferences(fixed_effects=...)` (`estimators.py:478-486`). Extract a shared helper (or delegate TWFE's HC2/HC2-BM path to DiD's `fixed_effects=` branch, with TWFE-specific cluster default threading) to reduce drift risk on FE naming, survey behavior, and result-surface conventions. Substantive refactor — touches both estimators. | `twfe.py::fit`, `estimators.py::DifferenceInDifferences.fit` | follow-up | Low |
| Unify Rust local-method `estimate_model` solver path to `solve_wls_svd` (the same SVD helper used by the global-method since PR #348) for sub-1e-14 bootstrap SE parity. Current local-method bootstrap parity test (`tests/test_rust_backend.py::TestTROPRustEdgeCaseParity::test_bootstrap_seed_reproducibility_local`) passes at `atol=1e-5` — the residual ~1e-7 gap is roundoff between Rust's `estimate_model` matrix factorization and numpy's `lstsq`, which accumulates differently across per-replicate bootstrap fits. Main-fit ATT parity is regime-dependent (`atol=1e-14` for `lambda_nn=inf`, `atol=1e-10` for finite `lambda_nn` — see `test_local_method_main_fit_parity`); the bootstrap gap is a same-solver-path roundoff concern and not a user-visible correctness bug. | `rust/src/trop.rs::estimate_model`, `rust/src/linalg.rs::solve_wls_svd` | follow-up | Low |
| Rust multiplier-bootstrap weight RNG (`generate_bootstrap_weights_batch` in `rust/src/bootstrap.rs:9-10, 57-75`) uses `Xoshiro256PlusPlus::seed_from_u64(seed + i)` per row for Rademacher/Mammen/Webb generation. If any Python caller (SDID / efficient-DiD multiplier bootstrap) has a numpy-canonical equivalent, the two backends likely diverge under the same seed. Audit Python callers (`diff_diff/sdid.py`, `diff_diff/efficient_did_bootstrap.py`, `diff_diff/bootstrap_utils.py::generate_bootstrap_weights_batch_numpy`) for parity-test gaps. Same fix shape as TROP RNG parity (PR #354): pre-generate weights in Python via numpy and pass them to Rust through PyO3. | `rust/src/bootstrap.rs`, `diff_diff/bootstrap_utils.py` | follow-up | Medium |
| `bias_corrected_local_linear`: extend golden parity to `kernel="triangular"` and `kernel="uniform"` (currently epa-only; all three kernels share `kernel_W` and the `lprobust` math, so parity is expected but not separately asserted). | `benchmarks/R/generate_nprobust_lprobust_golden.R`, `tests/test_bias_corrected_lprobust.py` | Phase 1c | Low |
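
The FWL remark in the absorbed-FE row (within-transformation keeps coefficients and residuals but changes the hat matrix) can be checked on a toy example. The following is a minimal pure-Python sketch, assuming a one-way unit-FE design with a single regressor; the data and the helper name `leverages` are hypothetical and not part of `diff_diff`. It relies on the projection identity P_[D, x] = P_D + P_{M_D x}, so the full-dummy leverage of observation i is the within leverage plus 1/n_g for its unit.

```python
# Hypothetical toy panel: 3 units with 2 observations each, one regressor x.
units = [0, 0, 1, 1, 2, 2]
x = [1.0, 3.0, 2.0, 5.0, 0.0, 4.0]


def leverages(units, x):
    """Return (within, full_dummy) hat-matrix diagonals for y ~ x + unit FE."""
    groups = {}
    for g, xi in zip(units, x):
        groups.setdefault(g, []).append(xi)
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    # Within-demeaned regressor (the FWL residualised x).
    x_tilde = [xi - means[g] for g, xi in zip(units, x)]
    ss = sum(v * v for v in x_tilde)
    h_within = [v * v / ss for v in x_tilde]
    # P_[D, x] = P_D + P_{M_D x}: the dummies add 1 / n_g to each diagonal.
    h_full = [h + 1.0 / len(groups[g]) for h, g in zip(h_within, units)]
    return h_within, h_full


h_within, h_full = leverages(units, x)

# HC2 rescales each squared residual by 1 / (1 - h_ii), so the two designs
# give different HC2 weights even though the slope and residuals coincide.
hc2_scale_within = [1.0 / (1.0 - h) for h in h_within]
hc2_scale_full = [1.0 / (1.0 - h) for h in h_full]
print(sum(h_within), sum(h_full))  # traces: slope only vs slope + unit dummies
```

HC1 and CR1 never touch `h_ii`, which is why they are unaffected by the choice between the within-transformed and full-dummy designs.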
@@ -193,7 +194,7 @@ Ordered paydown view across the tables above. Tier A → D is by effort × risk,
#### Tier C — Heavy / derivation required

- HonestDiD Δ^RM ARP conditional/hybrid confidence sets (`honest_did.py`)
-- Weighted one-way Bell-McCaffrey + weighted CR2 Bell-McCaffrey + HC2/CR2 on absorbed-FE (linalg derivations + R parity harness) (`linalg.py`, `estimators.py::DifferenceInDifferences.fit`, `estimators.py::MultiPeriodDiD.fit`, `twfe.py::fit`)
+- Weighted one-way Bell-McCaffrey + weighted CR2 Bell-McCaffrey (linalg derivations + R parity harness) (`linalg.py::_compute_bm_dof_from_contrasts`, `linalg.py::_compute_cr2_bm`)
- Multi-absorb weighted demeaning: alternating-projection iteration for N>1 absorb + weights (`estimators.py`)
- ImputationDiD dense `(A0'A0).toarray()` OOM: alternative dense fallback or richer sparse strategy (`imputation.py:1531`)
- HAD mass-point `vcov_type ∈ {hc2, hc2_bm}`: 2SLS-specific leverage derivation (`had.py::_fit_mass_point_2sls`)
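
The weighted Bell-McCaffrey item rests on the difference between the unscaled hat matrix `X (X'WX)^{-1} X' W` and the transformed one `sqrt(W) X (X'WX)^{-1} X' sqrt(W)`. The following is a minimal pure-Python sketch of that difference, assuming a hypothetical 4-observation design (intercept plus one regressor) with made-up weights; none of the names below come from `linalg.py`. Both matrices are idempotent, but only the transformed one is symmetric.

```python
import math

# Hypothetical design: intercept + one regressor, with unequal weights.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 4.0]]
w = [1.0, 2.0, 0.5, 3.0]
n, k = len(X), 2

# X'WX is 2x2, so invert it with the closed-form adjugate.
a = sum(w[i] * X[i][0] * X[i][0] for i in range(n))
b = sum(w[i] * X[i][0] * X[i][1] for i in range(n))
d = sum(w[i] * X[i][1] * X[i][1] for i in range(n))
det = a * d - b * b
inv = [[d / det, -b / det], [-b / det, a / det]]


def core(i, j):
    """x_i' (X'WX)^{-1} x_j."""
    return sum(X[i][p] * inv[p][q] * X[j][q] for p in range(k) for q in range(k))


# Unscaled form X (X'WX)^{-1} X' W.
H_unscaled = [[core(i, j) * w[j] for j in range(n)] for i in range(n)]
# Transformed form sqrt(W) X (X'WX)^{-1} X' sqrt(W).
H_star = [[math.sqrt(w[i]) * core(i, j) * math.sqrt(w[j]) for j in range(n)]
          for i in range(n)]


def max_asymmetry(H):
    return max(abs(H[i][j] - H[j][i]) for i in range(n) for j in range(n))


def max_idempotency_gap(H):
    return max(abs(sum(H[i][m] * H[m][j] for m in range(n)) - H[i][j])
               for i in range(n) for j in range(n))


print(max_asymmetry(H_unscaled), max_asymmetry(H_star))
```

The residual-maker is then `I - H_star`, which inherits symmetry and idempotency from `H_star`.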
51 changes: 51 additions & 0 deletions benchmarks/R/generate_clubsandwich_golden.R
@@ -232,6 +232,57 @@ output$mpd_clustered_avg_att_dof <- list(
  n_post_periods = length(post_names)
)

# --- TwoWayFixedEffects HC2 / HC2-BM scenario (Gate 1 lift PR) ---------------
# Mirrors TwoWayFixedEffects(vcov_type in {"hc2","hc2_bm"}) on a panel with
# 4 calendar periods collapsed to 2 via a binary post indicator. TWFE's
# `time` parameter is the post indicator, so the FE design is
# factor(unit) + factor(post), NOT
# factor(period). HC2 SE pinned via sandwich::vcovHC; one-way HC2-BM DOF
# via the singleton-cluster CR2 trick (Pustejovsky-Tipton 2018 Section 3.3
# — CR2 with cluster=seq_len(n) reduces to Imbens-Kolesar BM). CR2-BM
# clustered at unit pinned separately for the auto-cluster path.

set.seed(20260518)
n_twfe_units <- 8
n_twfe_periods <- 4
twfe_treated_units <- c(1, 3, 5, 7)
twfe_post_start <- 3
d_twfe <- expand.grid(unit = seq_len(n_twfe_units),
                      period = seq_len(n_twfe_periods))
d_twfe$treated <- as.integer(d_twfe$unit %in% twfe_treated_units)
d_twfe$post <- as.integer(d_twfe$period >= twfe_post_start)
d_twfe$treat_post <- d_twfe$treated * d_twfe$post
twfe_alpha_unit <- rnorm(n_twfe_units, mean = 0, sd = 1)
twfe_gamma_time <- rnorm(n_twfe_periods, mean = 0, sd = 0.5)
d_twfe$y <- 1.0 + 0.7 * d_twfe$treat_post +
  twfe_alpha_unit[d_twfe$unit] +
  twfe_gamma_time[d_twfe$period] +
  rnorm(nrow(d_twfe), sd = 0.4)
fit_twfe <- lm(y ~ treat_post + factor(unit) + factor(post), data = d_twfe)
vcov_twfe_hc2 <- sandwich::vcovHC(fit_twfe, type = "HC2")
# Singleton-cluster CR2 trick for one-way HC2-BM DOF.
vcov_twfe_cr2_one_way <- vcovCR(fit_twfe, cluster = seq_len(nrow(d_twfe)),
                                type = "CR2")
ct_twfe_one_way <- coef_test(fit_twfe, vcov = vcov_twfe_cr2_one_way)
# CR2-BM clustered at unit (the TWFE auto-cluster default).
vcov_twfe_cr2_unit <- vcovCR(fit_twfe, cluster = d_twfe$unit, type = "CR2")
ct_twfe_unit <- coef_test(fit_twfe, vcov = vcov_twfe_cr2_unit)
output$twfe_two_period <- list(
  unit = d_twfe$unit,
  period = d_twfe$period,
  treated = d_twfe$treated,
  post = d_twfe$post,
  treat_post = d_twfe$treat_post,
  y = d_twfe$y,
  coef = as.numeric(coef(fit_twfe)),
  coef_names = names(coef(fit_twfe)),
  vcov_hc2 = as.numeric(vcov_twfe_hc2),
  vcov_hc2_shape = dim(vcov_twfe_hc2),
  vcov_cr2_one_way = as.numeric(vcov_twfe_cr2_one_way),
  dof_bm_one_way = as.numeric(ct_twfe_one_way$df_Satt),
  vcov_cr2_unit = as.numeric(vcov_twfe_cr2_unit),
  dof_bm_unit = as.numeric(ct_twfe_unit$df_Satt)
)

output$meta <- list(
  source = "clubSandwich",
  clubSandwich_version = as.character(packageVersion("clubSandwich")),
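
The deterministic columns of the TWFE scenario can be rebuilt on the Python side without any random draws, which gives a cheap sanity check that a test fixture is aligned with the R row order. The following is a minimal sketch, assuming only the constants shown in the R hunk above; the variable names are illustrative and not taken from the test suite. `expand.grid` varies its first factor fastest, so `unit` cycles inside each `period`.

```python
# Constants copied from the R scenario above.
n_units, n_periods = 8, 4
treated_units = {1, 3, 5, 7}
post_start = 3

# expand.grid(unit, period): unit is the fast index, period the slow one.
rows = [(u, p) for p in range(1, n_periods + 1) for u in range(1, n_units + 1)]
unit = [u for u, _ in rows]
period = [p for _, p in rows]

# Same indicator construction as the R code.
treated = [int(u in treated_units) for u in unit]
post = [int(p >= post_start) for p in period]
treat_post = [t * q for t, q in zip(treated, post)]

print(len(rows), sum(treat_post))
```

Only `y` depends on the R random stream, so it has to be read from the golden file rather than regenerated.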
18 changes: 17 additions & 1 deletion benchmarks/data/clubsandwich_cr2_golden.json
@@ -82,11 +82,27 @@
"reference_period": 1,
"n_post_periods": 3
},
"twfe_two_period": {
"unit": [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8],
"period": [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4],
"treated": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
"post": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
"treat_post": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
"y": [2.650364154679115, 0.5250992296125498, 2.538325280436974, 1.803468688851707, 0.390388546319389, 0.9019986445013289, 0.8232587072837394, 0.8653454679955752, 2.08429332526299, 0.0638119452881384, 2.722749318503621, 0.8169670582027474, 0.2459517838623851, 0.6804813712532132, 0.8556684840526531, 1.398839980876758, 2.620848972130513, 0.09909203941612038, 2.832338679868128, 0.7704342845402335, 0.8560980445011976, 0.5425351511582146, 0.9299248903311188, 1.744787275814005, 3.087638383603313, -0.7232315532492211, 2.27084735211901, 0.7045197493403264, 0.9856491992396943, 0.2839259051193889, 0.6881762347785356, 0.1525997107657043],
"coef": [2.488253574158318, 0.6802339974809865, -2.279476294911592, -0.0197210511870492, -1.246821764944735, -1.991264315438316, -1.668433942170453, -1.78652912980747, -1.230276101315478, -0.4351687279596561],
"coef_names": ["(Intercept)", "treat_post", "factor(unit)2", "factor(unit)3", "factor(unit)4", "factor(unit)5", "factor(unit)6", "factor(unit)7", "factor(unit)8", "factor(post)1"],
"vcov_hc2": [0.03700174102669025, -0.01209920750084357, -0.03700174102669036, -0.034148276882185, -0.0370017410266902, -0.03090579095064369, -0.03700174102669018, -0.03160431930844195, -0.03700174102669018, -7.600244650009306e-18, -0.01209920750084357, 0.07671968417768542, 0.03015788065425394, 0.008722820785758987, 0.06270949142009662, 0.002237848922676195, 0.04028563730391647, 0.003634905638272913, -0.01045609013314296, -0.05718235232384992, -0.03700174102669031, 0.03015788065425393, 0.08384338199488832, 0.03414827688218509, 0.0570406314820595, 0.03090579095064373, 0.04582870442396941, 0.03160431930844203, 0.02045784070543967, -0.01805867315341034, -0.034148276882185, 0.008722820785758943, 0.03414827688218509, 0.05520728093577265, 0.03414827688218493, 0.02978686648930553, 0.03414827688218493, 0.02978686648930548, 0.03414827688218493, 3.26088479668859e-17, -0.03700174102669024, 0.06270949142009664, 0.05704063148205959, 0.034148276882185, 0.1194701811546974, 0.0309057909506437, 0.06210450980689071, 0.03160431930844199, 0.03673364608836099, -0.05061028391925308, -0.03090579095064366, 0.002237848922676207, 0.03090579095064374, 0.02978686648930551, 0.03090579095064364, 0.04312575858084388, 0.03090579095064361, 0.02978686648930553, 0.03090579095064363, 9.288215089373668e-18, -0.03700174102669024, 0.04028563730391648, 0.04582870442396951, 0.034148276882185, 0.06210450980689072, 0.03090579095064367, 0.0564599920123676, 0.03160431930844197, 0.02552171903027089, -0.02818642980307293, -0.03160431930844198, 0.003634905638272877, 0.03160431930844206, 0.0297868664893055, 0.03160431930844194, 0.02978686648930557, 0.03160431930844192, 0.03939002087733653, 0.03160431930844193, 1.566395541000208e-17, -0.03700174102669019, -0.01045609013314297, 0.02045784070543973, 0.034148276882185, 0.03673364608836095, 0.03090579095064368, 0.02552171903027083, 0.03160431930844194, 0.1340805551581075, 0.02255529763398656, 4.385790535446187e-19, -0.05718235232384991, -0.01805867315341033, 
-1.35757095501444e-17, -0.05061028391925306, 1.208677012866869e-17, -0.02818642980307294, -4.970499585804175e-17, 0.02255529763398653, 0.05718235232384995],
"vcov_hc2_shape": [10, 10],
"vcov_cr2_one_way": [0.03700174102669018, -0.01209920750084357, -0.03700174102669025, -0.0341482768821849, -0.03700174102669003, -0.03090579095064359, -0.03700174102669005, -0.03160431930844183, -0.03700174102669001, 4.444145962929492e-17, -0.01209920750084359, 0.07671968417768549, 0.03015788065425398, 0.00872282078575895, 0.06270949142009666, 0.002237848922676209, 0.04028563730391651, 0.003634905638272878, -0.01045609013314298, -0.05718235232384999, -0.03700174102669025, 0.03015788065425397, 0.08384338199488831, 0.03414827688218498, 0.05704063148205939, 0.03090579095064365, 0.04582870442396931, 0.03160431930844191, 0.02045784070543955, -0.01805867315341042, -0.03414827688218493, 0.008722820785758943, 0.03414827688218502, 0.05520728093577257, 0.0341482768821848, 0.02978686648930545, 0.03414827688218481, 0.02978686648930537, 0.03414827688218482, -9.024515456557491e-18, -0.0370017410266902, 0.06270949142009669, 0.05704063148205952, 0.03414827688218489, 0.1194701811546973, 0.03090579095064362, 0.06210450980689061, 0.03160431930844186, 0.03673364608836086, -0.05061028391925316, -0.0309057909506436, 0.002237848922676207, 0.03090579095064367, 0.02978686648930542, 0.03090579095064351, 0.0431257585808438, 0.0309057909506435, 0.02978686648930542, 0.03090579095064351, -3.928404223797695e-17, -0.0370017410266902, 0.0402856373039165, 0.04582870442396944, 0.03414827688218489, 0.06210450980689061, 0.0309057909506436, 0.05645999201236749, 0.03160431930844185, 0.02552171903027075, -0.02818642980307301, -0.03160431930844192, 0.00363490563827287, 0.03160431930844198, 0.02978686648930541, 0.03160431930844181, 0.02978686648930549, 0.03160431930844181, 0.03939002087733642, 0.03160431930844181, -3.290830191734854e-17, -0.03700174102669015, -0.01045609013314295, 0.02045784070543967, 0.03414827688218488, 0.03673364608836084, 0.0309057909506436, 0.02552171903027073, 0.03160431930844181, 0.1340805551581074, 0.0225552976339865, 2.819415466917353e-17, -0.05718235232384999, 
-0.01805867315341039, 4.638886947612098e-18, -0.05061028391925316, -2.260769939086745e-17, -0.02818642980307301, -4.623554890608812e-17, 0.0225552976339865, 0.05718235232385],
"dof_bm_one_way": [3.425821064552667, 21.999999999999837, 6.851642129105291, 5.761904761904771, 6.851642129105294, 5.761904761904765, 6.851642129105291, 5.761904761904764, 6.85164212910529, 10.999999999999979],
"vcov_cr2_unit": [0.007651392098640002, -0.01530278419727998, -0.007651392098640009, -1.340972185906478e-17, -0.007651392098640004, -1.786362460978703e-17, -0.007651392098640002, -1.652260459528918e-17, -0.007651392098640006, 8.815980097977497e-19, -0.01530278419727999, 0.04018425503992974, 0.02009212751996491, 2.723300676932653e-17, 0.0200921275199649, 3.747012207819093e-17, 0.02009212751996489, 3.337015158794735e-17, 0.0200921275199649, -0.009578686645369807, -0.00765139209864001, 0.02009212751996491, 0.01004606375998247, 1.361650338466344e-17, 0.01004606375998247, 1.873506103909563e-17, 0.01004606375998246, 1.668507579397384e-17, 0.01004606375998247, -0.00478934332268491, -1.340972185906478e-17, 2.723300676932653e-17, 1.361650338466343e-17, 1.668199508743466e-31, 1.361650338466343e-17, 1.656912369884294e-31, 1.361650338466342e-17, 1.631551769724017e-31, 1.361650338466343e-17, -4.135630511972947e-19, -0.007651392098640005, 0.0200921275199649, 0.01004606375998247, 1.361650338466343e-17, 0.01004606375998247, 1.873506103909562e-17, 0.01004606375998246, 1.668507579397383e-17, 0.01004606375998247, -0.004789343322684908, -1.786362460978703e-17, 3.747012207819092e-17, 1.873506103909563e-17, 1.656912369884295e-31, 1.873506103909562e-17, 1.697016749718632e-31, 1.873506103909561e-17, 1.615308081491078e-31, 1.873506103909562e-17, -1.742872858617179e-18, -0.007651392098640003, 0.02009212751996489, 0.01004606375998247, 1.361650338466342e-17, 0.01004606375998246, 1.873506103909562e-17, 0.01004606375998246, 1.668507579397383e-17, 0.01004606375998246, -0.004789343322684904, -1.652260459528918e-17, 3.337015158794737e-17, 1.668507579397385e-17, 1.631551769724017e-31, 1.668507579397384e-17, 1.615308081491077e-31, 1.668507579397383e-17, 1.665183639542917e-31, 1.668507579397384e-17, -3.249423973693037e-19, -0.007651392098640007, 0.0200921275199649, 0.01004606375998247, 1.361650338466343e-17, 0.01004606375998247, 1.873506103909563e-17, 0.01004606375998246, 1.668507579397384e-17, 
0.01004606375998246, -0.004789343322684903, 3.612539513184754e-18, -0.009578686645369814, -0.004789343322684914, -4.135630511972954e-19, -0.004789343322684916, -1.742872858617175e-18, -0.004789343322684907, -3.249423973692947e-19, -0.004789343322684909, 0.009578686645369807],
"dof_bm_unit": [3, 6.000000000000002, 5.999999999999998, 1.027080069278844, 6.000000000000001, 1.038147635656014, 6.000000000000003, 1.078234257225623, 5.999999999999999, 2.999999999999998]
},
"meta": {
"source": "clubSandwich",
"clubSandwich_version": "0.7.0",
"R_version": "R version 4.5.2 (2025-10-31)",
-"generated_at": "2026-05-18 01:50:55 UTC",
+"generated_at": "2026-05-19 01:30:25 UTC",
"note": "CR2 Bell-McCaffrey cluster-robust parity target for diff_diff._compute_cr2_bm"
}
}
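
The `vcov_*` entries are written with R's `as.numeric()`, which flattens a matrix column by column, and `vcov_hc2_shape` records the dimensions. The following is a minimal sketch of how a parity test might turn such an entry back into a matrix and read off a standard error. The helper names are hypothetical, and the payload is a 2x2 excerpt (first two coefficients only) copied from the `twfe_two_period` values above rather than the full 10x10 entry.

```python
import math

# 2x2 excerpt of the golden entry: "(Intercept)" and "treat_post" only.
excerpt = {
    "coef_names": ["(Intercept)", "treat_post"],
    "vcov_hc2": [0.03700174102669025, -0.01209920750084357,
                 -0.01209920750084357, 0.07671968417768542],
    "vcov_hc2_shape": [2, 2],
}


def unflatten_column_major(flat, shape):
    """Rebuild a matrix from a column-major flat list (R's as.numeric order)."""
    n_rows, n_cols = shape
    if len(flat) != n_rows * n_cols:
        raise ValueError("flat length does not match shape")
    return [[flat[j * n_rows + i] for j in range(n_cols)] for i in range(n_rows)]


def hc2_standard_error(payload, name):
    """Square root of the HC2 variance for the named coefficient."""
    vcov = unflatten_column_major(payload["vcov_hc2"], payload["vcov_hc2_shape"])
    idx = payload["coef_names"].index(name)
    return math.sqrt(vcov[idx][idx])


se_treat_post = hc2_standard_error(excerpt, "treat_post")
print(se_treat_post)
```

Because a covariance matrix is symmetric, row-major and column-major readings agree up to roundoff, but keeping the column-major convention explicit avoids surprises if a non-symmetric array is ever stored the same way.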