diff --git a/TODO.md b/TODO.md
index 42ddd3a5..70eef04e 100644
--- a/TODO.md
+++ b/TODO.md
@@ -101,7 +101,6 @@ Deferred items from PR reviews that were not addressed before merge.
 | Weighted one-way Bell-McCaffrey (`vcov_type="hc2_bm"` + `weights`, no cluster) currently raises `NotImplementedError`. `_compute_bm_dof_from_contrasts` builds its hat matrix from the unscaled design via `X (X'WX)^{-1} X' W`, but `solve_ols` solves the WLS problem by transforming to `X* = sqrt(w) X`, so the correct symmetric idempotent residual-maker is `M* = I - sqrt(W) X (X'WX)^{-1} X' sqrt(W)`. Rederive the Satterthwaite `(tr G)^2 / tr(G^2)` ratio on the transformed design and add weighted parity tests before lifting the guard. | `linalg.py::_compute_bm_dof_from_contrasts`, `linalg.py::_validate_vcov_args` | Phase 1a | Medium |
 | HC2 / HC2 + Bell-McCaffrey on absorbed-FE fits currently raises `NotImplementedError` in three places: `TwoWayFixedEffects` unconditionally; `DifferenceInDifferences(absorb=..., vcov_type in {"hc2","hc2_bm"})`; `MultiPeriodDiD(absorb=..., vcov_type in {"hc2","hc2_bm"})`. Within-transformation preserves coefficients and residuals under FWL but not the hat matrix, so the reduced-design `h_ii` is not the diagonal of the full FE projection and CR2's block adjustment `A_g = (I - H_gg)^{-1/2}` is likewise wrong on absorbed cluster blocks. Lifting the guard needs HC2/CR2-BM computed from the full absorbed projection (unit/time FE dummies reconstructed internally, or a FE-aware hat-matrix formulation) and a parity harness against a full-dummy OLS run or R `fixest`/`clubSandwich`. HC1/CR1 are unaffected by this because they have no leverage term. | `twfe.py::fit`, `estimators.py::DifferenceInDifferences.fit`, `estimators.py::MultiPeriodDiD.fit` | Phase 1a | Medium |
 | Weighted CR2 Bell-McCaffrey cluster-robust (`vcov_type="hc2_bm"` + `cluster_ids` + `weights`) currently raises `NotImplementedError`. Weighted hat matrix and residual rebalancing need threading per clubSandwich WLS handling. | `linalg.py::_compute_cr2_bm` | Phase 1a | Medium |
-| Regenerate `benchmarks/data/clubsandwich_cr2_golden.json` from R (`Rscript benchmarks/R/generate_clubsandwich_golden.R`). Current JSON has `source: python_self_reference` as a stability anchor until an authoritative R run. | `benchmarks/R/generate_clubsandwich_golden.R` | Phase 1a | Medium |
 | `honest_did.py:1907` `np.linalg.solve(A_sys, b_sys) / except LinAlgError: continue` is a silent basis-rejection in the vertex-enumeration loop that is algorithmically intentional (try the next basis). Consider surfacing a count of rejected bases as a diagnostic when ARP enumeration exhausts, so users see when the vertex search was heavily constrained. Not a silent failure in the sense of the Phase 2 audit (the algorithm is supposed to skip), but the diagnostic would help debug borderline cases. | `honest_did.py` | #334 | Low |
 | Unify Rust local-method `estimate_model` solver path to `solve_wls_svd` (the same SVD helper used by the global-method since PR #348) for sub-1e-14 bootstrap SE parity. Current local-method bootstrap parity test (`tests/test_rust_backend.py::TestTROPRustEdgeCaseParity::test_bootstrap_seed_reproducibility_local`) passes at `atol=1e-5` — the residual ~1e-7 gap is roundoff between Rust's `estimate_model` matrix factorization and numpy's `lstsq`, which accumulates differently across per-replicate bootstrap fits. Main-fit ATT parity is regime-dependent (`atol=1e-14` for `lambda_nn=inf`, `atol=1e-10` for finite `lambda_nn` — see `test_local_method_main_fit_parity`); the bootstrap gap is a same-solver-path roundoff concern and not a user-visible correctness bug. | `rust/src/trop.rs::estimate_model`, `rust/src/linalg.rs::solve_wls_svd` | follow-up | Low |
 | Rust multiplier-bootstrap weight RNG (`generate_bootstrap_weights_batch` in `rust/src/bootstrap.rs:9-10, 57-75`) uses `Xoshiro256PlusPlus::seed_from_u64(seed + i)` per row for Rademacher/Mammen/Webb generation. If any Python caller (SDID / efficient-DiD multiplier bootstrap) has a numpy-canonical equivalent, the two backends likely diverge under the same seed. Audit Python callers (`diff_diff/sdid.py`, `diff_diff/efficient_did_bootstrap.py`, `diff_diff/bootstrap_utils.py::generate_bootstrap_weights_batch_numpy`) for parity-test gaps. Same fix shape as TROP RNG parity (PR #354): pre-generate weights in Python via numpy and pass them to Rust through PyO3. | `rust/src/bootstrap.rs`, `diff_diff/bootstrap_utils.py` | follow-up | Medium |
@@ -166,7 +165,6 @@ Ordered paydown view across the tables above. Tier A → D is by effort × risk,
 
 #### Tier A — Quick wins (≤1 day, ≤3 CI rounds expected)
 
-- Regenerate `benchmarks/data/clubsandwich_cr2_golden.json` from R via `benchmarks/R/generate_clubsandwich_golden.R` (currently `source: python_self_reference`)
 - HonestDiD `test_m0_short_circuit`: replace wall-clock `elapsed < 0.5s` proxy with a state flag (`tests/test_methodology_honest_did.py:246`)
 - EfficientDiD `control_group="last_cohort"` REGISTRY-vs-code alignment with `anticipation>0` (`efficient_did.py`, one design decision)
 - TripleDifference: add `generate_ddd_panel_data` for panel DDD power analysis (`prep_dgp.py`, `power.py`)
diff --git a/benchmarks/R/generate_clubsandwich_golden.R b/benchmarks/R/generate_clubsandwich_golden.R
index f1673dda..1de9490c 100644
--- a/benchmarks/R/generate_clubsandwich_golden.R
+++ b/benchmarks/R/generate_clubsandwich_golden.R
@@ -7,7 +7,7 @@
 #   Rscript benchmarks/R/generate_clubsandwich_golden.R
 #
 # Requirements:
-#   clubSandwich (CRAN), jsonlite, readr
+#   clubSandwich (CRAN), jsonlite
 #
 # Output:
 #   benchmarks/data/clubsandwich_cr2_golden.json
@@ -50,12 +50,12 @@ for (nm in names(datasets)) {
   d <- datasets[[nm]]
   fit <- lm(y ~ x, data = d)
   vcov_cr2 <- vcovCR(fit, cluster = d$cluster, type = "CR2")
-  # Per-contrast Bell-McCaffrey DOF: one per coefficient via a unit contrast.
+  # Per-coefficient Bell-McCaffrey Satterthwaite DOF via coef_test()$df_Satt.
+  # (clubSandwich 0.7+ removed `Wald_test(..., test="Satterthwaite")`; the
+  # `df_Satt` column from coef_test() is the idiomatic per-coefficient form
+  # and is numerically identical to the old per-unit-contrast path.)
+  ct <- coef_test(fit, vcov = vcov_cr2)
   coef_names <- names(coef(fit))
-  dof_vec <- sapply(coef_names, function(nm_coef) {
-    ctr <- setNames(as.numeric(names(coef(fit)) == nm_coef), names(coef(fit)))
-    Wald_test(fit, constraints = matrix(ctr, 1), vcov = vcov_cr2, test = "Satterthwaite")$df
-  })
   output[[nm]] <- list(
     x = d$x,
     y = d$y,
@@ -64,7 +64,7 @@ for (nm in names(datasets)) {
     coef_names = coef_names,
     vcov_cr2 = as.numeric(vcov_cr2),
     vcov_shape = dim(vcov_cr2),
-    dof_bm = as.numeric(dof_vec),
+    dof_bm = as.numeric(ct$df_Satt),
     cluster_sizes = as.numeric(table(d$cluster))
   )
 }
diff --git a/benchmarks/data/clubsandwich_cr2_golden.json b/benchmarks/data/clubsandwich_cr2_golden.json
index 29d3d3dd..1e1b3417 100644
--- a/benchmarks/data/clubsandwich_cr2_golden.json
+++ b/benchmarks/data/clubsandwich_cr2_golden.json
@@ -1,480 +1,42 @@
 {
   "balanced_small": {
-    "x": [
-      0.9435325056105539,
-      0.35942103334157316,
-      0.7848054119699771,
-      0.5912781852294118,
-      0.2943285611969282,
-      0.9227256864229143,
-      0.8693315446785694,
-      0.36413842629052284,
-      0.973176814632894,
-      0.22452433073513123,
-      0.8054958679281221,
-      0.6808962312808441,
-      0.47106052122569997,
-      0.030805470551009684,
-      0.8947982030827969,
-      0.5736325238146748,
-      0.39030825765317734,
-      0.354679037214903,
-      0.6519730194604051,
-      0.3470284246126759,
-      0.50757990814917,
-      0.37093570271031107,
-      0.05520285394721014,
-      0.2504092003297004,
-      0.8409630382078536,
-      0.8181544146253424,
-      0.667325426235869,
-      0.4705875910980839,
-      0.9698444927156448,
-      0.8402607539796008
-    ],
-    "y": [
-      0.9124653444274926,
-      0.4072731282261453,
-      0.6441477445320447,
-      0.6245322729646807,
-      0.3940054202909884,
-      0.35804100091454355,
-      1.2073548572143984,
-      0.7569702431699881,
-      1.0735173904729616,
-      0.24506888765882986,
-      0.3155516277058717,
-      0.3214244220887048,
-      0.47584839029817083,
-      0.6428414009886432,
-      1.0784737561824687,
-      1.0295056172852604,
-      1.074654892499979,
-      0.7325318128152004,
-      1.5443613535937253,
-      1.2483762247493897,
-      1.3339598940009474,
-      1.1013735885136728,
-      1.2772239312210443,
-      1.3363012870469466,
-      0.9458188110658502,
-      0.8533370526745974,
-      1.0314299477607014,
-      1.102902757481314,
-      1.1075988192549155,
-      1.083261134760365
-    ],
-    "cluster": [
-      1,
-      1,
-      1,
-      1,
-      1,
-      1,
-      2,
-      2,
-      2,
-      2,
-      2,
-      2,
-      3,
-      3,
-      3,
-      3,
-      3,
-      3,
-      4,
-      4,
-      4,
-      4,
-      4,
-      4,
-      5,
-      5,
-      5,
-      5,
-      5,
-      5
-    ],
-    "coef": [
-      0.8339617023201237,
-      0.07167199961782436
-    ],
-    "coef_names": [
-      "(Intercept)",
-      "x"
-    ],
-    "vcov_cr2": [
-      0.09782504656993866,
-      -0.09922720822392328,
-      -0.09922720822392324,
-      0.11191805849057651
-    ],
-    "vcov_shape": [
-      2,
-      2
-    ],
-    "dof_bm": [
-      3.046739807991152,
-      3.867820152608498
-    ]
+    "x": [0.3721983763389289, 0.04382481542415917, 0.7096840182784945, 0.6576903965324163, 0.2498557232320309, 0.3000548330601305, 0.5848666259553283, 0.3334671433549374, 0.6220119637437165, 0.5458285543136299, 0.8797957301139832, 0.7068747407756746, 0.7319725945126265, 0.9316344279795885, 0.4551205944735557, 0.5903197291772813, 0.8204360948875546, 0.2241184804588556, 0.4116668293718249, 0.03861056081950665, 0.7007115453016013, 0.9568374615628272, 0.2133520019706339, 0.6610615001991391, 0.9233188820071518, 0.7957197614014149, 0.07121255435049534, 0.3894077676814049, 0.4064512162003666, 0.6593550781253725],
+    "y": [1.05667907248243, 1.066947846044605, 1.204576917404174, 0.9393918637932542, 1.177146043692329, 0.7712803978302757, 0.9609694640137826, 0.7179921799716888, 0.9795764040580304, 0.9476640198770669, 1.19400793499814, 0.9843903995036551, 1.596792196243911, 1.080428664799787, 1.494763722918098, 1.179517769839397, 1.473047550838304, 1.325359020870347, 0.4626772743433495, 0.7001639163946355, 1.038012356684575, 1.221226309323235, 0.2339753492538406, 0.8297946161223602, 0.2154287997879057, 0.4532916275621509, 0.1242393529242412, 0.02833331953496379, 0.1200535751172539, 0.007748016921758638],
+    "cluster": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5],
+    "coef": [0.6300385597851063, 0.4180976333279622],
+    "coef_names": ["(Intercept)", "x"],
+    "vcov_cr2": [0.05521248306042478, -0.03274214553627666, -0.03274214553627678, 0.06774697283981687],
+    "vcov_shape": [2, 2],
+    "dof_bm": [3.212621630843576, 3.575881216615637],
+    "cluster_sizes": [6, 6, 6, 6, 6]
   },
   "unbalanced_medium": {
-    "x": [
-      0.3462874999237423,
-      0.443655907974409,
-      0.815755133576451,
-      0.6874916807834863,
-      0.30042599121099556,
-      0.44014173776833065,
-      0.5635108182432734,
-      0.8367388142376376,
-      0.2123988054077447,
-      0.914502769923609,
-      0.31540445328100175,
-      0.7782568094004593,
-      0.5247419459408266,
-      0.3489167128946181,
-      0.557303561462073,
-      0.48586633346308306,
-      0.1524714507792745,
-      0.44748254053693426,
-      0.9278430129623557,
-      0.6581790789559127,
-      0.11995237785491086,
-      0.7014607669463622,
-      0.6993877122197518,
-      0.4949398323613621,
-      0.02940081961158314,
-      0.5543191925927794,
-      0.2823434957508282,
-      0.5427086647116763,
-      0.38029675833106513,
-      0.4025733964767805,
-      0.9375123829093304,
-      0.9440645103303017,
-      0.1724656629410518,
-      0.246872424727282,
-      0.9844803926901483,
-      0.5559872436147494,
-      0.6878771576696437,
-      0.5882087419536575,
-      0.12223333644812606,
-      0.5527777076815845,
-      0.04316256681068387,
-      0.0010296217941053731,
-      0.019728001928271177,
-      0.5033598479801504,
-      0.08921189595557244,
-      0.03511513041575487,
-      0.09666762523940087,
-      0.5241598869956512,
-      0.6325041926946793,
-      0.2154994386324135,
-      0.9150819636448457,
-      0.18191716317896556
-    ],
-    "y": [
-      0.206431750468072,
-      0.52914126374223,
-      0.948372476430183,
-      1.462848242343694,
-      1.1787490896437192,
-      1.2232918502435313,
-      1.49883714029803,
-      1.6892231717734079,
-      1.0692241224594425,
-      1.7836788181223873,
-      1.7498094746587713,
-      1.6074538152426974,
-      1.90081335871295,
-      1.8000176179263754,
-      1.7859040936875021,
-      1.8352051679895616,
-      1.4032402426917756,
-      1.7463091897477208,
-      1.8233311000151493,
-      1.373229624702863,
-      0.8988692201058794,
-      1.7028444873926833,
-      1.1367695830608349,
-      1.6372544722023115,
-      1.0930977055066182,
-      2.193470097143512,
-      2.004641154578546,
-      2.0681249662828667,
-      2.1985314952996164,
-      1.9785659459956395,
-      2.097973041979551,
-      2.273697189865885,
-      1.529964015810914,
-      1.7325466539629746,
-      1.7756476746468373,
-      1.9205833759797464,
-      2.1275707263673027,
-      2.061119528784678,
-      1.6714765399702147,
-      1.3753867950348604,
-      1.5654904578276585,
-      1.8951529649563046,
-      0.5616087105402756,
-      1.0365570638248616,
-      0.5889041512587929,
-      1.010971201580437,
-      0.30497699123241295,
-      0.8359690569328175,
-      0.9175163185192178,
-      0.8870270600603117,
-      1.1408269122291879,
-      0.7996542383725431
-    ],
-    "cluster": [
-      1,
-      1,
-      1,
-      2,
-      2,
-      2,
-      2,
-      3,
-      3,
-      3,
-      3,
-      3,
-      4,
-      4,
-      4,
-      4,
-      4,
-      4,
-      5,
-      5,
-      5,
-      5,
-      5,
-      5,
-      5,
-      6,
-      6,
-      6,
-      6,
-      6,
-      6,
-      6,
-      6,
-      7,
-      7,
-      7,
-      7,
-      7,
-      7,
-      7,
-      7,
-      7,
-      8,
-      8,
-      8,
-      8,
-      8,
-      8,
-      8,
-      8,
-      8,
-      8
-    ],
-    "coef": [
-      1.1239728426835145,
-      0.7158670108521947
-    ],
-    "coef_names": [
-      "(Intercept)",
-      "x"
-    ],
-    "vcov_cr2": [
-      0.0769099709181706,
-      -0.0648802497792549,
-      -0.0648802497792549,
-      0.07456720905801277
-    ],
-    "vcov_shape": [
-      2,
-      2
-    ],
-    "dof_bm": [
-      4.0976627009493924,
-      4.560090785535048
-    ]
+    "x": [0.128862570039928, 0.6701512336730957, 0.3297898611053824, 0.2037271915469319, 0.368284564698115, 0.7823035586625338, 0.1984838922508061, 0.2190779817756265, 0.4419498639181256, 0.6422768677584827, 0.07627971330657601, 0.2270524124614894, 0.2186972440686077, 0.7315562770236284, 0.02659884956665337, 0.2013396944385022, 0.6228771209716797, 0.6576305113267154, 0.6803408381529152, 0.1647861595265567, 0.4449463130440563, 0.5335798722226173, 0.8809765174519271, 0.6302034477703273, 0.218909045914188, 0.4129483439028263, 0.457392189418897, 0.4145040600560606, 0.9525077047292143, 0.8236416757572442, 0.3981174814980477, 0.1786765605211258, 0.9758388390764594, 0.8111260565929115, 0.2872961156535894, 0.462518839398399, 0.8308089969214052, 0.4791092942468822, 0.9257831228896976, 0.5312510745134205, 0.6328634682577103, 0.8924754566978663, 0.8108281444292516, 0.2660467734094709, 0.8168796338140965, 0.306065121665597, 0.7733509626705199, 0.897745214169845, 0.6402185300830752, 0.2910342407412827, 0.5996965372469276, 0.03418785543181002],
+    "y": [1.281270710066877, 1.722881312144584, 1.281775401696968, 1.413528483141223, 1.043366086227046, 1.45241495488675, 1.675613928059604, 0.8983891882314028, 0.8686731426084141, 1.093944759246245, 0.9249145411811377, 0.8493184917053391, 2.85305770155655, 2.514730609257615, 2.436761333058375, 2.341302160178671, 2.505201653980506, 2.944303753193914, 2.227941048801415, 2.412340826904834, 2.207925805301888, 2.2923019573263, 2.342348481613817, 2.226590726847836, 2.086344787976651, 1.656871526710177, 1.904644888582141, 1.862885866969904, 1.964874740458616, 1.520180307273704, 1.685953582170327, 1.515607792243089, 2.005418184074291, 0.8171658134910009, 0.6646849084479258, 1.019695539332419, 1.070006641198372, 0.7096919907678358, 1.075733955227796, 0.8546317660179634, 0.903663211647689, 0.9348092890610543, 0.4821668914973627, -0.04429208656038075, 0.3016826155382485, 0.4284240023231432, 0.663698185175872, 0.9351900235072026, 0.5567615926932054, 0.4072033162858779, 0.1744773447144307, 0.5377150345076495],
+    "cluster": [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8],
+    "coef": [1.407664197609156, -0.02639903174573305],
+    "coef_names": ["(Intercept)", "x"],
+    "vcov_cr2": [0.1192052700634108, -0.04806533789647253, -0.04806533789647224, 0.09454689012283392],
+    "vcov_shape": [2, 2],
+    "dof_bm": [6.387834602264484, 5.850537239807559],
+    "cluster_sizes": [3, 4, 5, 6, 7, 8, 9, 10]
   },
   "singletons_present": {
-    "x": [
-      0.21443241914065292,
-      0.41682182086464903,
-      0.807695239893341,
-      0.27392327582317477,
-      0.8157795313236993,
-      0.10763306403508877,
-      0.43640245442460357,
-      0.8388322165462259,
-      0.19866476008995304,
-      0.3026576176172062,
-      0.3431542671793545,
-      0.21186815334857012,
-      0.9208492603469947,
-      0.5571125190421009,
-      0.8457028297541836,
-      0.5345114831042415,
-      0.24845868416282413,
-      0.2642429245246207,
-      0.9465367828433652,
-      0.20945542694752117,
-      0.009759083685863867,
-      0.6259345454282155,
-      0.3392297185756302,
-      0.46294549338368507,
-      0.1317216186447715,
-      0.48751512348895587,
-      0.15110039365510097,
-      0.630530192366627,
-      0.006966019961796244,
-      0.5849538716054395,
-      0.5801774527180487,
-      0.5915577625702408,
-      0.679731222994373,
-      0.3730404645361476,
-      0.7146163443155258,
-      0.800633446837909,
-      0.8946270764765526,
-      0.5683047391141852,
-      0.9479855397490015,
-      0.8267433552654698,
-      0.5713537595492587,
-      0.0243267381892448,
-      0.7476080024405406,
-      0.33535028127617283,
-      0.7698511234427318,
-      0.4701191328296702
-    ],
-    "y": [
-      1.3243096175984646,
-      0.8431396417660063,
-      1.4419843094572695,
-      1.2870746718828974,
-      1.596866114476496,
-      1.1552637530029242,
-      1.6100814232512937,
-      1.7512158081862361,
-      1.532545529434858,
-      1.160147918245946,
-      1.453429043900455,
-      0.3908349902873533,
-      0.942299040729968,
-      0.5055662291402832,
-      0.5005710305092965,
-      0.6181560617843084,
-      0.021769149201485205,
-      0.16713924693374213,
-      0.673423433392893,
-      0.5733752379061821,
-      -0.0468281630117352,
-      0.45355321026007617,
-      1.519957688127892,
-      1.269751930094416,
-      1.5788986239640754,
-      1.55947674117304,
-      1.4519195876176205,
-      1.7886169479007779,
-      1.1911132158800422,
-      0.18966340391475178,
-      0.7023759773342588,
-      0.42401023482172623,
-      0.6938869588679403,
-      0.6120292225026871,
-      0.5707096718476062,
-      0.34871206846203284,
-      0.9559282948976917,
-      1.69305984433428,
-      1.8182487816381958,
-      1.9520010432247459,
-      1.921437822619558,
-      1.5786029246631785,
-      1.755924603719404,
-      1.8120520346821916,
-      1.6968331198272542,
-      1.7746413567714139
-    ],
-    "cluster": [
-      1,
-      2,
-      3,
-      3,
-      4,
-      4,
-      4,
-      5,
-      5,
-      5,
-      5,
-      6,
-      6,
-      6,
-      6,
-      6,
-      7,
-      7,
-      7,
-      7,
-      7,
-      7,
-      8,
-      8,
-      8,
-      8,
-      8,
-      8,
-      8,
-      9,
-      9,
-      9,
-      9,
-      9,
-      9,
-      9,
-      9,
-      10,
-      10,
-      10,
-      10,
-      10,
-      10,
-      10,
-      10,
-      10
-    ],
-    "coef": [
-      1.0020238985950929,
-      0.20694026200922197
-    ],
-    "coef_names": [
-      "(Intercept)",
-      "x"
-    ],
-    "vcov_cr2": [
-      0.08616619395052143,
-      -0.07752403165758125,
-      -0.07752403165758119,
-      0.19494058151826515
-    ],
-    "vcov_shape": [
-      2,
-      2
-    ],
-    "dof_bm": [
-      5.408661566845492,
-      5.965130041512546
-    ]
+    "x": [0.3096159978304058, 0.8345736733172089, 0.6098155525978655, 0.7399769551120698, 0.6496841341722757, 0.9805952606257051, 0.5273389085195959, 0.4678991283290088, 0.2815611527767032, 0.2787613344844431, 0.9055589258205146, 0.8726157851051539, 0.1036925727967173, 0.03638615319505334, 0.2336558362003416, 0.2130224851425737, 0.4085248110350221, 0.2336621598806232, 0.7284043852705508, 0.654739553341642, 0.8465219340287149, 0.6553786264266819, 0.8273450890555978, 0.2404767158441246, 0.8276755115948617, 0.6109203656669706, 0.8021095618605614, 0.1160071832127869, 0.2334099512081593, 0.367526771267876, 0.1624952557031065, 0.04603013768792152, 0.7806035145185888, 0.2781105572357774, 0.3106493707746267, 0.2741349330171943, 0.3256947943009436, 0.4693535177502781, 0.7107607442885637, 0.8296667435206473, 0.7135459324344993, 0.3497182361315936, 0.6398582113906741, 0.5172503236681223, 0.1625131070613861, 0.7787523062434047],
+    "y": [0.950136668408636, 1.754592223394735, 1.122852938797655, 1.636705917889595, 1.33602946824798, 1.194648962092246, 1.148033096402285, 0.9885434625790169, 0.7050616623244204, 0.7449203796991308, 0.8522637796426105, 2.487209473904, 1.976909132715369, 2.347476489585664, 1.967524934768897, 2.602879791128856, 1.220252609394451, 1.332954036617961, 1.740259975846541, 1.115942116955388, 1.740710236232752, 1.506143920276894, 0.7486150649323104, 0.6946189692315179, 0.9177986725014496, 1.285459212606915, 1.339739549860439, 0.673901009685542, 0.4746185839128876, 0.7341158063604124, 0.7481991133308572, 0.3787909792265402, 0.8604465686455036, 1.011410595825842, 0.8174743408007981, 0.7755017956919983, 0.881726590825053, 1.302321464323421, 1.60501077060975, 1.948830729098463, 1.385133092418095, 1.418004152520744, 1.20172718727563, 1.009673937225518, 1.20690782633116, 1.588919597999956],
+    "cluster": [1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10],
+    "coef": [1.054441705396782, 0.3911987276515195],
+    "coef_names": ["(Intercept)", "x"],
+    "vcov_cr2": [0.1616023734490589, -0.1899859611041032, -0.1899859611041032, 0.238142463018284],
+    "vcov_shape": [2, 2],
+    "dof_bm": [4.423211980441538, 5.952656131218101],
+    "cluster_sizes": [1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
   },
   "meta": {
-    "source": "python_self_reference",
-    "note": "Regression anchor bootstrapped from diff_diff._compute_cr2_bm. Run benchmarks/R/generate_clubsandwich_golden.R to overwrite with authoritative clubSandwich values."
+    "source": "clubSandwich",
+    "clubSandwich_version": "0.7.0",
+    "R_version": "R version 4.5.2 (2025-10-31)",
+    "generated_at": "2026-05-16 10:20:19 UTC",
+    "note": "CR2 Bell-McCaffrey cluster-robust parity target for diff_diff._compute_cr2_bm"
   }
-}
\ No newline at end of file
+}
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
index 099b8059..4913903a 100644
--- a/docs/methodology/REGISTRY.md
+++ b/docs/methodology/REGISTRY.md
@@ -2540,7 +2540,7 @@ Shipped in `diff_diff/had_pretests.py` as `stute_joint_pretest()` (residuals-in
     - **Note (scope limitation on absorbed FE):** HC2 and HC2 + Bell-McCaffrey are rejected on any estimator that uses within-transformation (demeaning) for fixed effects: `TwoWayFixedEffects` unconditionally; `DifferenceInDifferences(absorb=..., vcov_type in {"hc2","hc2_bm"})`; `MultiPeriodDiD(absorb=..., vcov_type in {"hc2","hc2_bm"})`. FWL preserves coefficients and residuals under within-transformation but NOT the hat matrix: `h_ii = x_i' (X'X)^{-1} x_i` on the reduced design is not the diagonal of the full FE projection, and CR2's block adjustment `A_g = (I - H_gg)^{-1/2}` likewise depends on the full cluster-block hat matrix. Applying the reduced-design leverage would silently mis-state small-sample SEs/DOF, so the combinations raise `NotImplementedError` with a pointer to workarounds: use `vcov_type="hc1"` (HC1/CR1 have no leverage term and survive FWL), or switch to `fixed_effects=` dummies so the hat matrix is computed on the full design. Lifting the guard requires computing HC2/CR2-BM from the full absorbed projection and validating it against a full-dummy or `fixest`/`clubSandwich` reference. Tracked in `TODO.md` under Methodology/Correctness.
 - [x] Phase 1a: `vcov_type` enum threaded through `DifferenceInDifferences` (`MultiPeriodDiD`, `TwoWayFixedEffects` inherit); `robust=True` <=> `vcov_type="hc1"`, `robust=False` <=> `vcov_type="classical"`. Conflict detection at `__init__`. Results summary prints the variance-family label.
     - **Note (deviation from the fully-symmetric enum):** `MultiPeriodDiD(cluster=..., vcov_type="hc2_bm")` is intentionally **not supported** and raises `NotImplementedError`. The scalar-coefficient `DifferenceInDifferences` path handles the cluster + CR2 Bell-McCaffrey combination (`_compute_cr2_bm` returns a per-coefficient Satterthwaite DOF that is valid for the single-ATT contrast), but `MultiPeriodDiD` also reports a post-period-average ATT constructed as a *contrast* of the event-study coefficients. The cluster-aware CR2 BM DOF for that contrast (i.e., the Pustejovsky-Tipton 2018 per-cluster adjustment matrices applied to an arbitrary aggregation contrast) is not yet implemented. Pairing CR2 cluster-robust SEs with the one-way Imbens-Kolesar (2016) contrast DOF would be a broken hybrid, so the combination fails fast with a clear workaround message (drop the cluster for one-way HC2+BM, or use `vcov_type="hc1"` with cluster for CR1 Liang-Zeger). Tracked in `TODO.md` under Methodology/Correctness. Applies only to `MultiPeriodDiD`; `DifferenceInDifferences(cluster=..., vcov_type="hc2_bm")` works.
-- [x] Phase 1a: `clubSandwich::vcovCR(..., type="CR2")` parity harness committed: R script at `benchmarks/R/generate_clubsandwich_golden.R` plus a regression-anchor JSON at `benchmarks/data/clubsandwich_cr2_golden.json`. **Note:** the committed JSON currently has `"source": "python_self_reference"` and pins numerical stability only; authoritative R-produced values are generated by running the R script, which the TODO.md row under Methodology/Correctness tracks. The parity test at `tests/test_linalg_hc2_bm.py::TestCR2BMCluster::test_cr2_parity_with_golden` runs at 1e-6 tolerance (Phase 1a plan commits 6-digit parity once R regen completes).
+- [x] Phase 1a: `clubSandwich::vcovCR(..., type="CR2")` parity harness committed: R script at `benchmarks/R/generate_clubsandwich_golden.R` plus the authoritative R-generated JSON at `benchmarks/data/clubsandwich_cr2_golden.json` (`"source": "clubSandwich"`, with `clubSandwich_version`, `R_version`, and `generated_at` captured in `meta` for forensic traceability). The parity test at `tests/test_linalg_hc2_bm.py::TestCR2BMCluster::test_cr2_parity_with_golden` runs at 1e-6 tolerance and passes at ≤ 7.1e-15 across all three datasets — Python's `_compute_cr2_bm` matches clubSandwich at machine precision.
 - [x] Phase 1b: Calonico-Cattaneo-Farrell (2018) MSE-optimal bandwidth selector. In-house port of `nprobust::lpbwselect(bwselect="mse-dpi")` (nprobust 0.5.0, SHA `36e4e53`) as `diff_diff.mse_optimal_bandwidth` and `BandwidthResult`, backed by the private `diff_diff._nprobust_port` module (`kernel_W`, `lprobust_bw`, `lpbwselect_mse_dpi`). Three-stage DPI with four `lprobust.bw` calls at orders `q+1`, `q+2`, `q`, `p`. Python matches R to `0.0000%` relative error (i.e., bit-parity within float64 precision, ~8-13 digits agreement) on all five stage bandwidths (`c_bw`, `bw_mp2`, `bw_mp3`, `b_mse`, `h_mse`) across three deterministic DGPs (uniform, Beta(2,2), half-normal) via `benchmarks/R/generate_nprobust_golden.R` → `benchmarks/data/nprobust_mse_dpi_golden.json`. **Note:** `weights=` is currently unsupported (raises `NotImplementedError`); nprobust's `lpbwselect` has no weight argument so there is no parity anchor. Weighted-data support deferred to Phase 2 (survey-design adaptation). **Note (public API scope restriction):** the exported wrapper `mse_optimal_bandwidth` hard-codes the HAD Phase 1b configuration (`p=1`, `deriv=0`, `interior=False`, `vce="nn"`, `nnmatch=3`). The underlying port supports a broader surface (`hc0`/`hc1`/`hc2`/`hc3` variance, interior evaluation, higher `p`), but those paths are not parity-tested against `nprobust` and are deferred. Callers needing the broader surface should use `diff_diff._nprobust_port.lpbwselect_mse_dpi` directly and accept that parity has not been verified on non-HAD configurations. **Note (input contract):** the wrapper enforces HAD's support restriction `D_{g,2} >= 0` (front-door `ValueError` on negative doses and empty inputs). `boundary` must equal `0` (Design 1') or `float(d.min())` (Design 1 continuous-near-d_lower) within float tolerance; off-support values raise `ValueError`. When `boundary ~ 0`, the wrapper additionally requires `d.min() <= 0.05 * median(|d|)` as a Design 1' support plausibility heuristic, chosen to pass the paper's thin-boundary-density DGPs (Beta(2,2), d.min/median ~ 3%) while rejecting substantially off-support samples (U(0.5, 1.0), d.min/median ~ 1.0). Detected mass-point designs (`d.min() > 0` with modal fraction at `d.min() > 2%`) raise `NotImplementedError` pointing to the Phase 2 2SLS path per paper Section 3.2.4.
 - [x] Phase 1c: First-order bias estimator `M̂_{ĥ*_G}` and robust variance `V̂_{ĥ*_G}`. Implemented via Calonico-Cattaneo-Titiunik (2014) bias-combined design matrix `Q.q` in the in-house port `diff_diff._nprobust_port.lprobust` (single-eval-point path of `nprobust::lprobust`, npfunctions.R:177-246).
 - [x] Phase 1c: Bias-corrected CI (Equation 8) with `nprobust` parity. Public wrapper `diff_diff.bias_corrected_local_linear` returns `BiasCorrectedFit` with μ̂-scale point estimate, robust SE, and bias-corrected 95% CI `[tau.bc ± z_{1-α/2} * se.rb]`. The β-scale rescaling from Equation 8, `(1/G) Σ D_{g,2}`, is applied by Phase 2's `HeterogeneousAdoptionDiD.fit()`. Parity against `nprobust::lprobust(..., bwselect="mse-dpi")` is asserted at `atol=1e-12` on `tau_cl`/`tau_bc`/`se_cl`/`se_rb`/`ci_low`/`ci_high` across the three unclustered golden DGPs (DGP 1 and DGP 3 typically land closer to `1e-13`). The Python wrapper computes its own `z_{1-α/2}` via `scipy.stats.norm.ppf` inside `safe_inference()`; R's `qnorm` value is stored in the golden JSON for audit, and the parity harness compares Python's CI bounds to R's pre-computed CI bounds so any residual drift is purely the floating-point arithmetic in `tau.bc ± z * se.rb`, not a critical-value disagreement. The clustered DGP achieves bit-parity (`atol=1e-14`) when cluster IDs are in first-appearance order; otherwise BLAS reduction ordering can drift to `atol=1e-10`. Generator: `benchmarks/R/generate_nprobust_lprobust_golden.R`. **Note:** The wrapper matches nprobust's `rho=1` default (`b = h` in auto mode), so Phase 1b's separately-computed `b_mse` is surfaced via `bandwidth_diagnostics.b_mse` but not applied. **Note (public-API surface restriction):** Phase 1c restricts the public wrapper's `vce` parameter to `"nn"`; hc0/hc1/hc2/hc3 raise `NotImplementedError` and are queued for Phase 2+ pending dedicated R parity goldens. The port-level `diff_diff._nprobust_port.lprobust` still accepts all five vce modes (matching R's `nprobust::lprobust` signature) for callers who need the broader surface and accept that the hc-mode variance path — which reuses p-fit hat-matrix leverage for the q-fit residual in R (lprobust.R:229-241) — has not been separately parity-tested. **Note (Phase 1c internal bug workaround):** The clustered golden DGP 4 uses manual `h=b=0.3` to sidestep an nprobust-internal singleton-cluster shape bug in `lprobust.vce` fired by the mse-dpi pilot fits; the Python port has no equivalent bug.
diff --git a/tests/test_linalg_hc2_bm.py b/tests/test_linalg_hc2_bm.py
index 5a0f21e4..c9706931 100644
--- a/tests/test_linalg_hc2_bm.py
+++ b/tests/test_linalg_hc2_bm.py
@@ -524,11 +524,12 @@ def test_cr2_bm_singleton_clusters(self):
     def test_cr2_parity_with_golden(self):
         """Parity against benchmarks/data/clubsandwich_cr2_golden.json.
 
-        The golden values are authoritative once regenerated by
-        benchmarks/R/generate_clubsandwich_golden.R (clubSandwich source);
-        until then the JSON is a self-reference anchor that pins numerical
-        stability. Test tolerance is 1e-6, well within the 6-digit parity
-        target stated in the Phase 1a plan.
+        The JSON is the authoritative clubSandwich-generated fixture
+        (regenerated via benchmarks/R/generate_clubsandwich_golden.R;
+        `meta.source = "clubSandwich"`). Test tolerance is 1e-6, well
+        within the 6-digit parity target stated in the Phase 1a plan;
+        empirically Python matches clubSandwich at ≤ 7.1e-15 across all
+        three datasets.
         """
         import json
         from pathlib import Path