igerber · igerber · Jun 30, 2026 · Jun 30, 2026 · Jun 30, 2026 · Jun 30, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -8,6 +8,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
+- **`LPDiD` complex-survey-design support** (Phase D1). Adds a `survey_design=` argument to
+  `LPDiD.fit()` (a `SurveyDesign` with probability weights + optional strata/PSU/FPC). On the
+  variance-weighted default path the long-difference regression at each horizon is fit by WLS on
+  the survey weights, and the standard error is the stratified-PSU Taylor-linearization (Binder
+  TSL) sandwich with `df = n_PSU - n_strata`, reusing `diff_diff/survey.py`
+  (`compute_survey_vcov` / `_compute_stratified_psu_meat`). The design is re-resolved on each
+  realized (post-clean-control) sample so weights/strata/PSU align with the regression rows; with
+  no explicit PSU the unit (LP-DiD's default cluster) is injected as the PSU. Rejects
+  `survey_design` combined with `reweight=True` (the equally-weighted / regression-adjustment
+  influence-function path), replicate-weight designs, and non-pweight (fweight/aweight) types,
+  each a deferred follow-up. `LPDiDResults` gains `survey_metadata` / `n_strata` / `n_psu`, a
+  `"survey_tsl"` `vcov_type`, and a Survey Design block in `summary()`. The non-survey path is
+  byte-for-byte unchanged. Validated against `survey::svyglm` on the stacked long difference
+  (numeric golden parity is the D2 follow-up).
 - **`LPDiD` non-absorbing R-parity validation** (Phase C2). Pins both non-absorbing modes
   against an independent `fixest::feols` reconstruction of the paper's Eq. 12 (`first_entry`)
   and Eq. 13 (`effect_stabilization`) clean-sample restrictions: variance-weighted point and

diff --git a/TODO.md b/TODO.md
@@ -70,6 +70,7 @@ generic sparse-FE, QR+SVD rank-detection redundancy, `check_finite` bypass — m
 | `ImputationDiD` covariate-path variance lacks a dedicated parity anchor — only the no-covariate staggered panel is R-parity'd, though the covariate path shares the same validated projection code. Add a small dense-design **hand-calc** for the covariate projection (no external tooling), or a covariate (time-varying X) R `didimputation` golden asserting overall/ES SE parity (the golden variant needs local R). | `tests/test_methodology_imputation.py`, `benchmarks/R/generate_didimputation_golden.R` | imputation-validation | Mid | Low |
 | Add true half-sample BRR replicate-weight regressions per estimator family (current tests use Fay-like 0.5/1.5 perturbations; `test_survey_phase6.py` covers true BRR at the helper level). | `tests/test_replicate_weight_expansion.py` | #253 | Mid | Low |
 | Port the CI `<notebook-prose>` extraction into the reviewer-eval harness so `docs/tutorials/*.ipynb` cases (currently guarded out of `verify-corpus`/`run`) can be reviewed with CI-equivalent context. | `tools/reviewer-eval/adapters/ci_prompt.py` | local-review | Mid | Low |
+| **`LPDiD` survey-design R-parity (PR-D2)** — pin the stratified-PSU Taylor-linearization standard errors against `survey::svyglm` on the stacked long difference (per-horizon + pooled-post point/SE/df), mirroring `benchmark_survey_estimators.R`. The D1 build ships the survey path validated by pure-Python invariants (reduction/FPC/stratification/lonely-PSU/NaN-consistency); D2 adds the numeric `svyglm` golden + parity test. | `benchmarks/R/`, `tests/test_methodology_lpdid.py` | PR-D1 | Mid | Low |
 
 ---
 
@@ -117,6 +118,7 @@ exists but parity can't be verified without a local toolchain.
 | **`bias_corrected_local_linear` (lprobust) Phase-1c follow-ups:** extend golden parity to `kernel ∈ {triangular, uniform}` (epa-only today); expose `vce ∈ {hc0,hc1,hc2,hc3}` on the public wrapper once R goldens exist (port supports all four; needs a per-mode generator + a hc2/hc3 q-fit-leverage decision); clustered-DGP auto-bandwidth parity is **blocked upstream** on an nprobust singleton-cluster bug in `lpbwselect.mse.dpi` (Phase-1c DGP 4 uses manual `h=b=0.3`). | `_nprobust_port.py`, `local_linear.py`, `generate_nprobust_lprobust_golden.R` | Phase 1c | Low-Med |
 | `HeterogeneousAdoptionDiD` Stute-family Stata-bridge parity: no public R `Stutetest` package exists; would add `benchmarks/stata/generate_stute_golden.do` + a Stata dependency. | `benchmarks/stata/`, `tests/test_stute_test_parity.py` | follow-up | Low |
 | **`LPDiD` regression-adjustment SE — no runnable R reference.** The RA influence-function cluster SE is canonically Stata `teffects ra ... atet vce(cluster)` only; no R package computes it (`alexCardazzi/lpdid` does direct covariate inclusion, not RA). Today the RA *point* is R-anchored (~1e-12), the SE is pinned + MC-coverage-validated (`coverage_lpdid_ra.py`). Follow-up: contribute the RA path to `alexCardazzi/lpdid` so a runnable R RA reference exists — only a *trusted* anchor once cross-checked vs Stata `teffects` (else circular). | `tests/test_methodology_lpdid.py`, `benchmarks/python/coverage_lpdid_ra.py` | #B2 follow-up | Low |
+| **`LPDiD` survey scope gaps (PR-D1 deferrals).** Survey support covers the variance-weighted default path only. (a) `survey_design` + `reweight=True` (the equally-weighted / regression-adjustment IF path) is rejected: the weighted RA influence-function variance has **no runnable survey reference** (same class as the RA-SE row above - `survey::svyglm` anchors only the OLS/WLS path). (b) Replicate-weight survey designs (BRR/Fay/JK1/JKn/SDR) and (c) non-pweight (fweight/aweight) types are rejected pending demand. | `lpdid.py`, REGISTRY #8 | PR-D1 | Low |
 | **`LPDiD` non-absorbing R-parity - DONE (PR-C2)** via an independent `fixest::feols` Eq. 12/13 reconstruction (point+SE ~1e-13/~1e-15 vw; `effect_stabilization` reweighted point + pinned SE). `alexCardazzi/lpdid`'s `nonabsorbing_lag` proved NOT a faithful Eq. 13 (off-switch clamp + non-paper boundary/placebo window; diverges ~0.01-0.05 even on a monotone panel), so it is recorded as a divergent reference, not a gate. **Residual external-reference gap:** the authors' canonical non-absorbing SE/RA is Stata `lpdid`/`teffects` only (no faithful R analogue) - same class as the absorbing RA-SE row above; revisit if a Stata toolchain or a corrected R package appears. | `benchmarks/R/generate_lpdid_golden.R`, `tests/test_methodology_lpdid.py` | PR-C2 | Low |
 | `HeterogeneousAdoptionDiD` Phase-3 R-parity: ships coverage-rate validation on synthetic DGPs, not tight point parity vs `chaisemartin::stute_test` / `yatchew_test` (needs bootstrap-seed-semantics + `B` alignment across numpy/R). | `tests/test_had_pretests.py` | Phase 3 | Low |
 

diff --git a/diff_diff/guides/llms-full.txt b/diff_diff/guides/llms-full.txt
@@ -897,7 +897,7 @@ results.print_summary()
 
 ### LPDiD
 
-Local Projections DiD (Dube, Girardi, Jorda & Taylor 2025). Estimates a separate OLS at each event-time horizon of a long difference (`y_{i,t+h} - y_{i,t-1}`) on the treatment-switch indicator plus calendar-time fixed effects (no unit FE), restricted to a flexible "clean control" sample of newly-treated and not-yet-treated units. Excluding already-treated units from the control group removes the negative-weighting bias of naive TWFE, so the default (variance-weighted) estimand has strictly non-negative weights. `reweight=True` yields the equally-weighted ATT (numerically equivalent to Callaway-Sant'Anna); covariates then enter via regression adjustment. Standard errors on the default/weighted path are cluster-robust at the unit level (the paper specifies no SE; matches Stata `lpdid` `vce(cluster unit)`); the regression-adjustment covariate path (`reweight=True`) instead reports an influence-function cluster variance (ImputationDiD/BJS family). Scope: binary treatment; absorbing by default (rejects panels where treatment turns off), with non-absorbing (reversible) treatment available via `non_absorbing` - `"first_entry"` (Dube et al. Eq. 12, the effect of entering for the first time and staying treated) or `"effect_stabilization"` (Eq. 13, requires `stabilization_window=L`; lets units whose treatment has been stable for at least `L` periods act as clean controls, so estimation is feasible with few/no never-treated units). Non-absorbing modes require a gap-free panel within each unit's observed span.
+Local Projections DiD (Dube, Girardi, Jorda & Taylor 2025). Estimates a separate OLS at each event-time horizon of a long difference (`y_{i,t+h} - y_{i,t-1}`) on the treatment-switch indicator plus calendar-time fixed effects (no unit FE), restricted to a flexible "clean control" sample of newly-treated and not-yet-treated units. Excluding already-treated units from the control group removes the negative-weighting bias of naive TWFE, so the default (variance-weighted) estimand has strictly non-negative weights. `reweight=True` yields the equally-weighted ATT (numerically equivalent to Callaway-Sant'Anna); covariates then enter via regression adjustment. Standard errors on the default/weighted path are cluster-robust at the unit level (the paper specifies no SE; matches Stata `lpdid` `vce(cluster unit)`); the regression-adjustment covariate path (`reweight=True`) instead reports an influence-function cluster variance (ImputationDiD/BJS family). Scope: binary treatment; absorbing by default (rejects panels where treatment turns off), with non-absorbing (reversible) treatment available via `non_absorbing` - `"first_entry"` (Dube et al. Eq. 12, the effect of entering for the first time and staying treated) or `"effect_stabilization"` (Eq. 13, requires `stabilization_window=L`; lets units whose treatment has been stable for at least `L` periods act as clean controls, so estimation is feasible with few/no never-treated units). Non-absorbing modes require a gap-free panel within each unit's observed span. Complex-survey designs are supported on the variance-weighted default path via the `survey_design=` argument to `fit()` (probability weights enter the WLS point estimate; the SE is the stratified-PSU Taylor-linearization sandwich with `df = n_PSU - n_strata`, with optional FPC and lonely-PSU handling) — rejected with `reweight=True`, replicate weights, or non-pweight types.
 
 ```python
 LPDiD(
@@ -932,6 +932,7 @@ lpdid.fit(
     pre_pooled: int | tuple = None,    # Pooled pre-window horizons (int or (start, end))
     only_event: bool = False,          # Compute only the event-study table
     only_pooled: bool = False,         # Compute only the pooled pre/post table
+    survey_design: SurveyDesign = None,  # Complex-survey design (pweight + optional strata/PSU/FPC); variance-weighted default path only (rejected with reweight=True)
 ) -> LPDiDResults
 ```
 

diff --git a/diff_diff/guides/llms.txt b/diff_diff/guides/llms.txt
@@ -69,7 +69,7 @@ Full practitioner guide: call `diff_diff.get_llm_guide("practitioner")`
 - [TROP](https://diff-diff.readthedocs.io/en/stable/api/trop.html): Triply Robust Panel estimator (Athey et al. 2025) with nuclear norm factor adjustment
 - [StaggeredTripleDifference](https://diff-diff.readthedocs.io/en/stable/api/staggered.html#staggeredtripledifference): Ortiz-Villavicencio & Sant'Anna (2025) staggered DDD with group-time ATT
 - [WooldridgeDiD](https://diff-diff.readthedocs.io/en/stable/api/wooldridge_etwfe.html): Wooldridge (2023, 2025) ETWFE — saturated OLS, logit/Poisson QMLE (ASF-based ATT). Alias: ETWFE
-- [LPDiD](https://diff-diff.readthedocs.io/en/stable/api/lpdid.html): Dube, Girardi, Jorda & Taylor (2025) Local Projections DiD: per-horizon long-difference event study on clean controls (no negative weighting); variance- or equally-weighted ATT, premean differencing, pooled pre/post, fast. Absorbing by default; non-absorbing (reversible) treatment via `non_absorbing="first_entry"` (Eq. 12) or `"effect_stabilization"` (Eq. 13, window `L`).
+- [LPDiD](https://diff-diff.readthedocs.io/en/stable/api/lpdid.html): Dube, Girardi, Jorda & Taylor (2025) Local Projections DiD: per-horizon long-difference event study on clean controls (no negative weighting); variance- or equally-weighted ATT, premean differencing, pooled pre/post, fast. Absorbing by default; non-absorbing (reversible) treatment via `non_absorbing="first_entry"` (Eq. 12) or `"effect_stabilization"` (Eq. 13, window `L`). Complex-survey designs (pweight + stratified-PSU TSL SEs) on the default path via `fit(survey_design=...)`.
 - [BaconDecomposition](https://diff-diff.readthedocs.io/en/stable/api/bacon.html): Goodman-Bacon (2021) decomposition for diagnosing TWFE bias in staggered settings
 
 ## Diagnostics and Sensitivity Analysis