Merged
4 changes: 2 additions & 2 deletions diff_diff/guides/llms-full.txt
@@ -2075,7 +2075,7 @@ clear_cache()

## Survey Support

- All estimators accept an optional `survey_design` parameter in `fit()`. Pass a `SurveyDesign` object to get design-based variance estimation.
+ Most estimators accept an optional `survey_design` parameter in `fit()` (`SyntheticControl` rejects it as not yet supported); depth of support varies by estimator - see the compatibility matrix in `docs/choosing_estimator.rst` (Survey Design Support). Pass a `SurveyDesign` object to get design-based variance estimation.

```python
from diff_diff import SurveyDesign, CallawaySantAnna
```
@@ -2120,7 +2120,7 @@ sd_female, data_female = sd.subpopulation(data, mask=lambda df: df['sex'] == 'F'

**Key features:**
- Taylor Series Linearization (TSL) variance with strata + PSU + FPC
- - Replicate weight variance: BRR, Fay's BRR, JK1, JKn, SDR (13 of 16 estimators, including dCDH)
+ - Replicate weight variance: BRR, Fay's BRR, JK1, JKn, SDR (13 of 20 estimators, including dCDH)
- Survey-aware bootstrap: multiplier at PSU (Hall-Mammen wild; dCDH, staggered) or Rao-Wu rescaled (SunAbraham, SyntheticDiD, TROP). SyntheticDiD bootstrap composes Rao-Wu rescaled per-draw weights with the weighted Frank-Wolfe variant of `_sc_weight_fw` (PR #355): each draw solves `min ||A·diag(rw)·ω - b||² + ζ²·Σ rw_i ω_i²` and composes `ω_eff = rw·ω/Σ(rw·ω)` for the SDID estimator. Pweight-only fits use constant `rw = w_control`; full designs use Rao-Wu. SDID's placebo (stratified permutation + weighted FW) and jackknife (PSU-level LOO with stratum aggregation, Rust & Rao 1996) paths also support pweight-only and full strata/PSU/FPC designs
- DEFF diagnostics, subpopulation analysis, weight trimming (`trim_weights`)
- Repeated cross-sections: `CallawaySantAnna(panel=False)`
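
The bootstrap composition described in the bullet above can be illustrated with a minimal, standard-library sketch. It is not the library's implementation: `rao_wu_weights` and `compose_effective_weights` are hypothetical helper names, the Rao-Wu draw assumes the common `m_h = n_h - 1` resample size and ignores FPC, and `omega` stands in for the output of the weighted Frank-Wolfe solve, which is not reproduced here.

```python
import random
from collections import Counter


def rao_wu_weights(weights, strata, psus, rng):
    """One bootstrap draw of Rao-Wu rescaled weights (m_h = n_h - 1, FPC ignored)."""
    # Distinct PSUs per stratum.
    psus_by_stratum = {}
    for h, p in zip(strata, psus):
        psus_by_stratum.setdefault(h, set()).add(p)

    # Per stratum: resample n_h - 1 PSUs with replacement, then scale each
    # PSU's hit count by n_h / (n_h - 1).
    factor = {}
    for h, members in psus_by_stratum.items():
        ordered = sorted(members)
        n_h = len(ordered)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} has fewer than two PSUs")
        hits = Counter(rng.choice(ordered) for _ in range(n_h - 1))
        for p in ordered:
            factor[(h, p)] = hits[p] * n_h / (n_h - 1)

    return [w * factor[(h, p)] for w, h, p in zip(weights, strata, psus)]


def compose_effective_weights(rw, omega):
    """Compose omega_eff = rw * omega / sum(rw * omega)."""
    raw = [r * o for r, o in zip(rw, omega)]
    total = sum(raw)
    if total <= 0:
        raise ValueError("no unit with positive omega was drawn; redraw")
    return [x / total for x in raw]


# Illustrative data (assumed): six control units, two strata, one unit per PSU.
rng = random.Random(0)
weights = [1.0] * 6
strata = ["a", "a", "a", "b", "b", "b"]
psus = [1, 2, 3, 4, 5, 6]
omega = [0.3, 0.2, 0.1, 0.2, 0.1, 0.1]  # placeholder for the per-draw FW solution

rw = rao_wu_weights(weights, strata, psus, rng)
omega_eff = compose_effective_weights(rw, omega)
```

For a pweight-only fit, the same composition applies with `rw` held constant at the control units' sampling weights instead of being redrawn.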
4 changes: 2 additions & 2 deletions diff_diff/guides/llms.txt
@@ -2,7 +2,7 @@

> A Python library for Difference-in-Differences (DiD) causal inference analysis. Provides sklearn-like estimators with statsmodels-style summary output for econometric analysis.

- diff-diff offers 19 estimators covering basic 2x2 DiD, modern staggered adoption methods, reversible (non-absorbing) treatments, advanced panel estimators, nonlinear models, and diagnostic tools. It supports robust and cluster-robust standard errors, wild cluster bootstrap, formula and column-name interfaces, fixed effects (dummy and absorbed), complex survey designs (strata/PSU/FPC, replicate weights, design-based variance), and publication-ready output. The optional Rust backend accelerates compute-intensive estimators like Synthetic DiD and TROP.
+ diff-diff offers 20 estimators covering basic 2x2 DiD, modern staggered adoption methods, reversible (non-absorbing) treatments, advanced panel estimators, nonlinear models, and diagnostic tools. It supports robust and cluster-robust standard errors, wild cluster bootstrap, formula and column-name interfaces, fixed effects (dummy and absorbed), complex survey designs (strata/PSU/FPC, replicate weights, design-based variance), and publication-ready output. The optional Rust backend accelerates compute-intensive estimators like Synthetic DiD and TROP.

- Install: `pip install diff-diff`
- License: MIT
@@ -104,7 +104,7 @@ Full practitioner guide: call `diff_diff.get_llm_guide("practitioner")`

## Survey Support

- All estimators accept an optional `survey_design` parameter. Pass a `SurveyDesign` object to get design-based variance estimation:
+ Most estimators accept an optional `survey_design` parameter (`SyntheticControl` does not yet support it); coverage and weight types vary by estimator - see the [Survey Design Support matrix](https://diff-diff.readthedocs.io/en/stable/choosing_estimator.html#survey-design-support). Pass a `SurveyDesign` object to get design-based variance estimation:

- **Design elements**: strata, PSU, FPC, weight types (pweight/fweight/aweight), lonely PSU handling, nest
- **Variance methods**: Taylor Series Linearization (TSL), replicate weights (BRR/Fay/JK1/JKn/SDR), survey-aware bootstrap
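
As a rough illustration of the replicate-weight methods named above, the sketch below applies the textbook scale factors for BRR, Fay's BRR, JK1 and SDR to a list of replicate estimates. `replicate_variance` is a hypothetical helper, not a diff-diff function, and centering on the full-sample estimate is an assumption; JKn is omitted because it needs per-stratum factors.

```python
def replicate_variance(theta_full, theta_reps, method="JK1", fay_rho=0.0):
    """Variance of an estimate from its replicate estimates.

    Deviations are centered on the full-sample estimate (an assumption).
    """
    r = len(theta_reps)
    if r < 2:
        raise ValueError("need at least two replicate estimates")

    # Sum of squared deviations of the replicate estimates.
    ss = sum((t - theta_full) ** 2 for t in theta_reps)

    if method == "BRR":
        scale = 1.0 / r
    elif method == "Fay":
        if not 0.0 <= fay_rho < 1.0:
            raise ValueError("fay_rho must be in [0, 1)")
        scale = 1.0 / (r * (1.0 - fay_rho) ** 2)
    elif method == "JK1":
        scale = (r - 1) / r
    elif method == "SDR":
        scale = 4.0 / r
    else:
        raise ValueError(f"unsupported method: {method!r}")
    return scale * ss


# Illustrative check (assumed data): the mean of four equally weighted units,
# with JK1 replicates formed by dropping one unit at a time.
y = [1.0, 2.0, 3.0, 4.0]
theta = sum(y) / len(y)
reps = [(sum(y) - yi) / (len(y) - 1) for yi in y]
v_jk1 = replicate_variance(theta, reps, method="JK1")
```

In this equal-weight case the JK1 result coincides with the usual `s**2 / n` variance of the sample mean, which makes the scale factor easy to sanity-check.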
45 changes: 29 additions & 16 deletions docs/choosing_estimator.rst
@@ -771,9 +771,10 @@ If you're unsure which estimator to use:
Survey Design Support
---------------------

- All estimators accept an optional ``survey_design`` parameter in ``fit()``.
+ Most estimators support an optional ``survey_design`` parameter in ``fit()``
+ (``SyntheticControl`` accepts the parameter but raises ``NotImplementedError``).
Pass a :class:`~diff_diff.SurveyDesign` object to get design-based variance
- estimation. The depth of support varies by estimator:
+ estimation. The depth of support varies by estimator and variance method:

.. note::

@@ -820,7 +821,7 @@ estimation. The depth of support varies by estimator:
* - ``ChaisemartinDHaultfoeuille``
- pweight only
- Full (TSL)
- - --
+ - Full (analytical)
- Group-level (warning)
* - ``TripleDifference``
- pweight only
@@ -869,9 +870,14 @@ estimation. The depth of support varies by estimator:
- Multiplier at PSU
* - ``SyntheticDiD``
- pweight only
- - Via bootstrap
+ - Full (method-specific)
- --
- Hybrid pairs-bootstrap + Rao-Wu rescaled (bootstrap only)
+ * - ``SyntheticControl``
+ - --
+ - --
+ - --
+ - --
* - ``TROP``
- pweight only
- Via bootstrap
@@ -887,6 +893,11 @@ estimation. The depth of support varies by estimator:
- Full (Binder TSL)
- --
- --
+ * - ``SpilloverDiD``
+ - pweight only
+ - Full (Binder TSL + Conley)
+ - --
+ - --
* - ``BaconDecomposition``
- Diagnostic
- Diagnostic
@@ -897,24 +908,26 @@ estimation. The depth of support varies by estimator:

- **Full**: All weight types (pweight/fweight/aweight) + strata/PSU/FPC + Taylor Series Linearization variance
- **Full (pweight only)**: Full TSL with strata/PSU/FPC, but only ``pweight`` accepted (``fweight``/``aweight`` rejected because composition changes weight semantics)
- - **Via bootstrap**: Strata/PSU/FPC supported only with bootstrap variance. ``TROP`` uses bootstrap by default. ``SyntheticDiD`` supports strata/PSU/FPC on ``variance_method='bootstrap'`` via a hybrid pairs-bootstrap + Rao-Wu rescaling composition (see the ``Note (survey + bootstrap composition)`` in REGISTRY.md §SyntheticDiD); ``placebo`` and ``jackknife`` remain pweight-only.
+ - **Via bootstrap**: Strata/PSU/FPC supported only with bootstrap variance (``TROP``, which uses bootstrap by default)
+ - **Full (method-specific)**: ``SyntheticDiD`` supports strata/PSU/FPC on all three variance methods via method-specific survey paths — see the note below and the ``Note (survey support matrix)`` in REGISTRY.md §SyntheticDiD
- **pweight only** (Weights column): Only ``pweight`` accepted; ``fweight``/``aweight`` raise an error
- **Diagnostic**: Weighted descriptive statistics only (no inference)
- **--**: Not supported

.. note::

- ``SyntheticDiD`` supports survey designs on ``variance_method='bootstrap'``
- — both pweight-only and full strata/PSU/FPC — via a hybrid pairs-bootstrap
- composed with per-draw Rao-Wu rescaled weights fed into a weighted
- Frank-Wolfe re-estimation of ω and λ. See the
- ``Note (survey + bootstrap composition)`` in REGISTRY.md §SyntheticDiD
- for the objective form and argmin-set caveat.
-
- ``variance_method='placebo'`` and ``variance_method='jackknife'`` remain
- pweight-only — composing placebo permutations / leave-one-out with
- Rao-Wu rescaling under the weighted objective is a separate derivation
- (tracked in ``TODO.md``).
+ ``SyntheticDiD`` supports survey designs — both pweight-only and full
+ strata/PSU/FPC — on all three variance methods, each via a
+ method-specific path: ``bootstrap`` composes a hybrid pairs-bootstrap
+ with per-draw Rao-Wu rescaled weights fed into a weighted Frank-Wolfe
+ re-estimation of ω and λ; ``placebo`` switches to stratified
+ permutation (pseudo-treated draws within strata containing treated
+ units) with weighted-FW re-estimation, and FPC is a documented no-op
+ for the permutation test; ``jackknife`` switches to PSU-level
+ leave-one-out with stratum aggregation (Rust & Rao 1996).
+ Replicate-weight designs are rejected. See the
+ ``Note (survey support matrix)`` and the per-method composition notes
+ in REGISTRY.md §SyntheticDiD.

For the full walkthrough with code examples, see the
`survey tutorial <https://github.com/igerber/diff-diff/blob/main/docs/tutorials/16_survey_did.ipynb>`_.
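
The "PSU-level leave-one-out with stratum aggregation" path named in the note can be pictured with the generic stratified delete-one-PSU jackknife below. This is a sketch under stated assumptions, not the SyntheticDiD code: `stratified_psu_jackknife` and `weighted_mean` are hypothetical names, a weighted mean stands in for the SDID estimator, deviations are centered on the stratum mean of the leave-one-out estimates, FPC is ignored, and single-PSU strata are simply skipped.

```python
from collections import defaultdict


def weighted_mean(y, w):
    """Stand-in estimator: weighted mean of y."""
    return sum(yi * wi for yi, wi in zip(y, w)) / sum(w)


def stratified_psu_jackknife(y, w, strata, psus, estimator=weighted_mean):
    """Delete-one-PSU jackknife variance, aggregated over strata."""
    # Distinct PSUs per stratum.
    members = defaultdict(set)
    for h, p in zip(strata, psus):
        members[h].add(p)

    variance = 0.0
    for h, ps in members.items():
        n_h = len(ps)
        if n_h < 2:
            continue  # assumption: single-PSU strata contribute nothing
        loo = []
        for dropped in ps:
            # Zero out the dropped PSU; inflate the rest of its stratum.
            w_rep = []
            for wi, hi, pi in zip(w, strata, psus):
                if hi != h:
                    w_rep.append(wi)
                elif pi == dropped:
                    w_rep.append(0.0)
                else:
                    w_rep.append(wi * n_h / (n_h - 1))
            loo.append(estimator(y, w_rep))
        center = sum(loo) / n_h
        variance += (n_h - 1) / n_h * sum((t - center) ** 2 for t in loo)
    return variance


# Illustrative data (assumed): four units, one unit per PSU.
y = [1.0, 2.0, 3.0, 4.0]
w = [1.0, 1.0, 1.0, 1.0]
v_single = stratified_psu_jackknife(y, w, ["s"] * 4, [1, 2, 3, 4])
v_two = stratified_psu_jackknife(y, w, ["a", "a", "b", "b"], [1, 2, 3, 4])
```

With a single stratum and equal weights, `v_single` reduces to the familiar `s**2 / n` for the mean, which is a convenient sanity check before swapping in a more expensive estimator.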
14 changes: 11 additions & 3 deletions docs/methodology/REGISTRY.md
@@ -4576,7 +4576,7 @@ variance from the distribution of replicate estimates.
design structure is fixed and dropped replicates contribute zero to the
sum without changing the scale. Survey df uses `n_valid - 1` for
t-based inference.
- - **Note:** Replicate-weight support matrix (12 of 15 public estimators):
+ - **Note:** Replicate-weight support matrix (13 of 20 public estimators):
- **Supported**: CallawaySantAnna (reg/ipw/dr with or without covariates,
no bootstrap; IF-based replicate variance is covariate-agnostic),
ContinuousDiD (no bootstrap), EfficientDiD (no bootstrap),
@@ -4587,9 +4587,17 @@ variance from the distribution of replicate estimates.
TwoWayFixedEffects (estimator-level refit with within-transformation),
SunAbraham (estimator-level refit, replaces `vcov_cohort`),
StackedDiD (estimator-level refit with Q-weight composition),
- ImputationDiD (two-stage refit), TwoStageDiD (two-stage refit)
+ ImputationDiD (two-stage refit), TwoStageDiD (two-stage refit),
+ ChaisemartinDHaultfoeuille (closed-form cell-collapse replicate ATT,
+ multi-horizon and placebo paths; replicate + `n_bootstrap > 0` rejected
+ — see the ChaisemartinDHaultfoeuille Notes for the allocator contract)
- **Rejected with NotImplementedError**: SyntheticDiD, TROP
- (bootstrap-based variance), BaconDecomposition (diagnostic only)
+ (bootstrap-based variance), WooldridgeDiD, LPDiD, SpilloverDiD,
+ HeterogeneousAdoptionDiD (TSL-only survey paths; replicate designs
+ rejected at `fit()`), SyntheticControl (rejects `survey_design`
+ entirely)
+ - **BaconDecomposition** is diagnostic-only — outside the 20-estimator
+ count — and likewise rejects replicate designs
- Estimators with replicate support reject replicate + bootstrap
(replicate weights provide analytical variance)
- **Note:** When invalid replicates are dropped in `compute_replicate_vcov`
5 changes: 3 additions & 2 deletions docs/practitioner_decision_tree.rst
@@ -463,9 +463,10 @@ At a Glance
What About the Other Estimators?
--------------------------------

- diff-diff has 17 estimators covering advanced scenarios: Sun-Abraham for
+ diff-diff has 20 estimators covering advanced scenarios: Sun-Abraham for
interaction-weighted estimation, Imputation DiD and Two-Stage DiD for alternative
- staggered approaches, Stacked DiD, Efficient DiD, Triple Difference, TROP, and more.
+ staggered approaches, Local Projections DiD, Stacked DiD, Efficient DiD,
+ Triple Difference, TROP, and more.
The six scenarios above cover the most common business use cases.

For the full academic decision tree with all estimators, see :doc:`choosing_estimator`.
31 changes: 30 additions & 1 deletion paper.bib
@@ -220,7 +220,7 @@ @misc{Gerber2026

@article{Abadie2010,
author = {Abadie, Alberto and Diamond, Alexis and Hainmueller, Jens},
- title = {Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program},
+ title = {Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of {California's} Tobacco Control Program},
journal = {Journal of the American Statistical Association},
volume = {105},
number = {490},
@@ -249,3 +249,32 @@ @misc{deChaisemartin2026
primaryclass = {econ.EM},
doi = {10.48550/arXiv.2405.04465}
}

+ @article{Dube2025,
+ author = {Dube, Arindrajit and Girardi, Daniele and Jord{\`a}, {\`O}scar and Taylor, Alan M.},
+ title = {A Local Projections Approach to Difference-in-Differences},
+ journal = {Journal of Applied Econometrics},
+ volume = {40},
+ number = {5},
+ pages = {741--758},
+ year = {2025},
+ doi = {10.1002/jae.70000}
+ }
+
+ @article{Binder1983,
+ author = {Binder, David A.},
+ title = {On the Variances of Asymptotically Normal Estimators from Complex Surveys},
+ journal = {International Statistical Review},
+ volume = {51},
+ number = {3},
+ pages = {279--292},
+ year = {1983},
+ doi = {10.2307/1402588}
+ }
+
+ @misc{pyfixest,
+ author = {{The PyFixest Authors}},
+ title = {{pyfixest}: Fast High-Dimensional Fixed Effect Estimation in {Python}},
+ year = {2025},
+ url = {https://github.com/py-econometrics/pyfixest}
+ }