Synthetic Difference-in-Differences (SDID)

Updated 4 February 2026
  • Synthetic Difference-in-Differences (SDID) is a causal inference method that integrates DiD and Synthetic Control features to reliably estimate treatment effects in panel data.
  • It constructs unit and time weights to optimize pre-treatment balance, thereby reducing bias and enhancing precision under latent factor models.
  • SDID employs advanced inference techniques, including bootstrap and placebo tests, to deliver robust and valid confidence intervals.

Synthetic Difference-in-Differences (SDID) is an estimator for causal inference in panel and repeated cross-sectional data that integrates key features from both Difference-in-Differences (DiD) and Synthetic Control (SC) methodologies. By constructing both unit and time weights to optimize pre-treatment balance, SDID achieves robustness to latent confounding and offers substantial improvements in bias and precision over classical methods. It is consistent under general latent factor models and enables principled inference through a suite of bootstrap and placebo-based procedures.

1. Theoretical Formulation and Assumptions

SDID is built on the potential outcomes framework for panel data. Let units $i = 1,\dots,N$ be observed over periods $t = 1,\dots,T$, with potential outcomes $Y_{it}(0)$ (untreated) and $Y_{it}(1)$ (treated). The observed outcome is $Y_{it} = W_{it} Y_{it}(1) + (1-W_{it}) Y_{it}(0)$, where $W_{it}$ is the binary treatment indicator, typically assigned to certain units after a specified treatment adoption date $T_0$.

The central goal is to impute the post-treatment counterfactuals for treated units and periods, and thereby to estimate the average treatment effect on the treated (ATT):

$$\mathsf{ATT} = \frac{1}{N_1 T_1} \sum_{i \in \mathit{Treated}} \sum_{t > T_0} [ Y_{it}(1) - Y_{it}(0) ]$$

where $N_1$ is the number of treated units and $T_1 = T - T_0$ is the number of post-treatment periods.

The untreated potential outcomes are assumed to satisfy an interactive fixed effects (latent factor) model:

$$Y_{it}(0) = \mu + \alpha_i + \beta_t + \gamma_i' \nu_t + \varepsilon_{it}$$

with $\gamma_i$ and $\nu_t$ being unobserved unit and time factors, respectively, and $\varepsilon_{it}$ a mean-zero noise term (Arkhangelsky et al., 2018, Clarke et al., 2023).
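To make the data-generating model concrete, the following sketch simulates a block-design panel from the interactive fixed effects equation above. The function name, the noise scale, the rank, and the convention that treated units occupy the last rows and post-treatment periods the last columns are all illustrative assumptions, not part of the cited papers.

```python
import numpy as np

def simulate_panel(n_units=30, n_periods=20, n_treated=5, t0=15,
                   tau=2.0, rank=2, seed=0):
    """Draw Y_it(0) = mu + alpha_i + beta_t + gamma_i' nu_t + eps_it and add
    a constant effect tau to treated units in periods after t0.
    All defaults are hypothetical values chosen for illustration."""
    rng = np.random.default_rng(seed)
    mu = 1.0
    alpha = rng.normal(size=(n_units, 1))        # unit fixed effects
    beta = rng.normal(size=(1, n_periods))       # time fixed effects
    gamma = rng.normal(size=(n_units, rank))     # latent unit factors
    nu = rng.normal(size=(rank, n_periods))      # latent time factors
    eps = 0.1 * rng.normal(size=(n_units, n_periods))
    y0 = mu + alpha + beta + gamma @ nu + eps    # untreated potential outcomes
    w = np.zeros((n_units, n_periods), dtype=int)
    w[-n_treated:, t0:] = 1                      # last rows treated after t0
    y = y0 + tau * w                             # observed outcomes
    return y, w, y0
```

Because the simulated untreated outcomes `y0` are returned alongside the observed panel, the true ATT is known (here `tau`), which makes it possible to check any estimator against it.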

Consistency of SDID requires that (i) the latent factor model is well approximated at low rank, (ii) donor (control) units span the latent space of the treated units, and (iii) there are enough pre-treatment periods and control units to estimate the balancing weights (Arkhangelsky et al., 2018, Morin, 2024). A “weighted parallel trends” assumption after adjustment is also required (Doudchenko et al., 2016).

2. Methodology and Estimator Construction

SDID employs a two-step weighting scheme: synthetic control-style unit weights and “synthetic time” weights. The resulting ATT estimator is obtained by a weighted two-way fixed effects regression.

a) Unit Weights (Synthetic Control Step)

For $N_0$ never-treated units (controls), nonnegative weights $\omega$ are chosen to match the pre-treatment mean outcome trajectory of the treated cohort. Specifically,

$$(\omega_0^*, \omega^*) = \arg\min_{\omega_0 \in \mathbb{R},\, \omega \geq 0,\, \sum_i \omega_i = 1} \sum_{t=1}^{T_0} \left[ \omega_0 + \sum_{i=1}^{N_0} \omega_i Y_{it} - \bar{Y}^\mathrm{tr}_t \right]^2 + \zeta^2 T_0 \|\omega\|_2^2$$

where $\bar{Y}^\mathrm{tr}_t$ averages the treated units' outcomes in period $t$ and $\zeta$ is a regularization parameter (Arkhangelsky et al., 2018, Quispe et al., 2024).
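The unit-weight problem is a ridge-penalized least-squares fit over the simplex. A minimal sketch follows, assuming NumPy only. The free intercept $\omega_0$ is profiled out by centering each series over time, and the remaining problem is solved with Frank-Wolfe steps, chosen here only because every iterate stays on the simplex; any quadratic programming solver would serve. The function names are hypothetical.

```python
import numpy as np

def simplex_ridge_fw(A, b, eta, n_iter=2000):
    """Frank-Wolfe for min_w ||A w - b||^2 + eta * ||w||^2 over the simplex."""
    n = A.shape[1]
    w = np.full(n, 1.0 / n)                      # start from uniform weights
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ w - b) + 2.0 * eta * w
        d = -w.copy()
        d[np.argmin(grad)] += 1.0                # direction toward best vertex
        curv = 2.0 * (np.sum((A @ d) ** 2) + eta * np.sum(d ** 2))
        if curv <= 1e-15:                        # already at that vertex
            break
        step = np.clip(-(grad @ d) / curv, 0.0, 1.0)   # exact line search
        w = w + step * d
    return w

def unit_weights(y_co_pre, y_tr_pre, zeta=0.0):
    """y_co_pre: (N0, T0) control outcomes; y_tr_pre: (N1, T0) treated outcomes,
    both restricted to pre-treatment periods. Returns (omega0, omega)."""
    t0 = y_co_pre.shape[1]
    target = y_tr_pre.mean(axis=0)               # treated mean per period
    # Centering over time is equivalent to optimizing the intercept omega0.
    A = (y_co_pre - y_co_pre.mean(axis=1, keepdims=True)).T
    b = target - target.mean()
    omega = simplex_ridge_fw(A, b, eta=zeta ** 2 * t0)
    omega0 = target.mean() - omega @ y_co_pre.mean(axis=1)
    return omega0, omega
```

Frank-Wolfe converges slowly near the optimum, so the iteration count is a tuning choice; the exact line search guarantees only that the objective never increases from its starting value at uniform weights.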

b) Time Weights ("Synthetic Time" Step)

For $T_0$ pre-treatment periods, time weights $\lambda$ are selected by matching the post-treatment average of controls:

$$(\lambda_0^*, \lambda^*) = \arg\min_{\lambda_0 \in \mathbb{R},\, \lambda \geq 0,\, \sum_t \lambda_t = 1} \sum_{i=1}^{N_0} \left[ \lambda_0 + \sum_{t=1}^{T_0} \lambda_t Y_{it} - \bar{Y}^\mathrm{ct}_i \right]^2 + \zeta^2 N_0 \|\lambda\|_2^2$$

where $\bar{Y}^\mathrm{ct}_i$ is the post-treatment mean outcome for control unit $i$ (Arkhangelsky et al., 2018, Mirzaei, 2023).
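The time-weight problem mirrors the unit-weight problem with the roles of units and periods swapped: each pre-treatment period is a "donor" and the target is each control unit's post-treatment mean. The sketch below makes that symmetry explicit. It assumes NumPy only, uses a Frank-Wolfe loop as a stand-in for a QP solver, and all function names are hypothetical.

```python
import numpy as np

def simplex_ridge_fw(A, b, eta, n_iter=2000):
    """Frank-Wolfe for min_w ||A w - b||^2 + eta * ||w||^2 over the simplex."""
    n = A.shape[1]
    w = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ w - b) + 2.0 * eta * w
        d = -w.copy()
        d[np.argmin(grad)] += 1.0                # direction toward best vertex
        curv = 2.0 * (np.sum((A @ d) ** 2) + eta * np.sum(d ** 2))
        if curv <= 1e-15:
            break
        step = np.clip(-(grad @ d) / curv, 0.0, 1.0)   # exact line search
        w = w + step * d
    return w

def time_weights(y_co_pre, y_co_post, zeta=0.0):
    """y_co_pre: (N0, T0) and y_co_post: (N0, T1) control outcomes.
    Returns (lambda0, lam) with lam on the simplex over pre-periods."""
    n0 = y_co_pre.shape[0]
    target = y_co_post.mean(axis=1)              # post-treatment mean per control
    # Centering across units is equivalent to optimizing the intercept lambda0.
    A = y_co_pre - y_co_pre.mean(axis=0, keepdims=True)
    b = target - target.mean()
    lam = simplex_ridge_fw(A, b, eta=zeta ** 2 * n0)
    lam0 = target.mean() - lam @ y_co_pre.mean(axis=0)
    return lam0, lam
```

Only control units enter this step, so the time weights can be estimated without touching any treated outcome.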

c) Weighted Two-Way Fixed Effects Regression

With estimated unit and time weights, the ATT is recovered by solving:

$$\min_{\tau, \mu, \alpha, \beta} \sum_{i=1}^{N} \sum_{t=1}^{T} \omega_i \lambda_t [ Y_{it} - \mu - \alpha_i - \beta_t - \tau W_{it}]^2$$

Equivalently, the estimator is a “double-difference” of post- versus pre-treatment outcomes, across treatment groups and periods, each reweighted to optimize pre-treatment balance (Arkhangelsky et al., 2018, Clarke et al., 2023).
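The equivalence between the weighted regression and the double difference can be checked numerically. The sketch below assumes a balanced block design with controls in the first rows and pre-treatment periods in the first columns, and it assumes that treated units and post-treatment periods receive uniform weights $1/N_1$ and $1/T_1$ (a common convention that the text above does not spell out). The intercept $\mu$ is omitted from the regression because the unit and time dummies absorb it. Function names are hypothetical.

```python
import numpy as np

def double_difference(y, n_co, t_pre, omega, lam):
    """Weighted post-minus-pre change of treated units minus that of controls."""
    pre_co, post_co = y[:n_co, :t_pre], y[:n_co, t_pre:]
    pre_tr, post_tr = y[n_co:, :t_pre], y[n_co:, t_pre:]
    delta_tr = post_tr.mean() - pre_tr.mean(axis=0) @ lam
    delta_co = omega @ (post_co.mean(axis=1) - pre_co @ lam)
    return delta_tr - delta_co

def weighted_twfe(y, n_co, t_pre, omega, lam):
    """tau from least squares on unit and time dummies with weights omega_i * lambda_t."""
    n, t = y.shape
    unit_w = np.concatenate([omega, np.full(n - n_co, 1.0 / (n - n_co))])
    time_w = np.concatenate([lam, np.full(t - t_pre, 1.0 / (t - t_pre))])
    w = np.zeros((n, t))
    w[n_co:, t_pre:] = 1.0                           # treatment indicator W_it
    unit = np.kron(np.eye(n), np.ones((t, 1)))       # unit dummies, row-major order
    time = np.kron(np.ones((n, 1)), np.eye(t))       # time dummies
    X = np.column_stack([w.reshape(-1), unit, time])
    sw = np.sqrt(np.outer(unit_w, time_w).reshape(-1))
    # Scaling rows by sqrt(weight) turns weighted LS into ordinary LS.
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y.reshape(-1) * sw, rcond=None)
    return coef[0]
```

The dummy design is rank deficient, so `lstsq` returns a minimum-norm solution; the coefficient on the treatment indicator is still uniquely determined. In practice the double-difference form is cheaper, since it needs no design matrix.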

d) Generalization and Relation to Existing Methods

  • If $\omega_i \equiv 1/N_0$, $\lambda_t \equiv 1/T_0$: reduces to standard two-way FE DiD.
  • If $\alpha_i \equiv 0$, $\lambda_t \equiv 1/T_0$: reduces to classical SC estimator.
  • SDID thus nests both the DiD and SC approaches as special cases (Doudchenko et al., 2016).
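The first reduction can be verified by hand. Writing the SDID estimate in double-difference form (assumed here for a block design, with "tr" and "co" denoting averages over treated and control units, and "post" and "pre" denoting averages over $t > T_0$ and $t \le T_0$), uniform weights collapse it to the familiar four-means DiD contrast:

```latex
\begin{aligned}
\hat{\tau}^{\mathrm{sdid}}
 &= \Big(\bar{Y}_{\mathrm{tr},\mathrm{post}} - \sum_{t=1}^{T_0} \lambda_t \bar{Y}_{\mathrm{tr},t}\Big)
  - \sum_{i=1}^{N_0} \omega_i \Big(\bar{Y}_{i,\mathrm{post}} - \sum_{t=1}^{T_0} \lambda_t Y_{it}\Big) \\
 &\overset{\omega_i = 1/N_0,\ \lambda_t = 1/T_0}{=}
  \big(\bar{Y}_{\mathrm{tr},\mathrm{post}} - \bar{Y}_{\mathrm{tr},\mathrm{pre}}\big)
  - \big(\bar{Y}_{\mathrm{co},\mathrm{post}} - \bar{Y}_{\mathrm{co},\mathrm{pre}}\big)
  = \hat{\tau}^{\mathrm{did}}
\end{aligned}
```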

3. Extensions: Staggered Adoption, Event Studies, and Repeated Cross Sections

a) Staggered Treatment and Event-Study Estimation

For staggered adoption (treatment at multiple points in time), cohort-by-cohort SDID is implemented. For each adoption cohort $a$, weights $(\omega^a, \lambda^a)$ are estimated, and cohort-specific ATTs are aggregated:

$$\widehat{ATT} = \sum_{a} \frac{T^{a}_{post}}{T_{post}} \hat{\tau}_a$$
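The aggregation step is a weighted average of cohort-level estimates. The sketch below assumes that $T_{post}$ is the sum of the cohort-level counts $T^a_{post}$, which the formula implies but the text does not state; the function name and the example numbers are made up for illustration.

```python
def aggregate_cohort_atts(tau_by_cohort, post_by_cohort):
    """Combine cohort ATTs with weights T_post^a / T_post.

    tau_by_cohort:  {cohort label: estimated tau_a}
    post_by_cohort: {cohort label: T_post^a}, the cohort's post-treatment count
    Assumes T_post is the sum of the cohort counts.
    """
    total = sum(post_by_cohort.values())
    return sum(post_by_cohort[a] / total * tau
               for a, tau in tau_by_cohort.items())

# Hypothetical cohorts adopting in periods 4 and 7 (illustrative numbers only).
att = aggregate_cohort_atts({4: 1.0, 7: 3.0}, {4: 6, 7: 2})
```

Cohorts with longer post-treatment exposure receive proportionally more weight, so the aggregate is not a simple mean of the cohort estimates.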

Event-study (dynamic effect) estimators decompose SDID into period-by-period effects, facilitating the recovery of a full event-time response curve (Ciccia, 2024).

b) Sequential SDID for Panel Event Studies

The Sequential SDID estimator iteratively applies SDID imputation to aggregated cohort data, treating prior cohort effects as known features for later cohorts. The sequential estimator is asymptotically equivalent to an oracle OLS estimator in a linear interactive fixed effects model:

$$\hat{\tau}_{a,k}^{SSDiD} = Y_{a,a+k} - \sum_{j > a} \omega^{(a,k)}_j Y_{j,a+k} - \sum_{l < a+k} \lambda^{(a,k)}_l [Y_{a,l} - \sum_{j > a} \omega^{(a,k)}_j Y_{j,l}]$$

with weights $(\omega^{(a,k)}, \lambda^{(a,k)})$ determined by pre-treatment minimization problems (Arkhangelsky et al., 2024).
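Given the weights, the sequential estimate is a plain evaluation of the formula above. The sketch below covers only that evaluation step and takes the weights as inputs; it does not implement the weight estimation or the sequential imputation across cohorts. It assumes a cohort-by-period matrix whose rows are sorted by adoption date, so that "cohorts $j > a$" are the rows after the target row. All names are hypothetical.

```python
import numpy as np

def ssdid_effect(y, row, period, omega, lam):
    """Evaluate the displayed formula for one (cohort, horizon) pair.

    y:      (n_cohorts, n_periods) aggregated outcomes, rows sorted by adoption date
    row:    row index of the target cohort a
    period: column index of the target period a + k
    omega:  weights over the later-adopting cohorts y[row + 1:]
    lam:    weights over the earlier periods y[:, :period]
    """
    later = y[row + 1:, :]
    gap_now = y[row, period] - omega @ later[:, period]      # contrast at a + k
    gap_past = y[row, :period] - omega @ later[:, :period]   # contrasts for l < a + k
    return gap_now - lam @ gap_past
```

When both weight vectors sum to one, purely additive cohort and period effects cancel in this expression, which is the same differencing logic as in the block-design estimator.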

c) Repeated Cross-Sectional Data

RC-SDID adapts SDID for repeated cross-sectional settings where group-period cells have unequal sizes. After group-level aggregation, an additional $1/N_{kt}$ cross-sectional weight is applied per individual, preserving unbiasedness under the latent factor model even with heteroskedastic cell sizes (Morin, 2024, Sun et al., 2025).
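The aggregation and cell-size weighting can be illustrated as follows. This is a sketch of the bookkeeping only, under the assumptions that individuals carry integer group and period codes and that every group-period cell is non-empty; it is not the RC-SDID estimator itself, and the function name is hypothetical.

```python
import numpy as np

def cell_means(group, period, outcome, n_groups, n_periods):
    """Collapse individual-level data to group-period cells.

    Returns the cell means, the cell sizes N_kt, and each individual's
    1 / N_kt weight. Assumes every cell contains at least one individual.
    """
    counts = np.zeros((n_groups, n_periods))
    sums = np.zeros((n_groups, n_periods))
    np.add.at(counts, (group, period), 1.0)      # N_kt per cell
    np.add.at(sums, (group, period), outcome)    # outcome totals per cell
    indiv_w = 1.0 / counts[group, period]        # 1 / N_kt for each individual
    return sums / counts, counts, indiv_w
```

With these weights every cell contributes total weight one regardless of how many individuals it contains, so the resulting group-by-period matrix can be passed to the panel weighting steps in place of unit-level outcomes.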

4. Identification, Robustness, and Semiparametric Properties

SDID identification is doubly robust in the sense of Sun et al. (2025): the estimator is consistent if either parallel trends holds (as in DiD) or the synthetic control construction is valid (as in SC). The doubly robust moment function

$$\varphi(S_i;m_s,p,w;\pi_1) = \frac{1}{\pi_1} \{ G_{1i} [ \Delta Y_i - m_s(X_i) ] - \sum_{g=2}^{NG+1} w_g(X_i) G_{gi} [ \Delta Y_i - m_s(X_i) ] \frac{p_1(X_i)}{p_g(X_i)} \}$$

identifies the ATT under either assumption.

Orthogonality (Neyman orthogonality) holds under parallel trends, making the estimator amenable to semiparametric/machine learning adjustment for high-dimensional covariates, while non-orthogonality under SC requires accounting for the first-stage estimation error.

5. Inference Procedures

Several resampling-based techniques, including bootstrap and placebo procedures, are used to construct confidence intervals for SDID.
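A placebo standard error can be sketched as follows: treatment is repeatedly reassigned to randomly chosen control units, the estimator is rerun on the control-only panel, and the spread of the placebo estimates serves as the standard error. The code is a generic sketch under that assumption and takes the estimator as a callable. The simple DiD function is a stand-in used only to keep the example self-contained, not the SDID estimator. All names are hypothetical.

```python
import numpy as np

def did_estimate(y, n_co, t_pre):
    """Stand-in estimator: four-means DiD with controls in the first n_co rows."""
    co, tr = y[:n_co], y[n_co:]
    return ((tr[:, t_pre:].mean() - tr[:, :t_pre].mean())
            - (co[:, t_pre:].mean() - co[:, :t_pre].mean()))

def placebo_se(y_co, n_treated, t_pre, estimator, n_reps=200, seed=0):
    """Standard deviation of estimates under placebo treatment assignment.

    y_co: (N0, T) outcomes of control units only, so the true effect is zero.
    estimator: callable (y, n_co, t_pre) -> float.
    """
    rng = np.random.default_rng(seed)
    n_co = y_co.shape[0]
    draws = np.empty(n_reps)
    for b in range(n_reps):
        perm = rng.permutation(n_co)
        # The last n_treated rows of the permuted panel act as placebo treated units.
        draws[b] = estimator(y_co[perm], n_co - n_treated, t_pre)
    return draws.std()
```

The placebo approach needs more control units than treated units and relies on controls and treated units having similar noise distributions; a bootstrap over units is the usual alternative when many units are treated.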

6. Empirical and Simulation Performance

Simulation studies consistently demonstrate that SDID outperforms standard DiD and SC estimators in mean-squared prediction error, bias, and coverage under interactive fixed effects (latent factor) models, especially when parallel trends are violated or the treated units lie outside the convex hull of the controls (Arkhangelsky et al., 2018, Doudchenko et al., 2016, Morin, 2024, Arkhangelsky et al., 2024). In canonical policy evaluation applications (e.g., California smoking, German reunification, Mariel Boatlift), SDID displays lower bias and variance as well as credible standard errors (Doudchenko et al., 2016).

Empirical applications, including the impact of ChatGPT bans on software development productivity and the effect of the Alaska minimum wage increase on family income, adopt SDID for estimation and inference, finding stable and interpretable effects with robust uncertainty quantification (Quispe et al., 2024, Sun et al., 2025).

7. Implementation and Computational Considerations

SDID is implemented in standard statistical software via open-source packages (e.g., R: synthdid; Stata: sdid, sdid_event) (Clarke et al., 2023, Ciccia, 2024). The estimation procedure entails solving regularized quadratic programs for unit and time weights under simplex constraints. The regularization parameter is set either by a closed-form rule or by cross-validation to balance bias and variance. For repeated cross-sectional data, standard QP solvers accommodate the cell-weight adjustments (Morin, 2024).
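One closed-form choice for the regularization parameter scales the standard deviation of first-differenced control outcomes by $(N_1 T_1)^{1/4}$, following the rule proposed in Arkhangelsky et al. The sketch below assumes that rule and uses the population (ddof = 0) standard deviation; packages may normalize differently, so the exact constant should be treated as an assumption. The function name is hypothetical.

```python
import numpy as np

def zeta_closed_form(y_co_pre, n_treated, t_post):
    """zeta = (N1 * T1)^(1/4) * sigma, where sigma is the standard deviation of
    one-period changes in control outcomes over the pre-treatment periods."""
    diffs = np.diff(y_co_pre, axis=1)    # Y_{i,t+1} - Y_{it} for each control unit
    sigma = diffs.std()                  # ddof=0 by assumption
    return (n_treated * t_post) ** 0.25 * sigma
```

Tying the penalty to the scale of period-to-period changes keeps the unit weights from concentrating on a few donors when outcomes are noisy.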

Extensions include the handling of covariates (via projected or optimized approaches), multi-period and cohort-specific treatment adoption, event-study effect decomposition, and machine learning-based flexible adjustment for nuisance functions (Sun et al., 2025, Clarke et al., 2023, Ciccia, 2024).


References: Key methodological contributions are synthesized from Arkhangelsky et al. (2018), Clarke et al. (2023), Sun et al. (2025), Doudchenko et al. (2016), Arkhangelsky et al. (2024), Ciccia (2024), Morin (2024), and Quispe et al. (2024). These works formalize SDID’s properties, extensions, and implementation.