Synthetic Control Method (SCM)
- Synthetic Control Method (SCM) is a data-driven causal inference approach that constructs a weighted counterfactual from a pool of control units.
- It leverages pre-intervention outcome matching and convex combinations to estimate what would have happened in the absence of treatment.
- Extensions of SCM, including augmented methods and dynamic predictor weighting, improve fit diagnostics and address donor selection challenges.
The synthetic control method (SCM) is a data-driven approach for causal inference in comparative case studies, particularly when a single unit (individual, region, organization) is exposed to a treatment or intervention at a specific time, while a pool of other similar units remains untreated. SCM constructs a weighted average of the untreated units (“donors”) to serve as a counterfactual for the treated unit, leveraging longitudinal (panel) data to estimate what would have happened to the treated unit absent the intervention. This framework enables estimation of time-varying treatment effects and facilitates rigorous quantitative policy evaluation in observational settings, especially when randomized experiments are infeasible.
1. Formal Framework and Core Algorithm
SCM targets the problem of estimating the causal effect of an intervention on a single treated unit using a convex combination of donor (control) units matched on pre-intervention outcomes, and optionally covariates. Let $Y_{it}$ denote the outcome for unit $i$ at time $t$; without loss of generality, unit $1$ undergoes treatment at time $T_0$, while units $j = 2, \dots, J+1$ are controls. The objective is to select weights $w = (w_2, \dots, w_{J+1})$ solving: $\min_{w} \sum_{t < T_0} \left( Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt} \right)^2$ subject to $w_j \geq 0$ and $\sum_{j=2}^{J+1} w_j = 1$.
The post-treatment counterfactual is then estimated as $\hat{Y}_{1t}^{N} = \sum_{j=2}^{J+1} w_j^{*} Y_{jt}$ for $t \geq T_0$ (Sun, 26 Oct 2025). SCM can be implemented with arbitrary additional covariates or pre-treatment predictors via an analogous matching objective.
Key properties of this construction:
- Convexity: The synthetic control is restricted to the convex hull of control unit outcomes, avoiding extrapolation.
- No-interference: Standard SCM assumes no spillover of treatment across units (SUTVA).
- Exact or approximate pre-fit: SCM is predicated on achieving minimal imbalance in pre-intervention periods; imperfect pre-fit can lead to bias.
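The weight-selection problem in this section can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in any cited study: the function name `scm_weights`, the array layout (rows are pre-treatment periods, columns are donors), and the toy data are assumptions made for the example, and SciPy's general-purpose SLSQP solver stands in for a dedicated quadratic-programming routine.

```python
import numpy as np
from scipy.optimize import minimize


def scm_weights(y_treated_pre, Y_donors_pre):
    """Simplex-constrained least squares for synthetic control weights.

    y_treated_pre: shape (T_pre,), treated unit's pre-treatment outcomes.
    Y_donors_pre:  shape (T_pre, J), one column per donor unit.
    """
    y = np.asarray(y_treated_pre, dtype=float)
    Y = np.asarray(Y_donors_pre, dtype=float)
    n_donors = Y.shape[1]

    def loss(w):
        resid = y - Y @ w
        return resid @ resid

    def grad(w):
        return -2.0 * Y.T @ (y - Y @ w)

    result = minimize(
        loss,
        x0=np.full(n_donors, 1.0 / n_donors),  # start from uniform weights
        jac=grad,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_donors,  # non-negativity
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],  # sum to one
        options={"ftol": 1e-12, "maxiter": 500},
    )
    # Remove tiny negative values left by the solver and renormalize.
    w = np.clip(result.x, 0.0, None)
    return w / w.sum()


# Toy data (made up): the treated unit is an exact 0.7/0.3 mix of donors 0 and 1.
rng = np.random.default_rng(0)
Y_pre = rng.normal(size=(12, 3)).cumsum(axis=0)
y_pre = 0.7 * Y_pre[:, 0] + 0.3 * Y_pre[:, 1]
w_hat = scm_weights(y_pre, Y_pre)

# Counterfactual path: apply the same weights to the donors' post-treatment outcomes.
Y_post = rng.normal(size=(4, 3))
counterfactual = Y_post @ w_hat
```

Because the weights are non-negative and sum to one, every value of `counterfactual` lies between the smallest and largest donor outcome in that period, which is the no-extrapolation property listed above.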
2. Inference, Uncertainty Quantification, and Test Statistics
SCM relies on the observed outcome trajectories of the donor and treated units. Inference on post-treatment effects is often carried out using permutation (“placebo-in-space”) tests, in which each donor is treated in turn as if it had received the intervention:
- RMSPE ratio: Compares the ratio of post-to-pre-treatment root-mean-squared prediction error of the treated unit against the placebo distribution over the donor pool.
- Post-treatment gap: The mean deviation between observed and synthetic outcomes after treatment, tested against the corresponding placebo distribution.
Permutation $p$-values are calculated as the empirical fraction of control units with more extreme test statistics than the treated unit (Sun, 26 Oct 2025). For example, in one real-world application, the RMSPE-ratio test gave a $p$-value significant at the 10% level, whereas the post-treatment gap was not statistically significant at conventional levels.
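The RMSPE-ratio test can be illustrated with a short sketch. It assumes the gap series (observed minus synthetic) have already been computed for the treated unit and for each donor refit as a placebo; the function names, the array layout, and the numbers are hypothetical. The $p$-value follows the definition in the text (fraction of control units with a more extreme statistic); a common variant also counts the treated unit itself so that the $p$-value is never exactly zero.

```python
import numpy as np


def rmspe(gap):
    """Root-mean-squared prediction error of a gap series (observed - synthetic)."""
    gap = np.asarray(gap, dtype=float)
    return np.sqrt(np.mean(gap ** 2))


def rmspe_ratio_test(gaps, n_pre, treated_index=0):
    """Post/pre RMSPE ratios and a placebo-in-space p-value.

    gaps:  shape (n_units, T); row i is the gap series for unit i.
    n_pre: number of pre-treatment periods (the first n_pre columns).
    """
    gaps = np.asarray(gaps, dtype=float)
    ratios = np.array([rmspe(g[n_pre:]) / rmspe(g[:n_pre]) for g in gaps])
    treated_ratio = ratios[treated_index]
    placebo_ratios = np.delete(ratios, treated_index)
    # Fraction of control units whose statistic is more extreme than the treated one.
    p_value = np.mean(placebo_ratios > treated_ratio)
    return ratios, p_value


# Made-up gaps: 4 pre-treatment periods followed by 2 post-treatment periods.
gaps = np.array([
    [0.2, -0.2, 0.2, -0.2, 1.0, 1.0],   # treated unit
    [0.5, -0.5, 0.5, -0.5, 0.5, -0.5],  # placebo donor
    [0.2, -0.2, 0.2, -0.2, 0.4, -0.4],  # placebo donor
    [0.1, -0.1, 0.1, -0.1, 1.0, 1.0],   # placebo donor with a large ratio
])
ratios, p_value = rmspe_ratio_test(gaps, n_pre=4)
```

With a small donor pool the attainable $p$-values are coarse (multiples of one over the number of donors), which is worth keeping in mind when interpreting significance levels.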
3. Practical Implementation and Tuning
SCM requires several choices regarding donor selection, predictor sets, and implementation details:
- Donor pool: Units matched on characteristics affecting pre-treatment trend and with complete outcome data. Practical implementations often prune donors with missing data or low correlation to the treated trajectory before treatment (Sun, 26 Oct 2025).
- Predictors and their weights: Standard SCM allows inclusion of non-outcome covariates (e.g., demographic, economic, structural features). Weighting matrices (denoted $V$) can emphasize more predictive pre-treatment periods or covariates. In the cited Altadena wildfire study, only lagged outcomes were used and pre-treatment periods were exponentially downweighted to emphasize recent months.
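- Weight computation: The optimal weight vector $w^{*}$ is computed by quadratic programming, since the objective is quadratic in the weights and the constraints are linear. Modern implementations often use efficient convex solvers.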
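The donor-pruning and period-weighting choices above might look like the following sketch. The correlation threshold `min_corr`, the decay rate `decay`, and the toy arrays are hypothetical values chosen for illustration; the source does not report the settings used in the cited study.

```python
import numpy as np


def prune_donors(y_treated_pre, Y_donors_pre, min_corr=0.5):
    """Indices of donors with complete pre-treatment data and correlation >= min_corr."""
    y = np.asarray(y_treated_pre, dtype=float)
    Y = np.asarray(Y_donors_pre, dtype=float)
    keep = []
    for j in range(Y.shape[1]):
        col = Y[:, j]
        if np.isnan(col).any():
            continue  # drop donors with missing outcomes
        if np.corrcoef(y, col)[0, 1] >= min_corr:
            keep.append(j)
    return np.array(keep, dtype=int)


def exponential_time_weights(n_pre, decay=0.9):
    """Period weights proportional to decay**lag, normalized to sum to one."""
    lags = np.arange(n_pre - 1, -1, -1)  # the oldest period has the largest lag
    v = decay ** lags
    return v / v.sum()


# Made-up pre-treatment data: rows are periods, columns are candidate donors.
y_pre = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y_pre = np.array([
    [2.0, 5.0, 1.0],
    [4.0, 4.0, np.nan],
    [6.0, 3.0, 3.0],
    [8.0, 2.0, 4.0],
    [10.0, 1.0, 5.0],
])
keep = prune_donors(y_pre, Y_pre)  # donor 1 is anti-correlated, donor 2 has a gap
v = exponential_time_weights(len(y_pre))

# Scaling each row by sqrt(v_t) turns the weighted objective
# sum_t v_t * (y_t - Y_t w)^2 into an ordinary least-squares objective.
row_scale = np.sqrt(v)
y_scaled = row_scale * y_pre
Y_scaled = row_scale[:, None] * Y_pre[:, keep]
```

The scaled arrays can then be passed to any simplex-constrained least-squares solver, so period weighting requires no change to the optimization itself.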
Fit diagnostics such as pre-treatment RMSPE indicate the quality of the synthetic control approximation. In high-quality SCM applications (e.g., Altadena), pre-intervention RMSPE can be as low as 0.61% of the treated unit's mean value.
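As a small worked example of this diagnostic, pre-treatment RMSPE can be expressed as a percentage of the treated unit's mean outcome. The function name and the numbers below are invented for illustration and are unrelated to the Altadena figures.

```python
import numpy as np


def pre_fit_rmspe_pct(y_treated_pre, y_synth_pre):
    """Pre-treatment RMSPE as a percentage of the treated unit's mean outcome."""
    y = np.asarray(y_treated_pre, dtype=float)
    s = np.asarray(y_synth_pre, dtype=float)
    rmspe = np.sqrt(np.mean((y - s) ** 2))
    return 100.0 * rmspe / np.mean(y)


# Made-up series: every pre-treatment gap is +1 or -1 around a mean of 100.
y_obs = np.array([100.0, 102.0, 98.0, 100.0])
y_syn = np.array([101.0, 101.0, 99.0, 99.0])
fit_pct = pre_fit_rmspe_pct(y_obs, y_syn)  # RMSPE of 1 on a mean of 100
```

Normalizing by the mean makes the diagnostic comparable across outcomes measured on different scales, although it is only meaningful when the mean is well away from zero.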
4. Extensions and Variants
A range of SCM extensions address methodological challenges:
- Augmented SCM: Incorporates bias correction when exact pre-fit is unachievable via outcome modeling (e.g., penalized regression) (Ben-Michael et al., 2018).
- Penalized, Model-Averaged, and Covariate-balanced SCM: Regularization and model averaging mitigate overfitting and improve risk properties when the number of controls is large or pre-treatment fit is poor (Pouliot et al., 2022).
- Staggered adoption: The framework can be generalized to settings where units receive treatment at different times, requiring partially pooled weights that minimize imbalance both for each unit individually and for the pooled treated average (Ben-Michael et al., 2019).
- Dynamic predictor weighting: Exponential or time-varying weights can emphasize recent outcomes or particularly informative covariates (Sun, 26 Oct 2025).
These variants retain the core structure of SCM—using a convex combination of controls as the counterfactual—but relax or adapt various modeling constraints and balance objectives.
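To make the bias-correction idea behind augmented SCM concrete, the sketch below adds an outcome-model adjustment to a plain SCM estimate for a single post-treatment period. It assumes a ridge regression of donors' post-treatment outcomes on their pre-treatment outcomes as the outcome model; the function names, the penalty `lam`, the uniform placeholder weights, and the data are all illustrative assumptions rather than a reference implementation. Note that donors are stored as rows here.

```python
import numpy as np


def ridge_coefficients(X, y, lam):
    """Ridge slope coefficients on centered data (the intercept is not penalized)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)


def augmented_counterfactual(x_treated, X_donors, y_donors_post, w_scm, lam=1.0):
    """SCM estimate plus an outcome-model correction for leftover imbalance.

    x_treated:     shape (T_pre,), treated unit's pre-treatment outcomes.
    X_donors:      shape (J, T_pre), one row per donor.
    y_donors_post: shape (J,), donors' outcomes in one post-treatment period.
    w_scm:         shape (J,), synthetic control weights that sum to one.
    """
    eta = ridge_coefficients(X_donors, y_donors_post, lam)
    scm_estimate = w_scm @ y_donors_post
    imbalance = x_treated - w_scm @ X_donors  # pre-treatment misfit left by SCM
    return scm_estimate + imbalance @ eta


# Made-up data: the treated unit lies outside the donors' convex hull,
# so no convex combination can match it exactly.
X_donors = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
y_donors_post = X_donors @ np.array([1.0, 2.0]) + 0.5
x_treated = np.array([5.0, 6.0])
w_scm = np.full(4, 0.25)  # placeholder weights standing in for a fitted SCM solution

plain = w_scm @ y_donors_post
augmented = augmented_counterfactual(x_treated, X_donors, y_donors_post, w_scm, lam=1e-9)
```

When the synthetic control matches the treated unit exactly, `imbalance` is zero and the augmented estimate coincides with plain SCM; otherwise the correction extrapolates along the fitted outcome model, which is why the implied weights can fall outside the simplex.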
5. Limitations, Identification, and Statistical Properties
The validity of SCM estimates is contingent on several identification and modeling assumptions:
- **Conv