Finite Population Causal Effects
- Finite population causal effects are estimands derived from a fixed set of units where randomness arises solely from the treatment assignment mechanism.
- Methodologies include nonparametric, Bayesian, and design-based estimation techniques, with difference-in-means and Horvitz-Thompson estimators providing unbiased point estimates.
- Applications span randomized experiments, panel and factorial designs, providing robust policy insights without relying on superpopulation assumptions.
Finite population causal effects refer to causal estimands, inferential procedures, and identification strategies in which all randomness arises from the assignment mechanism acting on a fixed collection of units that is not treated as a sample from a superpopulation. Inference targets factual contrasts or distributions of causal effects across the specific finite population at hand. The finite population perspective underpins design-based inference used in randomized experiments, panel designs, factorial studies, instrumental variable settings, and compliance-adjusted analyses. This approach contrasts with superpopulation-based inference, which presumes units are sampled from an underlying infinite population and employs stochastic models for unobserved potential outcomes. Modern finite population causal inference theory encompasses identification, nonparametric and Bayesian estimation, central limit theorems, variance estimation, extensions to dynamic and factorial settings, and exact methods for both sharp and weak null hypotheses.
1. Formal Setup and Core Estimands
The finite population framework fixes a collection of $N$ units, each with a potential outcome for every possible treatment or treatment path. For a binary treatment, each unit $i$ has two potential outcomes, $Y_i(1)$ and $Y_i(0)$, under SUTVA (Ding et al., 2017). The main estimand is the population-average causal effect:

$$\tau = \frac{1}{N} \sum_{i=1}^{N} \left[ Y_i(1) - Y_i(0) \right].$$

For multi-arm, factorial, or longitudinal designs, potential outcomes generalize to $Y_i(z)$ for $z$ in the full treatment grid, or $Y_{i,t}(w_{1:t})$ for dynamic panels, where $w_{1:t}$ denotes the assignment path up to period $t$ (Bojinov et al., 2020, Lu, 2016). Finite population estimands include:
- Average treatment effects (ATE)
- Conditional mean/median/mode contrasts, particularly for non-numeric ordinal outcomes (Volfovsky et al., 2015)
- Product moments and higher-order moments of individual treatment effects (Kawakami et al., 8 May 2025)
- Factorial effect contrasts with specified contrast weights (Mukerjee et al., 2016)
- Dynamic/panel lagged effects quantifying alternative paths (Bojinov et al., 2020)
Identification in this setting does not require distributional assumptions over an underlying superpopulation but relies on the properties induced by randomization and exclusion restrictions.
2. Identification and Nonparametric Estimation Procedures
Finite-population causal effects are identifiable under the known randomization mechanism, provided positivity holds (every unit has a nonzero probability of assignment to each treatment). Under complete randomization, the difference-in-means estimator is unbiased for the finite-population average causal effect (Ding et al., 2017).
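As a minimal sketch of this unbiasedness property, the Python snippet below enumerates every assignment of a completely randomized design on a small, made-up science table and averages the resulting difference-in-means estimates. The outcome values, the population size of six, and the helper name `diff_in_means` are illustrative assumptions, not taken from the cited papers.

```python
from itertools import combinations

import numpy as np

# Hypothetical science table for N = 6 units (illustrative values only).
y0 = np.array([1.0, 2.0, 0.0, 3.0, 1.0, 2.0])  # Y_i(0)
y1 = np.array([2.0, 2.0, 1.0, 5.0, 0.0, 4.0])  # Y_i(1)
N, n1 = len(y0), 3

# Finite-population average causal effect (the estimand).
tau = np.mean(y1 - y0)


def diff_in_means(treated):
    """Difference in observed arm means when `treated` indexes the treated units."""
    z = np.zeros(N, dtype=bool)
    z[list(treated)] = True
    return y1[z].mean() - y0[~z].mean()


# Complete randomization: every subset of size n1 is equally likely,
# so the design expectation is a plain average over all subsets.
estimates = np.array([diff_in_means(t) for t in combinations(range(N), n1)])
expected_estimate = estimates.mean()

print(f"tau = {tau:.4f}, E[estimate] = {expected_estimate:.4f}")
```

Because the enumeration covers all 20 equally likely assignments, `expected_estimate` agrees with `tau` up to floating-point error, even though individual assignments can miss `tau` by a wide margin.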
For multi-arm or factorial settings, contrasts of Horvitz-Thompson or regression-adjusted estimators identify population-level effects (Lu, 2016, Mukerjee et al., 2016). In panel experiments, lag-$p$ dynamic effects are unbiasedly estimable by adapted Horvitz-Thompson formulas utilizing assignment-path probabilities (Bojinov et al., 2020, Picchetti, 23 Jan 2026). For ordinal non-numeric outcomes, joint and conditional joint distributions of the potential outcomes serve as the primary estimands, necessitating imputation or model-based procedures (Volfovsky et al., 2015).
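The Horvitz-Thompson idea can be illustrated in its simplest form, a single-period binary treatment with unit-specific assignment probabilities. The sketch below assumes independent (Bernoulli) assignment with known probabilities `p`; the data, the probabilities, and the helper names are hypothetical, and the panel and factorial versions in the cited papers replace `p` with assignment-path probabilities.

```python
from itertools import product

import numpy as np

# Hypothetical science table and known assignment probabilities (assumed values).
y0 = np.array([1.0, 2.0, 0.0, 3.0])  # Y_i(0)
y1 = np.array([2.0, 2.0, 1.0, 5.0])  # Y_i(1)
p = np.array([0.2, 0.5, 0.7, 0.4])   # P(Z_i = 1), strictly between 0 and 1
N = len(y0)
tau = np.mean(y1 - y0)


def ht_estimate(z):
    """Horvitz-Thompson estimate of tau: inverse-probability-weighted contrast."""
    z = np.asarray(z, dtype=float)
    y_obs = z * y1 + (1 - z) * y0  # only one potential outcome is observed
    return np.mean(z * y_obs / p - (1 - z) * y_obs / (1 - p))


def assignment_prob(z):
    """Probability of the assignment vector z under independent Bernoulli draws."""
    z = np.asarray(z)
    return float(np.prod(np.where(z == 1, p, 1 - p)))


# Exact design expectation: weight each of the 2^N assignments by its probability.
ht_expectation = sum(
    assignment_prob(z) * ht_estimate(z) for z in product([0, 1], repeat=N)
)

print(f"tau = {tau:.4f}, E[HT] = {ht_expectation:.4f}")
```

The weighting by `1 / p` and `1 / (1 - p)` is what makes positivity necessary: a unit with probability zero or one for some arm would produce an undefined weight.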
Estimation of higher-order moments and product moments of causal effects requires knowledge or bounds on the joint distribution of potential outcomes. Under randomization with monotonicity, plug-in and integral-based estimators are consistent; when monotonicity fails, sharp Fréchet-Hoeffding bounds yield partial identification (Kawakami et al., 8 May 2025).
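One simple instance of Fréchet-Hoeffding-type partial identification concerns the variance of unit-level effects. With both marginal outcome distributions held fixed, that variance is smallest when treated and control outcomes are paired in the same rank order (comonotone coupling) and largest when paired in opposite rank order (countermonotone coupling). The sketch below assumes equal-sized outcome vectors that stand in for the two marginals; the function name `effect_variance_bounds` is hypothetical, and this is a simplified illustration rather than the estimator of Kawakami et al.

```python
import numpy as np


def effect_variance_bounds(y1_values, y0_values):
    """Bounds on Var(Y(1) - Y(0)) over all pairings of two equal-sized marginals.

    Returns (lower, upper) using population variances (divisor N).
    """
    a = np.sort(np.asarray(y1_values, dtype=float))
    b = np.sort(np.asarray(y0_values, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal-sized marginals")
    lower = float(np.var(a - b))        # same rank order: maximal covariance
    upper = float(np.var(a - b[::-1]))  # opposite rank order: minimal covariance
    return lower, upper


# Illustrative marginals (made-up values).
y1 = [2.0, 2.0, 1.0, 5.0]
y0 = [1.0, 2.0, 0.0, 3.0]
lower, upper = effect_variance_bounds(y1, y0)

# If the full science table were known, the true pairing would give this variance.
true_var = float(np.var(np.array(y1) - np.array(y0)))
print(f"bounds = [{lower}, {upper}], variance under the listed pairing = {true_var}")
```

The interval is wide whenever the marginals overlap substantially, which is why additional structure such as monotonicity is needed for point identification.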
3. Asymptotic Theory and Finite Population Central Limit Theorems
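Finite-population central limit theorems provide normal approximations to causal estimators as the population size $N$ increases, assuming boundedness and no dominating unit effects (Li et al., 2016, Pashley, 2019). For $Q$ treatment arms and a vector contrast $\tau$ with estimator $\hat{\tau}$:

$$\sqrt{N}\,(\hat{\tau} - \tau) \xrightarrow{d} \mathcal{N}(0, V),$$

where $V$ is the limiting finite-population covariance. Variance formulas reflect randomization and account for non-additivity; in the binary case with $N_1$ treated and $N_0$ control units under complete randomization:

$$\mathrm{Var}(\hat{\tau}) = \frac{S_1^2}{N_1} + \frac{S_0^2}{N_0} - \frac{S_\tau^2}{N},$$

where $S_1^2$, $S_0^2$, and $S_\tau^2$ are the finite-population variances (divisor $N-1$) of $Y_i(1)$, $Y_i(0)$, and the unit-level effects $Y_i(1) - Y_i(0)$.

CLTs extend to regression adjustment, instrumental variables, rerandomization, clustering, and factorial designs; generalizations apply to nonlinear estimands under the finite-population delta method (Pashley, 2019). In panel and dynamic designs, martingale-difference CLTs hold for time-averaged, unit-averaged, and aggregate causal effect estimators (Bojinov et al., 2020).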
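For a binary treatment under complete randomization, the classical Neyman expression for the design variance of the difference in means is $S_1^2/N_1 + S_0^2/N_0 - S_\tau^2/N$, where the last term vanishes only when unit-level effects are constant. The sketch below checks this expression against the exact randomization variance obtained by enumeration; the science table is made up, and the variable names are illustrative.

```python
from itertools import combinations

import numpy as np

# Hypothetical science table (illustrative values only).
y0 = np.array([1.0, 2.0, 0.0, 3.0, 1.0, 2.0])  # Y_i(0)
y1 = np.array([2.0, 2.0, 1.0, 5.0, 0.0, 4.0])  # Y_i(1)
N, n1 = len(y0), 3
n0 = N - n1

# Finite-population variances with divisor N - 1.
s1_sq = np.var(y1, ddof=1)
s0_sq = np.var(y0, ddof=1)
s_tau_sq = np.var(y1 - y0, ddof=1)  # zero only under additivity

neyman_var = s1_sq / n1 + s0_sq / n0 - s_tau_sq / N

# Exact randomization variance: enumerate all equally likely assignments.
estimates = []
for treated in combinations(range(N), n1):
    z = np.zeros(N, dtype=bool)
    z[list(treated)] = True
    estimates.append(y1[z].mean() - y0[~z].mean())
exact_var = np.var(np.array(estimates))  # divisor = number of assignments

print(f"formula = {neyman_var:.6f}, enumeration = {exact_var:.6f}")
```

The two numbers coincide up to floating-point error. The term `s_tau_sq` depends on the joint distribution of potential outcomes and so cannot be computed from observed data, which is the source of conservativeness in practical variance estimators.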
4. Inference, Testing, and Variance Estimation
Randomization inference procedures in finite populations distinguish Fisher-style sharp nulls (no effect for any unit) from Neyman-style weak nulls (zero average effect) (Cohen et al., 2020). Exact permutation tests are valid for the sharp null. For weak nulls, prepivoted statistics using Gaussian push-forward of CLT-based p-values achieve asymptotic conservativeness while retaining exactness under the sharp null:
| Null type | Test type | Valid for finite $N$ | Valid for weak null ($N \to \infty$) |
|---|---|---|---|
| Sharp | Permutation | Yes | Not in general |
| Weak | CLT/variance | Conservative | Yes |
| Weak | Prepivoted randomization | Yes | Yes |
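The permutation test for the sharp null can be sketched as follows. Under the sharp null of no effect for any unit, the observed outcomes would be unchanged under any other assignment, so the null distribution of a test statistic is obtained by recomputing it over all assignments the design could have produced. The data and the choice of the absolute difference in means as the statistic are illustrative assumptions; a complete randomization design with a fixed number of treated units is assumed.

```python
from itertools import combinations

import numpy as np

# Hypothetical observed data (illustrative values only).
y_obs = np.array([5.0, 6.0, 7.0, 1.0, 2.0, 3.0])
z_obs = np.array([1, 1, 1, 0, 0, 0])
N, n1 = len(y_obs), int(z_obs.sum())


def diff_in_means(z, y):
    """Treated-minus-control difference in observed means."""
    return y[z == 1].mean() - y[z == 0].mean()


t_obs = diff_in_means(z_obs, y_obs)

# Under the sharp null, y_obs is fixed and only the assignment varies.
null_stats = []
for treated in combinations(range(N), n1):
    z = np.zeros(N, dtype=int)
    z[list(treated)] = 1
    null_stats.append(diff_in_means(z, y_obs))
null_stats = np.array(null_stats)

# Two-sided exact p-value; the small tolerance guards against float ties.
p_value = np.mean(np.abs(null_stats) >= abs(t_obs) - 1e-12)
print(f"observed statistic = {t_obs}, exact p-value = {p_value}")
```

For larger populations full enumeration is infeasible, and a random sample of assignments from the same design is commonly used instead, which yields a Monte Carlo approximation to the exact p-value.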
Variance is typically estimated by plug-in Neyman formulas or by bounds derived from the assignment mechanism and observed outcomes. Additivity (constant unit-level treatment effects) yields unbiased variance estimators; more general settings require conservative adjustments or minimax bias analysis (Mukerjee et al., 2016). In dynamic and factorial settings with imperfect compliance, Horvitz-Thompson estimators and Bloom-style intervals yield reliable coverage at moderate compliance rates (Picchetti, 23 Jan 2026).
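A plug-in Neyman-style interval for a two-arm completely randomized experiment might look like the sketch below. It uses the sum of within-arm sample variances divided by arm sizes, which is unbiased under additivity and conservative in expectation otherwise, together with a normal quantile justified by the finite-population CLT. The function name `neyman_interval` and the example data are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def neyman_interval(y_obs, z_obs, level=0.95):
    """Difference in means, conservative variance estimate, and Wald interval."""
    y = np.asarray(y_obs, dtype=float)
    z = np.asarray(z_obs).astype(bool)
    y_t, y_c = y[z], y[~z]
    estimate = y_t.mean() - y_c.mean()
    # Within-arm sample variances (divisor n - 1); the unidentified
    # effect-heterogeneity term is dropped, hence the conservativeness.
    var_hat = y_t.var(ddof=1) / y_t.size + y_c.var(ddof=1) / y_c.size
    half_width = NormalDist().inv_cdf(0.5 + level / 2) * np.sqrt(var_hat)
    return estimate, var_hat, (estimate - half_width, estimate + half_width)


# Illustrative data (made-up values).
estimate, var_hat, interval = neyman_interval(
    [5.0, 6.0, 7.0, 1.0, 2.0, 3.0], [1, 1, 1, 0, 0, 0]
)
print(estimate, var_hat, interval)
```

With only three units per arm the normal approximation is rough; the example is meant to show the arithmetic rather than to recommend the interval at this sample size.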
5. Extensions: Ordinal Outcomes, Factorial, Panel, and Compliance Structures
For ordinal outcomes lacking a numeric scale, classical mean-based estimands are ill-defined. Finite-population conditional distributions of treatment outcomes given baseline levels provide interpretable, multidimensional summaries (Volfovsky et al., 2015). Bayesian estimation with a rank likelihood and ordered-probit latent variable models allows imputation of joint science tables and credible intervals for conditional medians/modes.
Factorial designs and dynamic panel experiments with imperfect compliance yield nonparametric estimands for causal effects along specific treatment sequences, with estimation based on adapted Horvitz-Thompson-type weights and difference operators. These methods outperform standard regression-based estimators (e.g., 2SLS or fixed-effects) in the presence of treatment path dependence, non-additivity, and serial correlation in assignment vectors (Bojinov et al., 2020, Picchetti, 23 Jan 2026).
6. Practical Implementation and Policy Interpretation
Finite-population moments (variance, skewness, kurtosis, covariance) of causal effects illuminate effect heterogeneity, tail risk, and policy trade-offs, especially in small experimental samples and real-world interventions (Kawakami et al., 8 May 2025). Scale-free summaries for ordinal outcomes (median shift, stochastic dominance) provide actionable metrics in education and social sciences (Volfovsky et al., 2015). Panel and factorial settings with compliance extend interpretation to sequence-specific and subgroup responses (Picchetti, 23 Jan 2026).
Finite-population inference is exact with respect to the design and does not rely on postulated distributions or superpopulation properties, yielding rigorous internal validity. However, identification of features such as monotonicity, while formally attainable, may be statistically underpowered or practically untestable at finite sample sizes (Chen et al., 31 Dec 2025).
7. Comparison with and Bridging to Superpopulation Inference
The finite-population approach treats all potential outcomes as fixed; randomness stems solely from the assignment mechanism. Superpopulation models assume units are IID draws from a population, which changes the variance decomposition and interpretation (Ding et al., 2017). Notably, the Neyman-type variance estimator based on observed sample variances is conservative for the finite-population variance and unbiased for the superpopulation variance, so the two views lead to the same variance estimator in practice. Completeness-type arguments bridge the finite and superpopulation paradigms and justify the practical use of design-based variance estimators in experiments with blocking, clustering, or factorial structure.
References
- Volfovsky, A. et al., "Causal inference for ordinal outcomes" (Volfovsky et al., 2015)
- Pashley, N. E., "Note on the Delta Method for Finite Population Inference..." (Pashley, 2019)
- Lu, J., "Covariate adjustment in randomization-based causal inference for $2^K$ factorial designs" (Lu, 2016)
- Bojinov, I., Rambachan, A., Shephard, N., "Panel Experiments and Dynamic Causal Effects..." (Bojinov et al., 2020)
- Kawakami, Y., Tian, J., "Moments of Causal Effects" (Kawakami et al., 8 May 2025)
- Li, X., Ding, P., "General forms of finite population central limit theorems..." (Li et al., 2016)
- Mukerjee, R. et al., "Causal Inference in Rebuilding and Extending..." (Mukerjee et al., 2016)
- Höltgen, B., Williamson, R. C., "Causal modelling without introducing counterfactuals..." (Höltgen et al., 2024)
- Chen et al., "Testing Monotonicity in a Finite Population" (Chen et al., 31 Dec 2025)
- Ding, P. et al., "Bridging Finite and Super Population Causal Inference" (Ding et al., 2017)
- Cohen, P. L., Fogarty, C. B., "Gaussian Prepivoting for Finite Population Causal Inference" (Cohen et al., 2020)
- Maiti et al., "Estimating Causal Effects in Gaussian Linear SCMs with Finite Data" (Maiti et al., 8 Jan 2026)
- Picchetti, P., "Finite Population Inference for Factorial Designs and Panel Experiments with Imperfect Compliance" (Picchetti, 23 Jan 2026)