Fuzzy Regression Discontinuity Design

Updated 16 November 2025

Fuzzy Regression Discontinuity Design is a quasi-experimental method that estimates local treatment effects when crossing an observed cutoff changes the probability of treatment rather than determining it.
It uses local polynomial regressions and ratio estimators, including regularized methods like the λ-class estimator, to address finite-sample issues and heavy-tailed sampling distributions.
Optimal bandwidth selection, bias-correction routines, and robust inferential techniques are key to mitigating finite-sample challenges and ensuring credible causal estimates.

Fuzzy Regression Discontinuity Design (FRDD) is a quasi-experimental framework for causal inference when compliance with an assignment rule at an observed threshold or cutoff is imperfect. Unlike in sharp designs, the probability of treatment receipt jumps at the cutoff but by less than one, so recovering credible treatment effects requires local instrumental variable techniques, ratio estimands, and careful inference. FRDD is now a standard approach in economics, education, and biomedical research; it identifies complier average treatment effects under explicitly stated assumptions. This article surveys the key principles, identification strategies, inferential methods, finite-sample challenges, and contemporary innovations within FRDD.

1. Formal Definition and Core Identification Assumptions

In a fuzzy RD design, the running (forcing) variable $X_i\in\mathbb{R}$ assigns individuals to treatment when $X_i\geq c$, but actual receipt $D_i\in\{0,1\}$ (or $D_i\in\mathbb{R}$ for continuous treatment) is probabilistic. The canonical estimand at the cutoff $c$ is the local average treatment effect (LATE):

$\tau = \frac{\lim_{x\downarrow c} E[Y_i|X_i=x] - \lim_{x\uparrow c} E[Y_i|X_i=x]} {\lim_{x\downarrow c} P(D_i=1|X_i=x) - \lim_{x\uparrow c} P(D_i=1|X_i=x)}$

Identification of $\tau$ requires continuity in $x$ at $c$ of the expected potential outcome functions $E[Y_i(t)|X_i=x]$ for each $t$, positivity of treatment assignment probability in a neighborhood of $c$, and the instrumental variable property that $Y \perp Z\,|\,X,T,C$, where $Z=1\{X\geq c\}$ is the assignment indicator (Constantinou et al., 2016). Monotonicity (no defiers) is often invoked to guarantee identification of the complier LATE.
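As a purely illustrative worked example of the ratio estimand, suppose (hypothetical numbers, not taken from any cited study) that the conditional mean of the outcome jumps from 2.6 to 3.2 at the cutoff while the probability of treatment jumps from 0.2 to 0.5:

```latex
\tau
= \frac{\lim_{x\downarrow c} E[Y_i|X_i=x] - \lim_{x\uparrow c} E[Y_i|X_i=x]}
       {\lim_{x\downarrow c} P(D_i=1|X_i=x) - \lim_{x\uparrow c} P(D_i=1|X_i=x)}
= \frac{3.2 - 2.6}{0.5 - 0.2}
= \frac{0.6}{0.3}
= 2 .
```

The outcome jump of 0.6 is scaled up because only 30 percent of units change treatment status at the cutoff; under monotonicity this rescaled quantity is the average effect for those compliers.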

2. Estimation Procedures and Ratio Estimators

Classically, local polynomial (typically linear, $p=1$) regressions are fit separately on either side of the cutoff for both the outcome and the treatment indicator:

Estimate one-sided limits $\hat\mu_{+}(c),\hat\mu_{-}(c)$ for $Y$ and $\hat\pi_{+}(c),\hat\pi_{-}(c)$ for $D$.
The standard fuzzy RD estimator (FRD), $\hat{\tau} = (\hat\mu_{+}(c)-\hat\mu_{-}(c))/(\hat\pi_{+}(c)-\hat\pi_{-}(c))$, is implemented as a ratio of jumps.
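The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the triangular kernel, the fixed bandwidth `h`, the function names, and the simulated data (true effect 2, take-up jumping from 0.2 to 0.8) are all assumptions made for the example.

```python
import numpy as np

def local_linear_limit(x, v, c, h, side):
    """One-sided local linear estimate of E[v | x = c] (triangular kernel)."""
    if side == "+":
        mask = (x >= c) & (x < c + h)
    else:
        mask = (x < c) & (x > c - h)
    xs, vs = x[mask] - c, v[mask]
    w = 1.0 - np.abs(xs) / h                      # triangular kernel weights
    X = np.column_stack([np.ones_like(xs), xs])   # intercept and slope
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], vs * sw, rcond=None)
    return beta[0]                                # intercept = limit at c

def fuzzy_rd(x, y, d, c, h):
    """Ratio-of-jumps estimator: jump in Y divided by jump in D."""
    jump_y = local_linear_limit(x, y, c, h, "+") - local_linear_limit(x, y, c, h, "-")
    jump_d = local_linear_limit(x, d, c, h, "+") - local_linear_limit(x, d, c, h, "-")
    return jump_y / jump_d, jump_y, jump_d

# Simulated data (hypothetical design, true effect = 2).
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(-1.0, 1.0, size=n)
take_up = np.where(x >= 0.0, 0.8, 0.2)            # P(D = 1 | X) jumps by 0.6
d = (rng.uniform(size=n) < take_up).astype(float)
y = 1.0 + 0.5 * x + 2.0 * d + rng.normal(scale=0.5, size=n)

tau_hat, jump_y, jump_d = fuzzy_rd(x, y, d, c=0.0, h=0.3)
print(f"jump in Y: {jump_y:.3f}, jump in D: {jump_d:.3f}, tau_hat: {tau_hat:.3f}")
```

In practice the bandwidth would be chosen by a data-driven rule rather than fixed by hand.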

However, Lane (5 Nov 2025) establishes that the standard FRD estimator does not possess finite moments of any order in finite samples, because the denominator admits arbitrarily small realizations, yielding a distribution described as "fatter-than-Cauchy." A one-parameter regularized estimator, termed the "$\lambda$-class" (Editor's term), is defined:

$\hat{\tau}_{\lambda} = \frac{\lambda\,\widetilde\Gamma\,\hat{\tau}_Y\,\hat{\tau}_D + (1-\lambda)\,\tilde D'M_{\tilde V}\,\tilde Y} {\lambda\,\widetilde\Gamma\,(\hat{\tau}_D)^2 + (1-\lambda)\,\tilde D'M_{\tilde V}\,\tilde D}$

where $0\leq\lambda<1$ ensures finite moments and improved distributional stability for all $n$ and polynomial order.
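The role of the denominator can be seen from a short heuristic calculation. This is a simplified sketch, not Lane's proof: it assumes that the estimated treatment jump $\hat\tau_D$ has a density $f$ bounded below by some $m>0$ on an interval $(-\varepsilon,\varepsilon)$ around zero.

```latex
E\left[\left|\frac{1}{\hat\tau_D}\right|\right]
= \int_{-\infty}^{\infty} \frac{f(t)}{|t|}\,dt
\;\geq\; m \int_{-\varepsilon}^{\varepsilon} \frac{dt}{|t|}
= 2m \int_{0}^{\varepsilon} \frac{dt}{t}
= \infty .
```

Under this assumption even the first absolute moment of the reciprocal diverges, and a ratio with $\hat\tau_D$ in the denominator inherits the problem whenever the numerator does not vanish along with it. The $(1-\lambda)$ term in the denominator of $\hat\tau_\lambda$ can be read as keeping that denominator away from zero.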

3. Bandwidth Selection and Finite-Sample Behavior

Bandwidth selection critically influences finite-sample bias and variance in local polynomial FRDD. The mean-squared-error criterion for the fuzzy RD estimator must account for second-order bias terms. Arai and Ichimura (Arai et al., 2015) derive asymptotic formulas for bias and variance; their MMSE plug-in procedure selects two optimal bandwidths ($h_+, h_-$) simultaneously, one for each side, by minimizing

$\text{MMSE}(h_+,h_-) = \frac{1}{\tau_D^2}[\phi_+h_+^2 - \phi_-h_-^2]^2 + \frac{v}{n f \tau_D^2}[\omega_+/h_+ + \omega_-/h_-]$

where $\phi_j$ and $\omega_j$ ($j\in\{+,-\}$) are functionals of derivatives and conditional variances. Their method is reported to outperform the single-bandwidth approach of Imbens–Kalyanaraman, especially in bias and RMSE for the FRDD estimand.
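To make the two-bandwidth objective concrete, the sketch below evaluates the displayed MMSE criterion on a grid and picks the minimizing pair. It is only an illustration: every constant is a hypothetical value treated as known, whereas the plug-in procedure estimates them from data. The signs of `phi_plus` and `phi_minus` are deliberately chosen to be opposite, so the squared-bias term cannot be driven to zero by balancing the two bandwidths; when the signs agree, the displayed first-order criterion alone does not pin down a finite minimizer.

```python
import numpy as np

def mmse(h_plus, h_minus, phi_plus, phi_minus, omega_plus, omega_minus,
         tau_d, v, n, f):
    """Objective from the text: squared first-order bias plus variance."""
    bias_sq = (phi_plus * h_plus**2 - phi_minus * h_minus**2) ** 2 / tau_d**2
    variance = v / (n * f * tau_d**2) * (omega_plus / h_plus + omega_minus / h_minus)
    return bias_sq + variance

# Hypothetical constants (assumptions for illustration only).
consts = dict(phi_plus=1.0, phi_minus=-1.0, omega_plus=1.0, omega_minus=1.0,
              tau_d=0.5, v=1.0, n=1000, f=1.0)

# Brute-force search over a grid of (h_plus, h_minus) pairs.
grid = np.linspace(0.02, 1.0, 200)
H_plus, H_minus = np.meshgrid(grid, grid, indexing="ij")
values = mmse(H_plus, H_minus, **consts)
i, j = np.unravel_index(np.argmin(values), values.shape)
h_plus_opt, h_minus_opt = grid[i], grid[j]

print(f"h_plus = {h_plus_opt:.3f}, h_minus = {h_minus_opt:.3f}")
```

With these symmetric constants the first-order conditions reduce to $32h^5 = 0.004$, so both bandwidths should land near $h \approx 0.166$; asymmetric curvature or variance constants would pull the two bandwidths apart.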

4. Inference, Bias Correction, and Robustness

Standard confidence intervals for FRD rely on delta-method asymptotics or bias-correction routines (e.g., Calonico–Cattaneo–Titiunik 2014), but coverage suffers when the jump in treatment probability is small (a weak instrument) or the running variable is discrete. Noack and Rothe (Noack et al., 2019) introduce bias-aware Anderson–Rubin–style confidence sets $\mathcal{C}_{ar}^\alpha$, formed by inverting moment tests on bias-controlled local linear jump estimators. Their approach guarantees uniform coverage under both continuous and discrete $X$, holds for small discontinuities, and is robust to "donut" designs and weak identification.
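The test-inversion idea behind Anderson–Rubin–style sets can be sketched as follows: for each candidate value $\tau_0$, test whether the auxiliary outcome $Y - \tau_0 D$ has a zero jump at the cutoff, and collect the values that are not rejected. The code below is a deliberately simplified sketch, not the Noack and Rothe procedure: it uses a plain standard-normal critical value and omits their bias-aware adjustment, and the kernel, bandwidth, function names, and simulated data are assumptions for illustration.

```python
import numpy as np

def side_fit(x, v, c, h, right):
    """Local linear intercept at c and its heteroskedasticity-robust variance."""
    mask = (x >= c) & (x < c + h) if right else (x < c) & (x > c - h)
    xs, vs = x[mask] - c, v[mask]
    w = 1.0 - np.abs(xs) / h                       # triangular kernel weights
    X = np.column_stack([np.ones_like(xs), xs])
    XtWX_inv = np.linalg.inv(X.T @ (X * w[:, None]))
    beta = XtWX_inv @ (X.T @ (w * vs))
    resid = vs - X @ beta
    meat = X.T @ (X * ((w * resid) ** 2)[:, None]) # sandwich "meat"
    cov = XtWX_inv @ meat @ XtWX_inv
    return beta[0], cov[0, 0]

def ar_statistic(x, y, d, c, h, tau0):
    """t-type statistic for a zero jump in Y - tau0 * D at the cutoff."""
    v = y - tau0 * d
    m_r, var_r = side_fit(x, v, c, h, True)
    m_l, var_l = side_fit(x, v, c, h, False)
    return (m_r - m_l) / np.sqrt(var_r + var_l)

def ar_confidence_set(x, y, d, c, h, grid, crit=1.96):
    """Grid values of tau0 not rejected (no bias-aware adjustment here)."""
    return np.array([t for t in grid if abs(ar_statistic(x, y, d, c, h, t)) <= crit])

# Simulated data (hypothetical design, true effect = 2).
rng = np.random.default_rng(1)
n = 20_000
x = rng.uniform(-1.0, 1.0, size=n)
d = (rng.uniform(size=n) < np.where(x >= 0.0, 0.8, 0.2)).astype(float)
y = 1.0 + 0.5 * x + 2.0 * d + rng.normal(scale=0.5, size=n)

# Ratio-of-jumps point estimate, added to the grid so the set contains it.
jump_y = side_fit(x, y, 0.0, 0.3, True)[0] - side_fit(x, y, 0.0, 0.3, False)[0]
jump_d = side_fit(x, d, 0.0, 0.3, True)[0] - side_fit(x, d, 0.0, 0.3, False)[0]
tau_hat = jump_y / jump_d
grid = np.sort(np.append(np.linspace(0.0, 4.0, 81), tau_hat))
conf_set = ar_confidence_set(x, y, d, c=0.0, h=0.3, grid=grid)
print(f"tau_hat = {tau_hat:.3f}, accepted range: [{conf_set.min():.2f}, {conf_set.max():.2f}]")
```

Because the statistic never divides by the estimated treatment jump, the resulting set stays well defined when that jump is close to zero, in which case it may be very wide or unbounded.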

Uniform inference for quantile treatment effects (QTE) is addressed by Chiang et al. (2017), who show that robust local quadratic estimation and multiplier bootstrap bands provide correct uniform coverage for QTEs $\tau_Q(\theta)$ across quantile indices $\theta\in(0,1)$, where

$\tau_Q(\theta) = Q_{Y^1|C}(\theta) - Q_{Y^0|C}(\theta)$

These methods accommodate bandwidths as large as MSE-optimal and apply to both mean and quantile FRDD.

5. Generalizations: Multiple Cutoffs, Continuous Treatment, and Nonparametric Models

Modern FRDDs accommodate complex eligibility schedules or continuous treatments. In the presence of multiple thresholds, Bertanha (2021) proves that nonparametric identification of heterogeneous treatment effects in fuzzy multi-cutoff designs is impossible because the observable jumps at the cutoffs do not supply enough restrictions. Under parametric heterogeneity, a GMM/GLS estimator aggregates moments across cutoffs:

$\hat{\theta} = (W'\Sigma^{-1}W)^{-1}\,W'\Sigma^{-1}\,B$

yielding asymptotically normal and efficient ATE estimates for any counterfactual policy.
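The GLS formula above is a standard weighted least-squares aggregation and can be sketched directly in NumPy. The interpretation used here is an assumption for illustration, since the text does not define the symbols: `B` is taken to be a vector of cutoff-specific estimates, `W` a design matrix mapping the parameter vector to those estimates, and `Sigma` their covariance matrix. The numbers are hypothetical.

```python
import numpy as np

def gls_aggregate(W, Sigma, B):
    """theta_hat = (W' Sigma^{-1} W)^{-1} W' Sigma^{-1} B, with its covariance."""
    Sigma_inv_W = np.linalg.solve(Sigma, W)       # avoids forming Sigma^{-1}
    Sigma_inv_B = np.linalg.solve(Sigma, B)
    bread = W.T @ Sigma_inv_W
    theta = np.linalg.solve(bread, W.T @ Sigma_inv_B)
    return theta, np.linalg.inv(bread)

# Hypothetical example: four cutoffs, effects assumed linear in the cutoff value.
cutoffs = np.array([1.0, 2.0, 3.0, 4.0])
W = np.column_stack([np.ones_like(cutoffs), cutoffs])
theta_true = np.array([1.0, 0.5])
B = W @ theta_true                                # noiseless cutoff-specific effects
Sigma = np.diag([0.04, 0.09, 0.04, 0.16])         # assumed sampling variances

theta_hat, theta_cov = gls_aggregate(W, Sigma, B)
print(theta_hat)
```

With noiseless inputs the estimator recovers `theta_true` exactly up to floating-point error; with noisy cutoff-specific estimates, less precisely estimated cutoffs receive less weight through `Sigma`.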

For continuous $D$, Xie (2022) demonstrates identification of nonlinear and nonseparable structural functions at the cutoff under monotonicity and rank similarity conditions. Semiparametric estimation (three-step procedure: quantile map estimation, CDF estimation, and minimum-residual criterion) achieves the optimal $n^{-2/5}$ rate.

Bayesian nonparametric models (Karabatsos & Walker (Karabatsos et al., 2013)), as well as hierarchical Gaussian process estimators (Wu, 2021), now provide fully probabilistic inference for FRDD. The former uses an infinite-mixture model allowing uncertainty quantification for means, quantiles, and densities, while the latter leverages deep Bayesian machine-learning techniques to adapt to nonstationarity and provide succinct credible intervals and smooth derivatives.

6. Alternative Identification Strategies and Local Randomization

Alternative frameworks exploit local randomization near the cutoff (Branson & Mealli (Branson et al., 2018)), under which the assignment mechanism within a narrow window around the cutoff is treated as randomized. Causal effects are then estimated via experimental-compliance estimators:

$\hat{\tau}_h^c = \frac{\widehat{ITT}_Y^{(h)}}{\widehat{ITT}_W^{(h)}}$

Block or complete randomization methodologies provide robustness to model misspecification by design and permit fine-grained sensitivity analysis by varying assignment models and bandwidth windows.
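Under complete randomization within the window, the experimental-compliance estimator reduces to a ratio of two differences in means. The sketch below assumes that simplest case; the function name and the toy data are hypothetical, and `w` denotes treatment receipt, following the $ITT_W$ notation.

```python
import numpy as np

def local_randomization_cace(x, y, w, c, h):
    """Ratio of intention-to-treat effects among units with |x - c| <= h."""
    in_window = np.abs(x - c) <= h
    z = x[in_window] >= c                          # assignment indicator
    y_win, w_win = y[in_window], w[in_window]
    itt_y = y_win[z].mean() - y_win[~z].mean()     # effect of assignment on Y
    itt_w = w_win[z].mean() - w_win[~z].mean()     # effect of assignment on take-up
    return itt_y / itt_w, itt_y, itt_w

# Toy data (hypothetical): the first and last units fall outside the window.
x = np.array([-0.30, -0.10, -0.05, 0.02, 0.08, 0.10, 0.40])
w = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0])
y = np.array([0.9, 1.0, 1.2, 3.0, 3.2, 1.1, 3.5])

cace, itt_y, itt_w = local_randomization_cace(x, y, w, c=0.0, h=0.15)
print(f"ITT_Y = {itt_y:.3f}, ITT_W = {itt_w:.3f}, CACE = {cace:.3f}")
```

Rerunning the function over several values of `h` gives a simple form of the window sensitivity analysis described above.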

With multiple treatments jumping at the cutoff, "fuzzy difference-in-discontinuities" estimators (Galindo-Silva et al., 2018) isolate the effect of interest by differencing local treatment effects from overlapping policy regimes. This approach depends critically on an equal-jump assumption for compliance rates across treatments and is subject to nontrivial bias if this fails.

7. Empirical Performance and Practical Recommendations

Extensive Monte Carlo simulations (Lane, 5 Nov 2025, Arai et al., 2015, Noack et al., 2019, Wu, 2021) consistently demonstrate that:

The standard FRD estimator is not recommended for small samples or weak discontinuities because its moments do not exist and its sampling distribution is heavy-tailed.
The $\lambda$-class estimator with $\lambda=1-4/(n_h-2(p+1))$ markedly reduces median bias, median absolute deviation, and RMSE; its CIs are shorter and more stable than standard bias-corrected or AR intervals.
Hierarchical GP and Bayesian nonparametric models yield competitive coverage and bias control in finite samples, especially for nonlinear or discontinuous running variables.
Optimal two-sided bandwidth selection (MMSE-f method) achieves improved finite-sample performance for both bias and RMSE.
Sensitivity analyses exploiting local randomization or block randomization near the cutoff enable credible causal inference across diverse real-world settings (college dropout, class size, health outcomes).
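The rule $\lambda=1-4/(n_h-2(p+1))$ quoted in the list above is simple arithmetic and can be checked with a small helper. Two points are assumptions of this sketch rather than statements from the source: $n_h$ is read as the number of observations inside the bandwidth, and the guard enforcing $0\leq\lambda<1$ is added only for illustration.

```python
def lambda_default(n_h, p=1):
    """lambda = 1 - 4 / (n_h - 2 (p + 1)), guarded so that 0 <= lambda < 1."""
    dof = n_h - 2 * (p + 1)
    if dof < 4:
        raise ValueError("need n_h - 2 * (p + 1) >= 4 for lambda to lie in [0, 1)")
    return 1.0 - 4.0 / dof

# Example: 200 observations in the window, local linear fits (p = 1).
lam = lambda_default(200, p=1)
print(round(lam, 4))   # 1 - 4/196
```

As $n_h$ grows, $\lambda$ approaches one, so the regularization becomes lighter in larger effective samples.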

A plausible implication is that computational simplicity and robust finite-sample stability favor regularized and bias-aware estimation/inference routines over the legacy ratio-of-jumps approach, particularly in modern high-dimensional or weak compliance FRDD applications.