
ForestRiesz: Robust Semiparametric Inference

Updated 9 February 2026
  • ForestRiesz is a nonparametric method that uses the Riesz representation theorem and random forest machinery to construct debiased estimators of linear functionals in causal inference.
  • It constructs locally linear estimators by computing node-wise moments and employs cross-fitting to achieve double robustness and asymptotic normality.
  • The method avoids explicit inverse-propensity weighting, ensuring stability and efficiency in high-dimensional and biased sample selection settings.

ForestRiesz refers to a nonparametric method for automatic, debiased machine learning of linear functionals—particularly in causal inference settings involving high-dimensional or nonparametric regression functions, and in the presence of non-random treatment assignment and/or outcome selection. The ForestRiesz framework leverages the Riesz representation theorem and random forest machinery to construct a locally linear estimator of the Riesz representer, enabling efficient, robust, and stable semiparametric inference with automatic debiasing and double robustness properties (Chernozhukov et al., 2021, Bjelac et al., 13 Jan 2026).

1. Riesz Representation and the Debiasing Problem

Let $W = (Y, Z)$ denote the data, with $g_0(Z) = E[Y \mid Z]$ the regression function of interest. For a continuous linear functional $g \mapsto \psi(g) = E[m(W; g)]$, there exists a unique Riesz representer $\alpha_0(Z)$ such that

$$\psi(g) = E[m(W; g)] = E[\alpha_0(Z) g(Z)]$$

for all square-integrable $g$. The target estimand is $\theta_0 = \psi(g_0) = E[\alpha_0(Z) g_0(Z)]$. In high-dimensional and nonparametric regimes, the naive plug-in estimator is subject to regularization-induced bias that can be of order $n^{-1/2}$ or larger. The correction term

$$\psi(\widehat{g}) + E_n[\alpha_0(Z) \{Y - \widehat{g}(Z)\}]$$

(the "one-step" or "double-robust" correction) cancels the leading bias and achieves asymptotically linear estimation if $\alpha_0$ is accurately estimated.

The Riesz representer $\alpha_0$ solves the variational problem

$$\alpha_0 = \arg\min_\alpha E[\alpha(Z)^2 - 2 m(W; \alpha)],$$

where $E[m(W; \alpha)] = E[\alpha_0(Z) \alpha(Z)]$. This variational characterization is central to the automatic machine learning of $\alpha_0$ (Chernozhukov et al., 2021).
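As a concrete illustration, the first-order condition of this variational problem, $E[\phi\phi^\top]\beta = E[m(W;\phi)]$, can be solved directly on simulated data. The sketch below uses a deliberately tiny feature map $\phi(t) = (t,\, 1-t)$ chosen for illustration (not taken from the papers): for the ATE functional $m(W; g) = g(1, X) - g(0, X)$, minimizing the empirical Riesz loss recovers the familiar inverse-propensity weights automatically, without ever estimating a propensity score and inverting it by hand.

```python
import random

random.seed(0)
n = 10_000
T = [1 if random.random() < 0.3 else 0 for _ in range(n)]

# Feature map phi(t) = (t, 1 - t); restrict alpha to alpha(t) = phi(t) @ beta.
# Minimizing the empirical Riesz loss  En[alpha^2 - 2 m(W; alpha)]  gives the
# first-order condition  En[phi phi^T] beta = En[m(W; phi)], and for the ATE
# functional m(W; g) = g(1, X) - g(0, X) we have m(W; phi) = phi(1) - phi(0) = (1, -1).
p_hat = sum(T) / n
# En[phi phi^T] = diag(p_hat, 1 - p_hat) for this phi, so the solve is trivial.
beta = (1.0 / p_hat, -1.0 / (1.0 - p_hat))

# The fitted representer is exactly the inverse-propensity weight
# alpha(t) = t / p_hat - (1 - t) / (1 - p_hat), recovered automatically.
alpha = [beta[0] * t + beta[1] * (1 - t) for t in T]
```

The fitted weights also satisfy $E_n[\hat\alpha] = 0$ by construction, as ATE weights must.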

2. ForestRiesz Estimator: Construction and Algorithmic Principles

ForestRiesz models $\alpha(Z) \approx \phi(T, X)^{\top} \beta(X)$, where $\phi(T, X)$ is a chosen feature map and $\beta(X)$ is a locally linear coefficient function estimated nonparametrically.

Random Forest Implementation:

  • Node-wise local moments: For each node $N$ in the covariate space, compute:

$$J(N) = \frac{1}{|N|} \sum_{i \in N} \phi(Z_i) \phi(Z_i)^\top, \qquad M(N) = \frac{1}{|N|} \sum_{i \in N} m(W_i; \phi).$$

The local estimator is $\hat{\beta}(N) = J(N)^{-1} M(N)$.

  • Splitting criterion: Candidate splits are evaluated via local Riesz loss reduction (or equivalently, maximization of a negative Riesz loss criterion), ensuring balance and stability.
  • Forest weights and prediction: Forest similarity weights $\omega_i(x)$ are computed by averaging indicator functions over leaves that contain $x$ in each tree,

$$\omega_i(x) = \frac{1}{T} \sum_{t=1}^T \frac{\mathbf{1}\{i \in \ell_t(x)\}}{|\ell_t(x)|}.$$

The estimator $\hat{\alpha}(Z)$ is given by locally weighted predictions of $\phi(Z)^{\top} \hat{\beta}(X)$.

  • Debiased functional estimation: The final estimator is

$$\widehat{\theta} = E_n[m(W; \widehat{g}) + \hat{\alpha}(Z) \{Y - \widehat{g}(Z)\}].$$

Cross-fitting is standard: the sample is partitioned, with ForestRiesz fitted on folds excluding the target data points to avoid overfitting and induce orthogonality (Chernozhukov et al., 2021, Bjelac et al., 13 Jan 2026).
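The forest similarity weights admit a very short implementation. The sketch below assumes a toy encoding of leaf memberships (`leaf_ids` and `x_leaf` are illustrative names, not a library API): each training point receives weight $1/|\ell_t(x)|$ in every tree whose leaf at $x$ contains it, averaged over trees.

```python
def forest_weights(leaf_ids, x_leaf):
    """Forest similarity weights omega_i(x): the average, over trees, of
    1{i in leaf_t(x)} / |leaf_t(x)|.

    leaf_ids[t][i] -- leaf of tree t that training point i falls into
    x_leaf[t]      -- leaf of tree t that the query point x falls into
    (a toy encoding chosen for illustration, not a library API)
    """
    n_trees, n = len(leaf_ids), len(leaf_ids[0])
    w = [0.0] * n
    for t in range(n_trees):
        size = sum(1 for leaf in leaf_ids[t] if leaf == x_leaf[t])
        for i in range(n):
            if leaf_ids[t][i] == x_leaf[t]:
                w[i] += 1.0 / (n_trees * size)
    return w

# Two tiny trees over five training points; the query point x lands in
# leaf 0 of tree 0 and leaf 1 of tree 1.
leaf_ids = [[0, 0, 1, 1, 1],
            [1, 0, 1, 0, 1]]
w = forest_weights(leaf_ids, x_leaf=[0, 1])
```

Because each tree's weights average the uniform distribution over one leaf, the $\omega_i(x)$ always sum to one, so the forest prediction is a proper local average.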

3. Theoretical Guarantees and Statistical Properties

ForestRiesz achieves several desirable asymptotic properties under standard conditions:

  • $\sqrt{n}$-consistency and asymptotic normality: Provided $\hat{g}$ and $\hat{\alpha}$ converge to their population targets at rates $o_p(n^{-1/4})$, the ForestRiesz estimator satisfies

$$\sqrt{n} (\widehat{\theta} - \theta_0) \to_d N(0, \mathrm{Var}[\psi_0(W)]),$$

where the influence function is $\psi_0(W) := m(W; g_0) + \alpha_0(Z)[Y - g_0(Z)] - \theta_0$.

  • Double robustness and Neyman orthogonality: The estimation bias satisfies

$$E[S(W; g, \alpha)] = -E[(\alpha(Z) - \alpha_0(Z)) (g(Z) - g_0(Z))],$$

so that consistency obtains if either $\alpha$ or $g$ is consistently estimated; the influence function is orthogonal to estimation errors in $g$ and $\alpha$ (Chernozhukov et al., 2021).

  • No reliance on explicit inverse-propensity weights: ForestRiesz circumvents instability from small probability weights by direct local moment matching within the forest, enhancing robustness compared to standard double machine learning approaches that rely on inverse propensity estimation (Bjelac et al., 13 Jan 2026).
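Double robustness can be checked numerically. In the sketch below, simple closed-form plug-ins stand in for the ForestRiesz and regression fits: the outcome model `ghat` is deliberately misspecified (identically zero), yet with a correct representer the debiased estimate still recovers the true effect, and a standard error falls out of the empirical influence-function scores. All names and numbers are illustrative.

```python
import math
import random

random.seed(1)
n, tau, p = 20_000, 2.0, 0.5
data = []
for _ in range(n):
    x = random.random()
    t = 1 if random.random() < p else 0
    y = tau * t + x + random.gauss(0.0, 1.0)   # true ATE is tau = 2
    data.append((y, t, x))

ghat = lambda t, x: 0.0                      # deliberately wrong outcome model
ahat = lambda t: t / p - (1 - t) / (1 - p)   # correct Riesz representer (known p)

# Debiased estimate: En[ m(W; ghat) + ahat(T) * (Y - ghat(T, X)) ],
# with m(W; g) = g(1, X) - g(0, X) for the ATE.
scores = [ghat(1, x) - ghat(0, x) + ahat(t) * (y - ghat(t, x)) for y, t, x in data]
theta = sum(scores) / n
# Influence-function standard error: sd of the scores over sqrt(n).
se = math.sqrt(sum((s - theta) ** 2 for s in scores) / n) / math.sqrt(n)
```

The plug-in term alone is zero here (badly biased); the correction term restores consistency, mirroring the bias-product formula above.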

The following table summarizes key asymptotic results:

| Property | Description | Condition |
|---|---|---|
| $\sqrt{n}$-consistency | $\widehat{\theta}$ attains semiparametric efficiency | $\|\hat{g} - g_0\| \cdot \|\hat{\alpha} - \alpha_0\| = o_p(n^{-1/2})$ |
| Local CLT for $\hat{\beta}(x)$ | $\sqrt{n}(\hat{\beta}(x) - \beta_0(x))$ asymptotically normal | Local identification, positive definite $J(x)$ |
| Riesz consistency | $\|\hat{\alpha} - \alpha_0\|_2^2 = O_p(d\, n^{-1})$ | ForestRiesz regularity, moment bounds |

4. Application in Sample Selection Models and Bias Decomposition

ForestRiesz extends naturally to causal inference under sample selection, where both treatment assignment and outcome observability can be non-random (Bjelac et al., 13 Jan 2026). For sample selection average treatment effect estimation, the Riesz representer admits an explicit expression involving treatment and selection propensities, and the bias from omitting latent confounders can be decomposed as

$$\theta_0 - \theta_s = E[(g_0 - g_s)(\alpha_0 - \alpha_s)],$$

with an upper bound $|\theta_0 - \theta_s|^2 \leq \widetilde{S}^2 C_Y^2 C_S^2$, where:

  • $\widetilde{S}^2$ is a data-identified variance factor;
  • $C_Y^2$ measures outcome confounding strength (partial $R^2$ with respect to the latent confounder $A$);
  • $C_S^2$ measures selection confounding strength (partial $R^2$ in the selection index).

ForestRiesz facilitates stable estimation in these settings, where direct propensity-score based methods can be numerically unstable. A quasi-Gaussian latent-index model provides a calibration method for sensitivity analysis, mapping the strength of unobserved confounding to the potential for treatment effect estimate overturning.
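The bias bound reduces to simple arithmetic once its three factors are supplied. In this sketch the variance factor `S2` is a made-up number for illustration (the papers' data-identified value is not reproduced here); the code computes the bound and a robustness value, i.e., the equal confounding strength $C_Y^2 = C_S^2 = r^2$ at which the bound first matches the estimate's magnitude.

```python
import math

# Hypothetical inputs for illustration: S2 (the data-identified variance
# factor S~^2) is made up; theta_s echoes the wage-gap estimate quoted below.
S2, theta_s = 0.4, -0.128

def bias_bound(cy2, cs2):
    """Upper bound |theta_0 - theta_s| <= sqrt(S~^2 * C_Y^2 * C_S^2)."""
    return math.sqrt(S2 * cy2 * cs2)

# Robustness value under equal confounding strength C_Y^2 = C_S^2 = r2:
# overturning the estimate needs sqrt(S2) * r2 >= |theta_s|.
r2_star = abs(theta_s) / math.sqrt(S2)
```

At `r2_star` the bound exactly equals $|\theta_s|$; any weaker confounding cannot flip the sign of the estimate under this model.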

5. Simulation and Empirical Evidence

In simulation studies, ForestRiesz is benchmarked against conventional double machine learning (SSM) and naive approaches (IRM) (Bjelac et al., 13 Jan 2026). In a standard MAR selection design for ATE:

  • Both SSM and ForestRiesz recover the truth as nn increases.
  • ForestRiesz demonstrates superior stability and faster bias decay with default tunings; SSM can require careful hyperparameter adjustment.

Empirically, in U.S. gender wage gap analysis using American Community Survey data (2016), ForestRiesz yields larger estimated wage gaps (in absolute value) compared to unadjusted and propensity-score-based approaches. For example:

  • For college graduates, ForestRiesz estimates the gap at $-0.128$ (SE 0.002), versus $-0.0989$ for IRM ($n = 297{,}178$).
  • The approach detects underestimation of the wage gap by models that ignore sample selection.

Sensitivity analysis delivers explicit robustness values: overturning the wage gap would require unobserved confounding with partial $R^2 > 6.3\%$, implying robustness to substantial levels of hidden selection bias.

6. Methodological Implications and Extensions

ForestRiesz provides a unified and robust estimator for general linear functionals—including but not limited to average treatment effects and average marginal effects—in the presence of complex sampling, high-dimensional covariates, and selective outcome observability. The method exploits the structure of the Riesz representer to automate debiasing and sidestep tuning-sensitive propensity or density estimation. This suggests ForestRiesz is particularly well-suited for finite samples, ill-posed inverse problems, and any context where orthogonality and stability are essential (Chernozhukov et al., 2021, Bjelac et al., 13 Jan 2026).

The integration of locally linear random forests for Riesz learning, coupled with doubly-robust cross-fitting and explicit influence-function-based sensitivity analysis, makes ForestRiesz a comprehensive tool for practitioners handling bias, regularization, and selection in modern causal inference frameworks.
