
Repeatedly Nested Expectations

Updated 10 February 2026
  • Repeatedly Nested Expectations (RNEs) are mathematical constructs involving recursive compositions of non-linear expectation operators, crucial for modeling multi-level uncertainty.
  • They appear in diverse applications such as stochastic control, finance, probabilistic programming, and network games where layered decision processes and inner estimations are essential.
  • Advanced estimation methods like Nested Monte Carlo, multilevel techniques, and quantum algorithms address the computational challenges by mitigating bias and improving convergence rates.

Repeatedly nested expectations (RNEs) are mathematical objects that arise when one must compute compositions of expectations, where each stage involves a potentially nonlinear function of an inner expectation—often recursively, and with arbitrary depth. These structures are central in diverse areas, including probabilistic programming, optimal stopping, risk-sensitive control, multi-agent network games, and the simulation of stochastic differential equations. RNEs present unique computational and theoretical challenges due to the growth in complexity with each level of nesting and the proliferation of bias and variance with standard estimation approaches.

1. Formal Definition and Mathematical Structure

An $L$-level repeatedly nested expectation has the canonical form
$$I = \mathbb{E}_{X_0 \sim Q_0}\!\left[ f_1\!\left( \mathbb{E}_{X_1 \sim Q_1(\cdot\,|\,X_0)}\!\left[ f_2\left( \cdots f_L(X_L) \cdots \right) \right] \right) \right],$$
where each $Q_\ell$ is a (possibly conditional) probability measure on the respective space, and the $f_\ell$ are link functions, possibly nonlinear and usually assumed to possess some regularity (Lipschitz continuity, or higher smoothness).

Alternatively, a recursive definition applies:
$$\gamma_L(x_{L-1}) = \mathbb{E}_{X_L \sim Q_L(\cdot\,|\,x_{L-1})}\!\left[ f_L(X_L) \right], \qquad \gamma_\ell(x_{\ell-1}) = f_\ell\!\left( \mathbb{E}_{X_\ell \sim Q_\ell(\cdot\,|\,x_{\ell-1})}\!\left[ \gamma_{\ell+1}(X_\ell) \right] \right), \quad \ell = L-1, \dots, 1,$$
with the full expectation $I = \mathbb{E}_{X_0 \sim Q_0}[\gamma_1(X_0)]$.

The structure of RNEs is such that each level's integrand is itself a function of the output of a deeper expectation; this recursive dependence is the origin of significant computational hardness, most notably the “curse of depth” for naive estimators (Rainforth et al., 2016).
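As a concrete illustration of the recursion, the sketch below estimates an RNE of arbitrary depth by naive nested Monte Carlo, applying each link function outside the inner Monte Carlo average. The distributions and links in the toy example are illustrative choices, not taken from any of the cited papers.

```python
import random

def nmc_rne(q_samplers, links, n_per_level, x_prev=None, level=0):
    """Naive nested Monte Carlo for an L-level RNE.

    q_samplers[l] draws X_l ~ Q_l(.|x_{l-1}); links[l-1] is f_l.
    Level 0 averages gamma_1 over X_0 ~ Q_0; each deeper level applies
    its link function to the inner Monte Carlo average.
    """
    L = len(links)
    n = n_per_level[level]
    draws = [q_samplers[level](x_prev) for _ in range(n)]
    if level == L:  # base case: gamma_L(x_{L-1}) = E[f_L(X_L)]
        return sum(links[-1](x) for x in draws) / n
    inner = [nmc_rne(q_samplers, links, n_per_level, x, level + 1) for x in draws]
    if level == 0:  # I = E_{X_0}[gamma_1(X_0)]
        return sum(inner) / n
    return links[level - 1](sum(inner) / n)  # gamma_l = f_l(E[gamma_{l+1}])

random.seed(0)
# Toy 2-level RNE (illustrative): X_0 ~ N(0,1), X_1|x_0 ~ N(x_0,1),
# X_2|x_1 ~ N(x_1,1), f_1(z) = z^2, f_2(x) = x.
q = [lambda x=None: random.gauss(0.0, 1.0),
     lambda x: random.gauss(x, 1.0),
     lambda x: random.gauss(x, 1.0)]
links = [lambda z: z * z, lambda x: x]
est_I = nmc_rne(q, links, n_per_level=(400, 60, 30))
# Here gamma_1(x_0) = x_0^2, so the true value is I = E[X_0^2] = 1.
```

Note that the total sampling cost is the product of the per-level sample sizes, which is the origin of the curse of depth discussed below.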

2. Occurrence in Applications

RNEs are central in multiple domains:

  • Stochastic Control & Finance: Valuation of American and Bermudan options, dynamic programming in finite-horizon optimal stopping, and risk-averse optimization all require the iteration of conditional expectations, sometimes across several sources of randomness (Beck et al., 2020, Sun et al., 8 Feb 2026).
  • Uncertainty Quantification: In global sensitivity analysis and Bayesian design, quantifying risk or expected improvement often yields deeply nested expectations (Hironaka et al., 2023).
  • Probabilistic Programming: The semantics of probabilistic programs induce repeated nesting when models invoke stochastic conditions or inner inference procedures (Rainforth et al., 2016).
  • Networked Multi-Agent Systems: In economic theory, higher-order expectations—agents’ beliefs about others’ beliefs—are mathematically expressed as iterated expectation operators across network topologies (Golub et al., 2020).

3. Computational Schemes and Complexity

3.1 Nested Monte Carlo (NMC)

The straightforward approach estimates nested expectations with a separate Monte Carlo procedure at each level. For a $k$-fold nested expectation, the mean-squared error (MSE) using $N_i$ samples at layer $i$ is
$$\mathrm{MSE} = O\!\left( \sum_{i=1}^{k} \frac{1}{N_i} \right),$$
and with a total budget $T = \prod_{i=1}^{k} N_i$, a balanced allocation $N_i = T^{1/k}$ yields
$$\mathrm{MSE} = O\!\left( k\, T^{-1/k} \right).$$

This convergence rate deteriorates exponentially in $k$: the cost of achieving $\varepsilon$-accuracy is $O(\varepsilon^{-k})$ (Rainforth et al., 2016). Moreover, general NMC estimators are necessarily biased, because the outer link functions act nonlinearly on the noisy inner expectation estimates.
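The source of this bias can be isolated in a single nesting level: for a nonlinear link $f$, the plug-in estimate $f(S_N)$ of $f(\mathbb{E}[X])$ carries a bias of roughly $f''(\mathbb{E}[X])\,\mathrm{Var}(X)/(2N)$, which shrinks only as the inner sample size grows. A minimal sketch (the uniform distribution and quadratic link are illustrative choices):

```python
import random

random.seed(1)

def inner_mean_estimate(n):
    """Plain Monte Carlo estimate of E[X] for X ~ Uniform(0, 1)."""
    return sum(random.random() for _ in range(n)) / n

def nmc_outer(f, n_inner, n_outer):
    """Outer average of f applied to a *noisy* inner mean estimate."""
    return sum(f(inner_mean_estimate(n_inner)) for _ in range(n_outer)) / n_outer

square = lambda z: z * z  # nonlinear outer link => Jensen-type plug-in bias
truth = 0.25              # square(E[X]) = (1/2)^2
# For this link, E[square(S_n)] - truth = Var(X)/n = (1/12)/n, so the
# bias decays with the inner sample size, regardless of the outer one.
bias_small_n = nmc_outer(square, 1, 20000) - truth
bias_large_n = nmc_outer(square, 100, 20000) - truth
```

With one inner sample the measured bias sits near $1/12$; with a hundred it is an order of magnitude below the outer sampling noise.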

3.2 Multilevel and Full-History Recursive Approaches

Advances in multilevel Monte Carlo (MLMC) and related techniques overcome this curse of depth. Recursive multilevel Picard (MLP) algorithms (Beck et al., 2020) construct full-history estimators that share simulations across levels via telescoping sums; under suitable Lipschitz or contractivity assumptions, they achieve
$$\text{Cost} = O\!\left( d \cdot \varepsilon^{-(2+\delta)} \right),$$
where $d$ is the problem dimension and $\delta > 0$ (Beck et al., 2020). Optimal randomized multilevel methods (e.g., the READ estimator) further achieve $O(\varepsilon^{-2})$ cost for fixed depth and strong regularity, and $O(\varepsilon^{-2(1+\delta)})$ for merely Lipschitz-continuous link functions (Syed et al., 2023).

| Method | Error Rate | Cost Scaling | Conditions |
|---|---|---|---|
| NMC | $O(T^{-1/k})$ | $O(\varepsilon^{-k})$ | General, but biased |
| MLP/READ | $O(\varepsilon^{2})$ | $O(\varepsilon^{-2})$ | Lipschitz/contractive links |
| Kernel quadrature | Problem-specific | $O(\varepsilon^{-2})$ (best case) | Sufficient smoothness (Chen et al., 25 Feb 2025) |

4. Unbiased Estimation and Randomized Multilevel Methods

Unbiased estimators for RNEs circumvent the negative result on general-purpose NMC bias (Rainforth et al., 2016) by employing randomized telescoping expansions and antithetic corrections (e.g., Russian Roulette estimators). The READ estimator (Syed et al., 2023) applies at each nesting level a randomized MLMC procedure:

  • For each layer, a geometric number of inner samples is drawn, with estimators constructed to ensure the overall output is unbiased.
  • The complexity is controlled via careful choice of the geometric probabilities; for models with bounded second derivatives with respect to the inner argument (the "LBS" condition), a central limit theorem applies and batch averages converge at $O(\varepsilon^{-2})$ cost.
  • Under just Lipschitz regularity, similar nearly-optimal bounds hold for the mean absolute error.

These techniques support massive parallelization, since independent unbiased replicates can be generated in parallel and averaged to any desired precision.
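A minimal single-level version of the randomized-multilevel idea can be sketched as follows. This is a Blanchet–Glynn-style single-term estimator for $\mathbb{E}_Y[f(\mathbb{E}[X\,|\,Y])]$, shown as an illustration of the mechanism (geometric level, antithetic halves, division by the level probability), not the READ algorithm itself; the toy distributions and link are assumptions for the example.

```python
import random

def unbiased_nested(f, sample_y, sample_x_given_y, p=0.6, n0=2):
    """Single-term randomized-MLMC estimate of E_Y[ f( E[X | Y] ) ].

    A geometric level K selects the inner sample size n0 * 2^(K+1); the
    antithetic difference between the full inner average and its two
    halves, divided by P(K), removes the plug-in bias in expectation.
    """
    y = sample_y()
    k = 0                                   # K ~ Geometric(p): P(K=k) = p(1-p)^k
    while random.random() > p:
        k += 1
    m = n0 * 2 ** (k + 1)
    xs = [sample_x_given_y(y) for _ in range(m)]
    half = m // 2
    full = f(sum(xs) / m)
    half_a = f(sum(xs[:half]) / half)
    half_b = f(sum(xs[half:]) / half)
    delta = full - 0.5 * (half_a + half_b)  # telescoping correction at level k
    base = f(sum(xs[:n0]) / n0)             # coarsest plug-in estimate
    return base + delta / (p * (1 - p) ** k)

random.seed(2)
# Toy problem (illustrative): Y ~ U(0,1), X|Y ~ N(Y,1), f(z) = z^2,
# so the target is E[Y^2] = 1/3.
reps = [unbiased_nested(lambda z: z * z,
                        random.random,
                        lambda y: random.gauss(y, 1.0))
        for _ in range(20000)]
est = sum(reps) / len(reps)
```

Because each replicate is unbiased, the replicates can be generated on independent workers and averaged, exactly as described above.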

5. Advanced Algorithms: Kernel Quadrature and Sparse Grid Methods

Nested Kernel Quadrature (NKQ)

For problems where the integrands are sufficiently smooth (lying in Sobolev spaces), NKQ (Chen et al., 25 Feb 2025) replaces the Monte Carlo estimator at each layer with reproducing kernel Hilbert space methods. The recursive application of kernel quadrature at each level exploits this smoothness to reach cost as low as $O(\epsilon^{-2})$, dramatically outperforming Monte Carlo when $s_\ell \gg d_\ell/2$ for Sobolev index $s_\ell$ and dimension $d_\ell$; the method is especially efficient for moderate-depth, low-dimensional, smooth problems.
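The single-stage building block can be sketched as follows: quadrature weights solve a linear system involving the kernel Gram matrix and the kernel mean embedding of the integration measure. The Gaussian kernel, lengthscale, and regularization below are illustrative assumptions, and only one stage is shown; NKQ applies such a rule recursively at every nesting level.

```python
import math
import numpy as np

def kq_weights(x, ell=0.3, reg=1e-8):
    """Kernel-quadrature weights w = K^{-1} z for the Gaussian kernel
    k(a, b) = exp(-(a - b)^2 / (2 ell^2)) against the Uniform(0, 1)
    measure; z_i = integral of k(t, x_i) over [0, 1] via erf."""
    diff = x[:, None] - x[None, :]
    K = np.exp(-diff**2 / (2 * ell**2)) + reg * np.eye(len(x))
    c = ell * math.sqrt(math.pi / 2)
    s = math.sqrt(2) * ell
    z = np.array([c * (math.erf((1 - xi) / s) + math.erf(xi / s)) for xi in x])
    return np.linalg.solve(K, z)

# Estimate the integral of sin(pi t) over [0, 1] (true value 2/pi)
# from only 15 function evaluations:
x = np.linspace(0.0, 1.0, 15)
est = float(kq_weights(x) @ np.sin(math.pi * x))
```

For smooth integrands such as this one, the weighted sum converges far faster in the number of evaluations than a Monte Carlo average would.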

Sparse-Grid Monte Carlo

For cases where sampling from inner conditional distributions is infeasible or carries a high computational burden, sparse-grid approaches (Hironaka et al., 2023) employ stratification and telescoping sums over partitions of joint samples, achieving
$$\mathrm{MSE} = O\!\left( N^{-1/K} (\log N)^2 \right)$$
when the outer (non-nested) dimension is $K$. This is efficient for small $K$ but suffers from the curse of dimensionality as $K$ increases.

6. Quantum Algorithms for RNEs

Quantum algorithms further improve the scaling of RNE estimation. A quantum version of derandomized MLMC, quantizing each level via Quantum Amplitude Estimation, achieves worst-case cost
$$\tilde{O}\!\left( \varepsilon^{-1} \right)$$
for fixed nesting depth under suitable bounded-Lipschitz conditions on the integrands (Sun et al., 8 Feb 2026). This matches the quantum lower bound for single-level expectation estimation, and is hence optimal up to logarithmic factors. Applications include optimal stopping (e.g., Bermudan option pricing), nested risk estimation, and probabilistic program semantics, where the RNE problem is prevalent; the quantum approach represents an almost-quadratic speedup over the best classical algorithms.

7. Higher-Order Expectations in Networked Systems

In multi-agent settings and network games, RNEs manifest as "higher-order average expectations":
$$E_i^k[y](t^i) = \sum_{j \in N} \gamma^{ij}\, E^i\!\bigl[ E_j^{k-1}[y] \bigr](t^i),$$
where the $\gamma^{ij}$ are network weights and $E^i$ denotes conditional expectation with respect to agent $i$'s information $t^i$ (Golub et al., 2020). The limiting consensus expectation is determined by the stationary distribution of an associated Markov chain, integrating network structure and private information. Applications include economic coordination, speculative markets, and the analysis of "contagion" phenomena such as cascades of optimism or the "tyranny of the least-informed".
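In the simplified case where the weights $\gamma^{ij}$ form a fixed row-stochastic matrix $\Gamma$ and expectations reduce to deterministic weighted averages (an illustrative common-knowledge toy model, not the general information structure of the paper), iterating the averaging operator converges to the consensus value given by the stationary distribution of $\Gamma$:

```python
import numpy as np

# Illustrative 3-agent example: Gamma is a row-stochastic matrix of
# network weights, and the k-th order average expectation satisfies
# E^k = Gamma @ E^{k-1}.
Gamma = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.3, 0.4]])
y = np.array([1.0, 4.0, 7.0])        # first-order expectations E_i[y]

Ek = y.copy()
for _ in range(60):                  # iterate the averaging operator
    Ek = Gamma @ Ek

# The limit is the consensus value pi @ y, with pi the stationary
# distribution of the Markov chain whose transition matrix is Gamma.
w, V = np.linalg.eig(Gamma.T)
pi = np.real(V[:, np.argmax(np.real(w))])
pi = pi / pi.sum()
consensus = float(pi @ y)
```

All agents' high-order expectations collapse to the same scalar, which mixes the network structure (through $\pi$) with the first-order beliefs $y$.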

References

  • (Rainforth et al., 2016) Rainforth et al., On the Pitfalls of Nested Monte Carlo
  • (Beck et al., 2020) Hutzenthaler et al., Nonlinear Monte Carlo methods with polynomial runtime for high-dimensional iterated nested expectations
  • (Syed et al., 2023) Li et al., Optimal randomized multilevel Monte Carlo for repeatedly nested expectations
  • (Hironaka et al., 2023) Hironaka and Goda, Estimating nested expectations without inner conditional sampling and application to value of information analysis
  • (Chen et al., 25 Feb 2025) Oates et al., Nested Expectations with Kernel Quadrature
  • (Sun et al., 8 Feb 2026) Childs et al., Optimal Quantum Speedups for Repeatedly Nested Expectation Estimation
  • (Golub et al., 2020) Golub and Morris, Expectations, Networks, and Conventions
