
Second-Order Stein's Formula

Updated 22 December 2025
  • Second-Order Stein’s Formula is a higher-order extension of Stein's lemma that quantifies variance and moments of functionals of Gaussian vectors.
  • It provides explicit variance decompositions and unbiased risk assessment methods for various high-dimensional estimators, including SURE and penalized methods.
  • The formula underpins applications in both classical and quantum statistics, refining inference, hypothesis testing, and model selection techniques.

The second-order Stein's formula provides higher-order identities for expectations, variances, and mixed moments associated with functions of Gaussian random vectors. These results extend Stein's original lemma, which underpins many risk estimation, inference, and concentration results in high-dimensional statistics. The second-order formula not only quantifies unbiasedness (as in Stein's lemma) but also characterizes the variance and higher moments of estimators arising from normal models, yielding exact variance formulas and enabling explicit risk bounds in shrinkage, model selection, de-biasing, and other high-dimensional methodologies. Related formulations appear in both classical and quantum statistics, where second-order expansions govern sharp asymptotics in hypothesis testing and information theory.

1. Classical Stein’s Formula and Its Higher-Order Extensions

The classical Stein identity for $z \sim N(0, I_n)$ and $f\colon \mathbb{R}^n \to \mathbb{R}^n$ with integrable gradient states

$\mathbb{E}[z^\top f(z)] = \mathbb{E}[\operatorname{div} f(z)],$

or equivalently, $\mathbb{E}[z^\top f(z) - \operatorname{div} f(z)] = 0$. This first-order formula establishes unbiasedness properties for estimators such as SURE and forms the basis for stability analysis under Gaussian noise.
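As a quick numerical illustration, the sketch below checks the first-order identity by Monte Carlo for the coordinatewise test function $f(z) = \tanh(z)$ (an arbitrary choice; any sufficiently smooth $f$ would do), whose divergence has the closed form $\sum_i (1 - \tanh^2(z_i))$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 200_000  # dimension and number of Monte Carlo samples

# Coordinatewise test function f(z) = tanh(z); its Jacobian is
# diagonal, so div f(z) = sum_i (1 - tanh(z_i)^2).
z = rng.standard_normal((m, n))
f = np.tanh(z)
div_f = (1.0 - f**2).sum(axis=1)

lhs = (z * f).sum(axis=1).mean()  # E[z^T f(z)]
rhs = div_f.mean()                # E[div f(z)]
print(lhs, rhs)  # the two averages agree up to Monte Carlo error
```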

Second-order Stein identities generalize this result by establishing exact variance formulas. The principal second-order formula is, under mild Sobolev regularity,

$\operatorname{Var}[z^\top f(z) - \operatorname{div} f(z)] = \mathbb{E}[\|f(z)\|^2] + \mathbb{E}[\operatorname{Tr}((\nabla f(z))^2)],$

where $\nabla f(z)$ is the Jacobian of $f$ and $\operatorname{Tr}$ denotes the matrix trace. This result, also termed the "SOS" (Second-Order Stein) identity, provides explicit variance decompositions (Bellec et al., 2018).
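The SOS identity can be checked the same way: for a coordinatewise $f$, the Jacobian is diagonal and $\operatorname{Tr}((\nabla f(z))^2) = \sum_i f'(z_i)^2$. A minimal sketch, again with the arbitrary choice $f(z) = \tanh(z)$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 400_000

z = rng.standard_normal((m, n))
f = np.tanh(z)
fp = 1.0 - f**2  # coordinatewise derivative of tanh

# Left side: empirical variance of z^T f(z) - div f(z)
stat = (z * f).sum(axis=1) - fp.sum(axis=1)
lhs = stat.var()

# Right side: E[||f(z)||^2] + E[Tr((grad f(z))^2)]
rhs = (f**2).sum(axis=1).mean() + (fp**2).sum(axis=1).mean()
print(lhs, rhs)  # agree up to Monte Carlo error
```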

2. General Second-Order Stein Formula for Multivariate Normal Distributions

For $X \sim N(\mu, \Sigma)$ and $f \in C^{\infty}(\mathbb{R}^N)$, the second-order version of Stein's identity governs expectations of quadratic forms times nonlinear functions. A theorem of Mamis (Mamis, 2022) shows that, for indices $i, j$,

$\mathbb{E}[X_i X_j f(X)] = \mu_i\mu_j\,\mathbb{E}[f(X)] + \mu_i \sum_k \Sigma_{jk}\, \mathbb{E}[\partial_k f(X)] + \mu_j \sum_k \Sigma_{ik}\, \mathbb{E}[\partial_k f(X)] + \Sigma_{ij}\, \mathbb{E}[f(X)] + \sum_{k,\ell} \Sigma_{jk} \Sigma_{i\ell}\, \mathbb{E}[\partial_{k\ell} f(X)].$

In matrix notation,

$\mathbb{E}[XX^\top f(X)] = \mu\mu^\top\, \mathbb{E}[f(X)] + \mu\, (\Sigma\, \mathbb{E}[\nabla f(X)])^\top + (\Sigma\, \mathbb{E}[\nabla f(X)])\, \mu^\top + \Sigma\, \mathbb{E}[f(X)] + \Sigma\, \mathbb{E}[\nabla^2 f(X)]\, \Sigma,$

where $\nabla^2 f(X)$ is the Hessian. This formula follows from an explicit combinatorial expansion of moments of Gaussian vectors (Mamis, 2022).
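The matrix identity can likewise be verified by simulation. The sketch below is a minimal check with the arbitrary test function $f(x) = e^{-\|x\|^2/2}$, whose gradient $-x f(x)$ and Hessian $(xx^\top - I) f(x)$ are explicit; the mean and covariance are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
N, m = 3, 500_000

mu = np.array([0.5, -1.0, 0.2])
A = rng.standard_normal((N, N))
Sigma = A @ A.T + N * np.eye(N)  # a well-conditioned covariance
X = mu + rng.standard_normal((m, N)) @ np.linalg.cholesky(Sigma).T

# Test function f(x) = exp(-||x||^2/2): grad f = -x f, Hess f = (x x^T - I) f
fX = np.exp(-0.5 * (X**2).sum(axis=1))
E_f = fX.mean()
E_grad = (-X * fX[:, None]).mean(axis=0)
E_xxT_f = np.einsum('mi,mj,m->ij', X, X, fX) / m
E_hess = E_xxT_f - np.eye(N) * E_f

lhs = E_xxT_f  # E[X X^T f(X)]
v = Sigma @ E_grad
rhs = (np.outer(mu, mu) * E_f + np.outer(mu, v) + np.outer(v, mu)
       + Sigma * E_f + Sigma @ E_hess @ Sigma)
print(np.abs(lhs - rhs).max())  # small, up to Monte Carlo error
```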

3. Variance and Risk Formulas in High-Dimensional Inference

A direct application of the SOS identity is unbiased risk estimation for SURE (Stein's unbiased risk estimate) itself, termed "SURE for SURE." In the sequence model $y = \mu + \epsilon$, $\epsilon \sim N(0, \sigma^2 I_n)$, for a differentiable estimator $\hat\mu(y)$,

$\operatorname{SURE} = \|y - \hat\mu(y)\|^2 + 2\sigma^2 \operatorname{div} \hat\mu(y) - \sigma^2 n,$

with $\mathbb{E}[\operatorname{SURE}] = \mathbb{E}[\|\hat\mu - \mu\|^2]$. The second-order formula yields the risk of SURE,

$R_{\operatorname{SURE}} = \mathbb{E}\big[ (\operatorname{SURE} - \|\hat\mu - \mu\|^2 )^2 \big] = 4\sigma^2\, \mathbb{E}[\|y - \hat\mu\|^2] + 4\sigma^4\, \mathbb{E}\big[ \operatorname{Tr}[(\nabla \hat\mu)^2] \big] - 2\sigma^4 n.$

The unbiased estimator “SURE for SURE” is

$\widehat R_{\operatorname{SURE}} = 4\sigma^2 \|y - \hat\mu(y)\|^2 + 4\sigma^4 \operatorname{Tr}[(\nabla \hat\mu(y))^2] - 2\sigma^4 n.$
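For concreteness, here is a minimal sketch of both quantities for coordinatewise soft-thresholding, $\hat\mu_i(y) = \operatorname{sign}(y_i)(|y_i| - \lambda)_+$, where $\nabla\hat\mu$ is diagonal with entries $\mathbf{1}\{|y_i| > \lambda\}$, so $\operatorname{div}\hat\mu = \operatorname{Tr}[(\nabla\hat\mu)^2] = |\{i : |y_i| > \lambda\}|$ almost everywhere (function and data-generating choices are illustrative):

```python
import numpy as np

def sure_and_sure_for_sure(y, lam, sigma):
    """SURE and its unbiased risk estimate ("SURE for SURE") for
    coordinatewise soft-thresholding with known noise level sigma."""
    n = y.size
    mu_hat = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)
    div = np.count_nonzero(np.abs(y) > lam)  # = Tr((grad mu_hat)^2) here
    resid = np.sum((y - mu_hat) ** 2)
    sure = resid + 2 * sigma**2 * div - sigma**2 * n
    r_hat = 4 * sigma**2 * resid + 4 * sigma**4 * div - 2 * sigma**4 * n
    return sure, r_hat

rng = np.random.default_rng(3)
n, sigma = 1000, 1.0
mu = np.concatenate([3.0 * np.ones(50), np.zeros(n - 50)])  # sparse mean
y = mu + sigma * rng.standard_normal(n)
print(sure_and_sure_for_sure(y, lam=np.sqrt(2 * np.log(n)), sigma=sigma))
```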

Such explicit second-order identities enable control of estimator variability and inform confidence sets, oracle inequalities, and inference de-biasing (Bellec et al., 2018).

4. Explicit Second-Order Formulas for Sparse Estimators

Second-order Stein-type calculus provides explicit risk and variance formulas for penalized estimators such as Lasso and Elastic Net in Gaussian linear models. For

$\hat\beta = \arg\min_b \Big\{ \frac{1}{2n} \|y - Xb\|^2 + \lambda \|b\|_1 + \frac{\gamma}{2} \|b\|^2 \Big\},$

let $\hat S = \operatorname{supp}(\hat\beta)$. Then, almost everywhere,

$\operatorname{div}(X\hat\beta) = \operatorname{Tr}\Big( X_{\hat S} \big( X_{\hat S}^\top X_{\hat S} + \gamma I \big)^{-1} X_{\hat S}^\top \Big),$

yielding

$\operatorname{SURE} = \|y - X\hat\beta\|^2 + 2\sigma^2 \operatorname{Tr}\Big( X_{\hat S} \big( X_{\hat S}^\top X_{\hat S} + \gamma I \big)^{-1} X_{\hat S}^\top \Big) - \sigma^2 n,$

$\widehat R_{\operatorname{SURE}} = 4\sigma^2 \|y - X\hat\beta\|^2 + 4\sigma^4 \big\| X_{\hat S} ( X_{\hat S}^\top X_{\hat S} + \gamma I )^{-1} X_{\hat S}^\top \big\|_F^2 - 2\sigma^4 n.$

For the Lasso ($\gamma = 0$), this reduces to closed-form expressions involving the active set size (Bellec et al., 2018).
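An illustrative sketch of the Lasso case: when $X_{\hat S}$ has full column rank, $X_{\hat S}(X_{\hat S}^\top X_{\hat S})^{-1}X_{\hat S}^\top$ is an orthogonal projection of rank $|\hat S|$, so both the divergence and the squared Frobenius norm above equal the active set size. The snippet uses scikit-learn's `Lasso`, whose objective matches the display above with $\gamma = 0$ and $\lambda$ equal to `alpha`; the data-generating choices are arbitrary.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
n, p, sigma = 200, 500, 1.0
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:10] = 1.0                      # 10 active coefficients
y = X @ beta + sigma * rng.standard_normal(n)

fit = Lasso(alpha=0.1, fit_intercept=False).fit(X, y)
resid = np.sum((y - X @ fit.coef_) ** 2)
s_hat = np.count_nonzero(fit.coef_)  # |S^|; equals div(X beta^) a.e.
                                     # when X_S^ has full column rank

sure = resid + 2 * sigma**2 * s_hat - sigma**2 * n
r_hat = 4 * sigma**2 * resid + 4 * sigma**4 * s_hat - 2 * sigma**4 * n
print(sure, r_hat)
```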

5. Gaussian Moment Expansions and the Unified Higher-Order Stein Calculus

The formulas of Mamis (Mamis, 2022) provide a general approach to evaluating $\mathbb{E}[X^{n} f(X)]$ for arbitrary multi-indices $n$, generalizing both Stein's lemma and Isserlis' theorem. The identity expresses such expectations as finite sums over derivatives of $f$ weighted by combinatorial factors and powers of the covariance. This framework unifies various classical and modern identities for moments and mixed products of Gaussian vectors, formally subsuming second-order (and higher-order) Stein identities as special cases.
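In one dimension ($X \sim N(\mu, \sigma^2)$, so $\Sigma = \sigma^2$), the $i = j$ case of the identity in Section 2 reads $\mathbb{E}[X^2 f(X)] = \mu^2\mathbb{E}[f] + 2\mu\sigma^2\mathbb{E}[f'] + \sigma^2\mathbb{E}[f] + \sigma^4\mathbb{E}[f'']$, which can be verified symbolically. A minimal check with the arbitrary test function $f(x) = x^3$:

```python
import sympy as sp

x, mu = sp.symbols("x mu", real=True)
s = sp.symbols("sigma", positive=True)
pdf = sp.exp(-(x - mu) ** 2 / (2 * s**2)) / (s * sp.sqrt(2 * sp.pi))
E = lambda g: sp.integrate(g * pdf, (x, -sp.oo, sp.oo))  # Gaussian expectation

f = x**3  # arbitrary smooth test function
lhs = E(x**2 * f)
rhs = (mu**2 * E(f) + 2 * mu * s**2 * E(sp.diff(f, x))
       + s**2 * E(f) + s**4 * E(sp.diff(f, x, 2)))
print(sp.simplify(lhs - rhs))  # 0
```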

6. Quantum Analogues: Second-Order Stein’s Formula in Quantum Hypothesis Testing

In quantum information theory, sharp asymptotics in binary quantum hypothesis testing are governed by the second-order refinement of Stein's lemma. For testing $H_0$: $\rho^{\otimes n}$ versus $H_1$: $\sigma^{\otimes n}$ on a Hilbert space $\mathcal{H}$, let $D(\rho\|\sigma)$ and $V(\rho\|\sigma)$ denote the quantum relative entropy and its variance:

$D(\rho\|\sigma) = \operatorname{Tr}[\rho(\log\rho - \log\sigma)], \qquad V(\rho\|\sigma) = \operatorname{Tr}[\rho(\log\rho - \log\sigma)^2] - D(\rho\|\sigma)^2.$

When the type-II error exponent is tuned to $D(\rho\|\sigma) + E_2/\sqrt{n}$, the probability of type-I error converges to

$\Phi\big( E_2 / \sqrt{V(\rho\|\sigma)} \big),$

where $\Phi$ is the standard normal CDF. Finite-sample bounds invoke Berry–Esseen corrections, and the methods rely on linear-algebraic decompositions and central limit expansions paralleling the structure of second-order Stein-type formulas (Li, 2012).
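The quantities $D(\rho\|\sigma)$ and $V(\rho\|\sigma)$ are straightforward to compute numerically from eigendecompositions. The sketch below does so for a pair of hypothetical full-rank qubit states and evaluates the second-order prediction $\Phi(E_2/\sqrt{V})$; all state and parameter choices, and the helper names, are illustrative.

```python
import numpy as np
from scipy.stats import norm

def logm_psd(A):
    """Matrix logarithm of a Hermitian positive-definite matrix."""
    w, U = np.linalg.eigh(A)
    return (U * np.log(w)) @ U.conj().T

def d_and_v(rho, sigma):
    """Quantum relative entropy D(rho||sigma) and its variance V(rho||sigma)."""
    Delta = logm_psd(rho) - logm_psd(sigma)
    D = np.trace(rho @ Delta).real
    V = np.trace(rho @ Delta @ Delta).real - D**2
    return D, V

# Hypothetical qubit states: a correlated state vs. the maximally mixed state.
rho = np.array([[0.8, 0.1], [0.1, 0.2]])
sig = 0.5 * np.eye(2)
D, V = d_and_v(rho, sig)

E2 = -1.0  # second-order tuning of the type-II exponent D + E2/sqrt(n)
print(D, V, norm.cdf(E2 / np.sqrt(V)))  # limiting type-I error
```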

7. Applications and Implications in High-Dimensional Statistics

Second-order Stein identities have driven applications beyond risk estimation. Consequences include:

  • Upper bounds for the risk of SURE in high-dimensional models;
  • Construction of asymptotically exact confidence regions based on SURE;
  • Oracle inequalities for SURE-tuned estimators;
  • Explicit variance bounds on model selection criteria (e.g., the active set size in the Lasso, bounded by $s_0 \log^2(p/s_0)$ under restricted eigenvalue (RE) conditions);
  • Semiparametric de-biasing in linear regression, with variance of de-biased estimators given exactly by second-order Stein’s formula;
  • Accurate variance control in Gaussian Monte Carlo estimators for divergence, under minor regularity (Bellec et al., 2018).

These applications demonstrate the central role of second-order Stein calculus in modern high-dimensional inference, model assessment, and nonasymptotic uncertainty quantification.
