Second-Order Stein's Formula
- Second-Order Stein’s Formula is a higher-order extension of Stein's lemma that quantifies variance and moments of functionals of Gaussian vectors.
- It provides explicit variance decompositions and unbiased risk assessment methods for various high-dimensional estimators, including SURE and penalized methods.
- The formula underpins applications in both classical and quantum statistics, refining inference, hypothesis testing, and model selection techniques.
The second-order Stein's formula provides higher-order identities for expectations, variances, and mixed moments associated with functions of Gaussian random vectors. These results extend Stein's original lemma, which underpins many risk estimation, inference, and concentration results in high-dimensional statistics. The second-order formula quantifies not only unbiasedness (as in Stein's lemma) but also characterizes the variance and higher moments of estimators arising from normal models, yielding exact variance formulas and enabling explicit risk bounds in shrinkage, model selection, de-biasing, and other high-dimensional methodologies. Related formulations appear in both classical and quantum statistics, where second-order expansions govern sharp asymptotics in hypothesis testing and information theory.
1. Classical Stein’s Formula and Its Higher-Order Extensions
The classical Stein identity, for $z \sim N(0, I_n)$ and $f: \mathbb{R}^n \to \mathbb{R}^n$ with integrable gradient, states
$\E[z^\top f(z)] = \E[\operatorname{div} f(z)],$
or equivalently, $\E[z^\top f(z) - \operatorname{div} f(z)] = 0$. This first-order formula establishes unbiasedness properties for estimators such as SURE and forms the basis for stability analysis under Gaussian noise.
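The first-order identity is easy to check numerically. The sketch below uses the elementwise map $f(z) = \tanh(z)$ (an illustrative choice, not from the source), whose divergence is $\sum_i (1 - \tanh^2(z_i))$:

```python
import numpy as np

# Monte Carlo check of Stein's lemma E[z^T f(z)] = E[div f(z)]
# for z ~ N(0, I_n) and the elementwise map f(z) = tanh(z),
# whose divergence is sum_i (1 - tanh(z_i)^2).
rng = np.random.default_rng(0)
n, N = 5, 200_000
z = rng.standard_normal((N, n))
f = np.tanh(z)
lhs = np.mean(np.sum(z * f, axis=1))      # E[z^T f(z)]
rhs = np.mean(np.sum(1 - f**2, axis=1))   # E[div f(z)]
print(lhs, rhs)  # the two estimates agree up to Monte Carlo error
```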
Second-order Stein identities generalize this result by establishing exact variance formulas. The principal second-order formula is, under mild Sobolev regularity,
$\operatorname{Var}[z^\top f(z) - \operatorname{div} f(z)] = \E[\|f(z)\|^2] + \E[\operatorname{Tr}((\nabla f(z))^2)],$
where $\nabla f(z)$ is the Jacobian of $f$ and $\operatorname{Tr}$ denotes the matrix trace. This result, also termed the "SOS" (Second-Order Stein) identity, provides explicit variance decompositions (Bellec et al., 2018).
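The SOS identity can likewise be verified by simulation. For an elementwise map such as $f(z) = \tanh(z)$ (again an illustrative choice), the Jacobian is diagonal, so $\operatorname{Tr}[(\nabla f)^2] = \sum_i f'(z_i)^2$:

```python
import numpy as np

# Monte Carlo check of the SOS identity
#   Var[z^T f(z) - div f(z)] = E||f(z)||^2 + E Tr[(grad f(z))^2]
# for f(z) = tanh(z) applied elementwise (diagonal Jacobian).
rng = np.random.default_rng(1)
n, N = 5, 500_000
z = rng.standard_normal((N, n))
f = np.tanh(z)
fp = 1 - f**2                                        # f'(z_i), diagonal of the Jacobian
stat = np.sum(z * f, axis=1) - np.sum(fp, axis=1)    # z^T f(z) - div f(z)
lhs = np.var(stat)
rhs = np.mean(np.sum(f**2, axis=1)) + np.mean(np.sum(fp**2, axis=1))
print(lhs, rhs)  # both sides match up to Monte Carlo error
```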
2. General Second-Order Stein Formula for Multivariate Normal Distributions
For $X \sim N(\mu, \Sigma)$ and twice continuously differentiable $f: \mathbb{R}^n \to \mathbb{R}$, the second-order version of Stein’s identity governs expectations of quadratic forms times nonlinear functions. The theorem by Mamis (Mamis, 2022) shows, for indices $i, j$,
$\E[X_i X_j f(X)] = \mu_i\mu_j\E f(X) + \mu_i \sum_k \Sigma_{jk} \E[\partial_k f(X)] + \mu_j \sum_k \Sigma_{ik} \E[\partial_k f(X)] + \Sigma_{ij} \E f(X) + \sum_{k,\ell} \Sigma_{jk} \Sigma_{i\ell} \E[\partial_{k\ell}f(X)].$
In matrix notation,
$\E[ X X^\top f(X) ] = \mu\mu^\top \E[f(X)] + \mu (\Sigma \E[\nabla f(X)])^\top + (\Sigma \E[\nabla f(X)]) \mu^\top + \Sigma\, \E[f(X)] + \Sigma\, \E[\nabla^2 f(X)]\, \Sigma,$
where $\nabla^2 f(X)$ is the Hessian of $f$. This formula follows from an explicit combinatorial expansion of moments for Gaussian vectors (Mamis, 2022).
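A quick way to build confidence in the matrix identity is a Monte Carlo check with a test function whose derivatives are known in closed form. The sketch below uses $f(X) = \sin(a^\top X)$ (an arbitrary illustrative choice), for which $\nabla f = a\cos(a^\top X)$ and $\nabla^2 f = -aa^\top \sin(a^\top X)$:

```python
import numpy as np

# Monte Carlo check of the matrix form of the second-order identity
# for X ~ N(mu, Sigma) with test function f(X) = sin(a^T X).
rng = np.random.default_rng(2)
n, N = 3, 500_000
mu = np.array([0.5, -0.2, 0.1])
A = rng.standard_normal((n, n))
Sigma = A @ A.T / n + np.eye(n)          # a valid covariance matrix
a = np.array([0.3, 0.7, -0.4])

X = rng.multivariate_normal(mu, Sigma, size=N)
s = X @ a
fX = np.sin(s)

lhs = np.einsum('ni,nj,n->ij', X, X, fX) / N     # E[X X^T f(X)]

Ef = fX.mean()
Egrad = a * np.cos(s).mean()                     # E[grad f(X)]
Ehess = -np.outer(a, a) * fX.mean()              # E[Hessian f(X)]
Sg = Sigma @ Egrad
rhs = (np.outer(mu, mu) * Ef + np.outer(mu, Sg) + np.outer(Sg, mu)
       + Sigma * Ef + Sigma @ Ehess @ Sigma)
print(np.max(np.abs(lhs - rhs)))  # small: both sides agree
```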
3. Variance and Risk Formulas in High-Dimensional Inference
Direct application of the SOS identity enables unbiased estimation both of an estimator's risk (via SURE, Stein's unbiased risk estimate) and of the risk of SURE itself (termed "SURE for SURE"). In the sequence model $y = \mu + \varepsilon$ with $\varepsilon \sim N(0, \sigma^2 I_n)$, for a differentiable estimator $\hat\mu(y)$,
$\SURE = \|y - \hat\mu(y)\|^2 + 2\sigma^2 \operatorname{div} \hat\mu(y) - \sigma^2 n,$
with $\E[\SURE] = \E[\|\hat\mu - \mu\|^2]$. The second-order formula yields the risk of SURE,
$R_{\SURE} = \E[ (\SURE - \|\hat\mu - \mu\|^2 )^2 ] = 4 \sigma^2 \E[\|y-\hat\mu\|^2] + 4\sigma^4 \E[ \operatorname{Tr}[ (\nabla \hat\mu)^2 ] ] - 2\sigma^4 n.$
The unbiased estimator “SURE for SURE” is
$\widehat R_{\SURE} = 4 \sigma^2 \| y-\hat\mu(y) \|^2 + 4 \sigma^4 \operatorname{Tr}[ (\nabla \hat\mu(y))^2 ] - 2\sigma^4 n.$
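These formulas are straightforward to implement. A minimal sketch for soft thresholding, $\hat\mu(y)_i = \operatorname{sign}(y_i)(|y_i| - \lambda)_+$ (a standard example; the function name is ours): its Jacobian is diagonal with entries $\mathbf{1}\{|y_i| > \lambda\}$, so both $\operatorname{div}\hat\mu$ and $\operatorname{Tr}[(\nabla\hat\mu)^2]$ equal the number of surviving coordinates.

```python
import numpy as np

def soft_threshold_sure(y, lam, sigma):
    """SURE and 'SURE for SURE' for the soft-thresholding estimator."""
    n = y.size
    mu_hat = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)
    k = np.count_nonzero(np.abs(y) > lam)   # div mu_hat = Tr[(grad mu_hat)^2]
    sure = np.sum((y - mu_hat) ** 2) + 2 * sigma**2 * k - sigma**2 * n
    sure_for_sure = (4 * sigma**2 * np.sum((y - mu_hat) ** 2)
                     + 4 * sigma**4 * k - 2 * sigma**4 * n)
    return mu_hat, sure, sure_for_sure

rng = np.random.default_rng(3)
mu = np.concatenate([np.full(10, 3.0), np.zeros(90)])
y = mu + rng.standard_normal(100)
mu_hat, sure, r_hat = soft_threshold_sure(y, lam=1.0, sigma=1.0)
print(sure, np.sum((mu_hat - mu) ** 2), r_hat)
```

Averaged over repeated draws of $y$, the SURE values concentrate around the true risk, and `r_hat` estimates the variability of SURE itself.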
Such explicit second-order identities enable control of estimator variability and inform confidence sets, oracle inequalities, and inference de-biasing (Bellec et al., 2018).
4. Explicit Second-Order Formulas for Sparse Estimators
Second-order Stein-type calculus provides explicit risk and variance formulas for penalized estimators such as Lasso and Elastic Net in Gaussian linear models. For the Elastic Net estimator
$\hat\beta \in \operatorname*{arg\,min}_b \tfrac{1}{2}\|y - Xb\|^2 + \lambda \|b\|_1 + \tfrac{\gamma}{2}\|b\|^2,$
let $\hat S = \{ j : \hat\beta_j \neq 0 \}$ denote the active set. Then, almost everywhere,
$\nabla_y \big( X\hat\beta(y) \big) = X_{\hat S}(X_{\hat S}^\top X_{\hat S} + \gamma I)^{-1} X_{\hat S}^\top,$
yielding
$\SURE = \| y - X\hat\beta \|^2 + 2\sigma^2 \operatorname{Tr}\big( X_{\hat S}(X_{\hat S}^\top X_{\hat S} + \gamma I)^{-1} X_{\hat S}^\top \big) - \sigma^2 n,$
$\widehat R_{\SURE} = 4\sigma^2 \|y - X\hat\beta\|^2 + 4\sigma^4 \| X_{\hat S}(X_{\hat S}^\top X_{\hat S} + \gamma I)^{-1}X_{\hat S}^\top \|_F^2 - 2\sigma^4 n.$
For the Lasso ($\gamma = 0$), this reduces to closed-form expressions involving the active set size $|\hat S|$ (Bellec et al., 2018).
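A sketch of these quantities, assuming a fitted coefficient vector `beta_hat` is supplied by any solver (the function name and interface are ours, not from the source):

```python
import numpy as np

def elastic_net_sure(y, X, beta_hat, gamma, sigma):
    """SURE and 'SURE for SURE' for the Elastic Net, given a fitted beta_hat.

    The divergence term is Tr(X_S (X_S^T X_S + gamma I)^{-1} X_S^T)
    over the active set S = {j : beta_hat_j != 0}.
    """
    n = y.size
    S = np.flatnonzero(beta_hat)
    Xs = X[:, S]
    H = Xs @ np.linalg.solve(Xs.T @ Xs + gamma * np.eye(S.size), Xs.T)
    resid2 = np.sum((y - X @ beta_hat) ** 2)
    sure = resid2 + 2 * sigma**2 * np.trace(H) - sigma**2 * n
    sure_for_sure = (4 * sigma**2 * resid2
                     + 4 * sigma**4 * np.sum(H * H)   # ||H||_F^2
                     - 2 * sigma**4 * n)
    return sure, sure_for_sure
```

For $\gamma = 0$ and $X_{\hat S}$ of full column rank, $H$ is the projection onto the column span of $X_{\hat S}$, so $\operatorname{Tr}(H) = \|H\|_F^2 = |\hat S|$, recovering the Lasso case.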
5. Gaussian Moment Expansions and the Unified Higher-Order Stein Calculus
The formulas by Mamis (Mamis, 2022) provide a general approach for evaluating $\E[ X^{\mathbf{n}} f(X) ]$ for arbitrary multi-indices $\mathbf{n}$, generalizing both Stein's lemma and Isserlis’ theorem. The identity expresses such expectations as finite sums over derivatives of $f$ weighted by combinatorial factors and covariance powers. This framework unifies various classical and modern identities for moments and mixed products of Gaussian vectors, formally subsuming second-order (and higher-order) Stein identities as special cases.
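One special case subsumed by this calculus ($f \equiv 1$, fourth moments) is Isserlis' theorem, which is easy to check by simulation:

```python
import numpy as np

# Monte Carlo check of Isserlis' theorem for a zero-mean Gaussian vector:
#   E[X1 X2 X3 X4] = S12*S34 + S13*S24 + S14*S23.
rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
S = A @ A.T / 4 + np.eye(4)              # a valid covariance matrix
X = rng.multivariate_normal(np.zeros(4), S, size=1_000_000)
lhs = np.mean(X[:, 0] * X[:, 1] * X[:, 2] * X[:, 3])
rhs = S[0, 1] * S[2, 3] + S[0, 2] * S[1, 3] + S[0, 3] * S[1, 2]
print(lhs, rhs)  # agree up to Monte Carlo error
```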
6. Quantum Analogues: Second-Order Stein’s Formula in Quantum Hypothesis Testing
In quantum information theory, sharp asymptotics in binary quantum hypothesis testing are governed by the second-order refinement of quantum Stein’s lemma. For testing $H_0: \rho^{\otimes n}$ vs $H_1: \sigma^{\otimes n}$ on the Hilbert space $\mathcal{H}^{\otimes n}$, let $D(\rho\|\sigma)$ and $V(\rho\|\sigma)$ denote the quantum relative entropy and the quantum relative entropy variance. When the type-II error exponent is tuned to $n D(\rho\|\sigma) + \sqrt{n}\, r$, the probability of type-I error converges to
$\Phi\!\left( r / \sqrt{V(\rho\|\sigma)} \right),$
where $\Phi$ is the standard normal cdf. Finite-sample bounds invoke Berry–Esseen corrections, and the methods rely on linear algebraic decompositions and central limit expansions paralleling the structure of second-order Stein-type formulas (Li, 2012).
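The two quantities entering the second-order expansion, $D(\rho\|\sigma) = \operatorname{Tr}[\rho(\log\rho - \log\sigma)]$ and $V(\rho\|\sigma) = \operatorname{Tr}[\rho(\log\rho - \log\sigma)^2] - D^2$, can be computed directly by eigendecomposition. A sketch for two illustrative qubit density matrices (the matrices and function names are ours):

```python
import numpy as np

def logm_psd(M):
    """Matrix logarithm of a positive-definite Hermitian matrix."""
    w, U = np.linalg.eigh(M)
    return U @ np.diag(np.log(w)) @ U.conj().T

def quantum_D_and_V(rho, sigma):
    """Quantum relative entropy D(rho||sigma) and its variance V(rho||sigma)."""
    L = logm_psd(rho) - logm_psd(sigma)
    D = np.trace(rho @ L).real
    V = np.trace(rho @ L @ L).real - D**2
    return D, V

rho = np.array([[0.7, 0.2], [0.2, 0.3]])
sigma = np.array([[0.5, 0.0], [0.0, 0.5]])  # maximally mixed qubit state
D, V = quantum_D_and_V(rho, sigma)
print(D, V)
```

For $\sigma = I/2$ this reduces to $D = \log 2 - S(\rho)$, with $S$ the von Neumann entropy, which serves as a sanity check.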
7. Applications and Implications in High-Dimensional Statistics
Second-order Stein identities have driven applications beyond risk estimation. Consequences include:
- Upper bounds for the risk of SURE in high-dimensional models;
- Construction of asymptotically exact confidence regions based on SURE;
- Oracle inequalities for SURE-tuned estimators;
- Explicit variance bounds on model selection criteria, such as the active set size of the Lasso under restricted eigenvalue (RE) conditions;
- Semiparametric de-biasing in linear regression, with variance of de-biased estimators given exactly by second-order Stein’s formula;
- Accurate variance control in Gaussian Monte Carlo estimators for divergence, under minor regularity (Bellec et al., 2018).
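In the spirit of the last bullet, when $\hat\mu$ has no closed-form Jacobian the divergence can itself be estimated by Gaussian Monte Carlo, $\operatorname{div}\hat\mu(y) \approx \E_b\big[ b^\top (\hat\mu(y + \delta b) - \hat\mu(y)) / \delta \big]$ with $b \sim N(0, I)$. A sketch (this randomized finite-difference estimator is a standard technique, not a construction from the source), checked against soft thresholding where the exact divergence is the count of coordinates with $|y_i| > \lambda$:

```python
import numpy as np

def mc_divergence(mu_hat, y, delta=1e-4, reps=400, rng=None):
    """Monte Carlo estimate of div mu_hat(y) via randomized finite differences."""
    rng = rng or np.random.default_rng()
    base = mu_hat(y)
    est = 0.0
    for _ in range(reps):
        b = rng.standard_normal(y.size)
        est += b @ (mu_hat(y + delta * b) - base) / delta
    return est / reps

lam = 1.0
soft = lambda y: np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)
rng = np.random.default_rng(5)
y = rng.standard_normal(200) * 2
exact = np.count_nonzero(np.abs(y) > lam)   # exact divergence for soft thresholding
approx = mc_divergence(soft, y, rng=rng)
print(exact, approx)  # close up to Monte Carlo error
```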
These applications demonstrate the central role of second-order Stein calculus in modern high-dimensional inference, model assessment, and nonasymptotic uncertainty quantification.