
Berry-Esseen Bounds for High-D Self-Normalized Sums

Updated 30 December 2025
  • The paper establishes explicit Berry–Esseen bounds for self-normalized sums, achieving rates as fast as O((log d)^(3/2)/√n) under finite third moment conditions.
  • The methodology employs truncation, smoothing, and Taylor expansions to manage nonlinearities introduced by data-dependent normalization in high dimensions.
  • The results clarify the trade-off between moment assumptions, sample size, and dimension growth, optimizing convergence in multivariate statistical inference.

High-dimensional self-normalized sums arise in multivariate statistical inference, especially in cases where the dimensionality of observed random vectors grows with sample size. The Berry-Esseen bound quantifies the rate of convergence in the Central Limit Theorem (CLT), measuring how closely the distribution of a properly normalized sum approximates a Gaussian law. In high dimensions, the interplay between sample size, dimension, and moment assumptions becomes critical, particularly for self-normalized statistics, where scaling by the data-dependent standard deviation introduces strong dependencies and nonlinearities. Recent work establishes explicit Berry-Esseen type bounds for these self-normalized sums and their maxima, significantly advancing the understanding of high-dimensional CLTs under relaxed moment assumptions (Das, 2020, Chang et al., 15 Jan 2025).

1. Problem Formulation and Self-normalized Sums

Given a sequence of independent, identically distributed (IID), mean-zero random vectors $X_i = (X_{i1}, \dots, X_{id})^\top \in \mathbb{R}^d$, the primary object of interest is the self-normalized sum

$$T_n = (T_{n,1},\dots,T_{n,d})^\top, \qquad T_{n,j} = \frac{\sum_{i=1}^n X_{ij}}{\sqrt{\sum_{i=1}^n X_{ij}^2}}$$

for $j = 1, \ldots, d$. For coordinatewise inference, the distribution of $\|T_n\|_\infty = \max_{1\leq j\leq d} T_{n,j}$ is studied, as well as the uniform approximation of $T_n$ over classes of hyper-rectangles in $\mathbb{R}^d$.
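Under these definitions, $T_n$ and its coordinatewise maximum are straightforward to compute from a data matrix; a minimal sketch (array shapes and names are illustrative):

```python
import numpy as np

def self_normalized_sum(X):
    """Coordinatewise self-normalized sum T_n for an (n, d) data matrix X:
    T_{n,j} = sum_i X_ij / sqrt(sum_i X_ij^2)."""
    num = X.sum(axis=0)                     # column sums: sum_i X_ij
    den = np.sqrt((X ** 2).sum(axis=0))     # data-dependent scale per coordinate
    return num / den

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 50))          # n = 500 observations, d = 50 coordinates
T = self_normalized_sum(X)
T_max = T.max()                             # ||T_n||_inf as defined above (max, not abs-max)
print(T.shape, T_max)
```

By the Cauchy–Schwarz inequality each coordinate satisfies $|T_{n,j}| \le \sqrt{n}$, which makes the self-normalized sum automatically bounded regardless of the tail behavior of the data.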

The Berry-Esseen distance for evaluating the approximation to a multivariate normal is given by

$$\Delta_n = \sup_{A\in\mathcal{A}^{\text{re}}} \left| \Pr\{T_n \in A\} - \Pr\{Z \in A\} \right|$$

where $Z \sim N(0, I_d)$ and $\mathcal{A}^{\text{re}}$ denotes the class of hyper-rectangles.
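The distance $\Delta_n$ can be probed empirically. The sketch below (an illustration, not the papers' construction) Monte-Carlo-estimates the one-sided-rectangle slice of $\Delta_n$, comparing $\Pr\{\max_j T_{n,j} \le t\}$ against the independent-Gaussian value $\Phi(t)^d$:

```python
import numpy as np
from math import erf, sqrt

def Phi(t):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def delta_one_sided(n, d, reps=2000, seed=1):
    """Monte Carlo estimate of sup_t |P(max_j T_{n,j} <= t) - P(max_j Z_j <= t)|
    over a grid of thresholds, for IID standard-normal data with independent
    coordinates (so P(max_j Z_j <= t) = Phi(t)^d exactly)."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-1.0, 5.0, 61)
    maxima = np.empty(reps)
    for r in range(reps):
        X = rng.standard_normal((n, d))
        T = X.sum(axis=0) / np.sqrt((X ** 2).sum(axis=0))
        maxima[r] = T.max()
    emp = np.array([(maxima <= t).mean() for t in grid])
    gauss = np.array([Phi(t) ** d for t in grid])
    return float(np.abs(emp - gauss).max())

err = delta_one_sided(n=200, d=20)
print(err)
```

The estimate mixes the Berry–Esseen error with Monte Carlo noise of order $1/\sqrt{\text{reps}}$, so it illustrates the order of magnitude rather than the bound itself.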

2. Explicit High-dimensional Berry-Esseen Bounds

Recent results provide explicit Berry-Esseen bounds for the approximation of the law of TnT_n (or its coordinatewise maximum) by an appropriate Gaussian distribution (Das, 2020, Chang et al., 15 Jan 2025).

(a) Berry-Esseen Bound for Hyper-rectangles

Under the assumptions that each $X_{ij}$ is mean-zero, with finite $p$-th moment for some $p \in (2, 3]$, and the sequence is IID across $i$, an explicit nonasymptotic bound on $\Delta_n$ is obtained in terms of $n$, $d$, $p$, and the marginal moments ((Das, 2020), Theorem 6). When the relevant variance and moment quantities are bounded away from $0$ and $\infty$, the bound simplifies to an explicit rate in $n$ and $d$.

For the case $p = 3$ (finite third moment), the bound becomes

$$\Delta_n = O\!\left(\frac{(\log d)^{3/2}}{\sqrt{n}}\right)$$

which matches the classical rate $O(n^{-1/2})$ of the univariate Berry-Esseen theorem, up to a logarithmic factor in $d$.
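To see how mild the logarithmic dimension factor is, one can tabulate the leading-order rate $(\log d)^{3/2}/\sqrt{n}$ (absolute constants omitted) with $d$ growing linearly in $n$:

```python
import math

def be_rate(n, d):
    """Leading-order Berry-Esseen rate (log d)^{3/2} / sqrt(n) for p = 3,
    as stated in the text; absolute constants are omitted."""
    return math.log(d) ** 1.5 / math.sqrt(n)

# Even with d growing linearly in n, the bound vanishes at nearly n^{-1/2}.
for n in (10**3, 10**4, 10**5):
    print(n, round(be_rate(n, n), 4))
```

The bound thus remains useful in regimes where $d$ is comparable to, or even polynomially larger than, $n$.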

(b) Berry-Esseen Bound for Maxima (Coordinatewise Maximum)

A complementary approach provides explicit, nonasymptotic bounds for the Kolmogorov distance between $\|T_n\|_\infty$ and the maximum of a Gaussian vector (Chang et al., 15 Jan 2025). Assuming finite third absolute moments, the result takes the form

$$\inf_{Z} \sup_{t \in \mathbb{R}} \left| \Pr\{\|T_n\|_\infty \le t\} - \Pr\{\|Z\|_\infty \le t\} \right| \le \varepsilon(n, d),$$

where the infimum is taken over all mean-zero $d$-variate Gaussian vectors $Z$ whose covariance matrices are correlation matrices, and $\varepsilon(n, d)$ denotes the explicit error term derived in the paper. The bound vanishes as $n \to \infty$ provided $d$ does not grow too rapidly with $n$; the precise growth condition is given in the paper.

A moment-matching version of the bound controls the error for Gaussian approximations with the actual covariance structure of $T_n$.

3. Moment Assumptions and Dimension Growth

The fundamental trade-off in high-dimensional CLTs with self-normalized sums is between the required finite moment order $p$, the dimension $d$, and the sample size $n$. For the error bound to vanish, the dimension may grow at most subexponentially in a power of $n$ determined by $p$.

For $p = 3$ (finite third moment), the rate $(\log d)^{3/2}/\sqrt{n}$ vanishes exactly when $\log d = o(n^{1/3})$, so the regime $d = e^{o(n^{1/3})}$ is sufficient for vanishing error in the Berry-Esseen sense for uniform approximation over rectangles.

This is in contrast to non-self-normalized sums, whose error bounds typically depend on $d$ only polylogarithmically, so uniform CLT results hold under comparable subexponential dimension growth.
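The $p = 3$ dimension condition follows directly from the stated rate:

```latex
\frac{(\log d)^{3/2}}{\sqrt{n}} \longrightarrow 0
\iff (\log d)^{3/2} = o\!\left(n^{1/2}\right)
\iff \log d = o\!\left(n^{1/3}\right)
\iff d = e^{o(n^{1/3})}.
```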

4. Core Proof Strategies

The derivation of Berry-Esseen bounds for self-normalized sums in high dimensions fundamentally departs from traditional approaches for sums of independent vectors.

Key steps include:

  • Componentwise reduction: Use independence across coordinates (or factorization over rectangles) to reduce the multivariate problem to sums of one-dimensional bounds.
  • Refined Berry–Esseen for self-normalized sums: Deploy one-dimensional results of Jing–Shao–Wang (2003), Bentkus–Götze (1996), and Shao (2005) to control the error for self-normalized quantities.
  • Truncation and smoothing: Truncate coordinates to manage heavy tails and introduce a smooth surrogate for the maximum, enabling Taylor expansion and smoothing arguments.
  • Gaussian anti-concentration: The $\log d$ factors arise from multivariate Gaussian anti-concentration and the complexity of the $\ell_\infty$-norm.
  • Balancing approximation and smoothing bias: Choose smoothing and truncation parameters to optimize the interplay between stochastic remainders and deterministic bias, establishing the explicit rates in $n$ and $d$.
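The smoothing step can be illustrated with the standard log-sum-exp surrogate for the maximum, whose approximation error is at most $(\log d)/\beta$ for smoothing parameter $\beta$ (an illustrative choice; the papers' exact constructions differ):

```python
import numpy as np

def smooth_max(x, beta):
    """Log-sum-exp surrogate for max(x): infinitely differentiable, and
    0 <= smooth_max(x, beta) - max(x) <= log(d) / beta for d coordinates.
    This smoothness is what enables the Taylor-expansion step sketched above."""
    x = np.asarray(x, dtype=float)
    m = x.max()                                        # shift for numerical stability
    return m + np.log(np.exp(beta * (x - m)).sum()) / beta

rng = np.random.default_rng(3)
x = rng.standard_normal(1000)                          # d = 1000 coordinates
for beta in (1.0, 10.0, 100.0):
    gap = smooth_max(x, beta) - x.max()
    print(beta, gap, np.log(x.size) / beta)            # gap is within the log(d)/beta budget
```

Larger $\beta$ tightens the surrogate but inflates its derivatives, which is exactly the bias-versus-smoothness trade-off balanced in the final step above.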

A summary of the main proof ingredients and their quantitative contributions is provided in the following table:

Step Contribution to Bound Source
Truncation/linearization Controls the contribution of large values (Chang et al., 15 Jan 2025)
Smoothing/Taylor expansion Contributes the smoothing exponent in $n$ (Chang et al., 15 Jan 2025)
One-dimensional BE bound Determines the leading exponent in $n$ (Chang et al., 15 Jan 2025)
Anti-concentration Further $\log d$ growth in constants (Das, 2020)

5. Comparison with Non-self-normalized Sums and Optimality

For sums of independent vectors (without normalization), Berry-Esseen bounds of order $n^{-1/2}$ up to polylogarithmic factors in $d$ are attainable (Chernozhukov–Chetverikov–Kato, Kuchibhotla–Chakrabortty). Self-normalized statistics, however, exhibit fundamentally greater complexity: the normalization introduces strong dependence and nonlinearity, precluding direct application of previous high-dimensional CLTs (Chang et al., 15 Jan 2025).

Earlier high-dimensional Berry-Esseen rates for self-normalized sums held only under exponential-moment or independence-across-coordinates assumptions. The new results (Das, 2020, Chang et al., 15 Jan 2025) relax these requirements to polynomial moments and accommodate arbitrary covariance structures (for maxima), providing the first explicit bounds in these regimes.

The bounds are also shown to be optimal in the sense that, under finite third moments, the $n^{-1/2}$ dependence on the sample size cannot be improved in general ((Das, 2020), Proposition 4.1).

6. Refined Bounds, Applications, and Future Directions

Stronger moment assumptions (e.g., finite fourth moment) or refined Lindeberg interpolations may reduce the logarithmic exponents and improve the dependence on $n$, though at the expense of analytical and technical complexity (Chang et al., 15 Jan 2025).

The truncation-based approach for moment-matching bounds controls errors even when coordinate variances diverge, offering robustness to heavy-tailed data distributions. The coordinatewise formulation directly informs statistical inference via Student's $t$-statistic and the construction of simultaneous confidence intervals.
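As a concrete illustration of this use, the sketch below builds simultaneous confidence intervals for the coordinate means from coordinatewise $t$-type statistics, using the two-sided quantile of the maximum of $d$ independent standard Gaussians as the critical value (a sketch assuming independent coordinates; not the papers' exact procedure):

```python
import numpy as np
from statistics import NormalDist

def simultaneous_cis(X, alpha=0.05):
    """Simultaneous (1 - alpha) confidence intervals for the d coordinate means.

    Critical value z = Phi^{-1}((1 + (1 - alpha)^(1/d)) / 2) is the two-sided
    quantile of max_j |Z_j| for d independent standard Gaussians -- the object
    controlled by the max-type CLT above. Assumes independent coordinates."""
    n, d = X.shape
    mean = X.mean(axis=0)
    se = X.std(axis=0, ddof=1) / np.sqrt(n)            # per-coordinate standard error
    z = NormalDist().inv_cdf((1 + (1 - alpha) ** (1 / d)) / 2)
    return mean - z * se, mean + z * se

rng = np.random.default_rng(7)
X = rng.standard_normal((100, 10))                     # n = 100, d = 10, true means 0
lo, hi = simultaneous_cis(X)
print(np.round(lo[:3], 3), np.round(hi[:3], 3))
```

The resulting critical value exceeds the pointwise one (about 2.8 versus 1.96 here), which is the price of simultaneous coverage over all $d$ coordinates.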

Extensions to dependent observations (e.g., mixing processes) remain an open problem.

7. Summary Table of Main Results

Reference Assumptions Bound Dimension Growth Regime
(Das, 2020) IID, finite $p$-th moment, $p \in (2,3]$ $\Delta_n = O((\log d)^{3/2}/\sqrt{n})$ for $p = 3$ $d = e^{o(n^{1/3})}$
(Chang et al., 15 Jan 2025) IID, finite third absolute moments Kolmogorov-distance bound for $\|T_n\|_\infty$ subexponential growth of $d$ in $n$

These results bridge the gap between classical Berry–Esseen theory and modern high-dimensional inference for self-normalized sums, providing explicit error rates and clarifying the interplay between moment control, dimensionality, and normalization.
