Sub-Weibull Distributions
- Sub-Weibull distributions are a family of probability models defined by stretched-exponential tail decay, generalizing both sub-Gaussian and sub-exponential types.
- They enable sharp non-asymptotic moment inequalities and concentration bounds essential for analyzing high-dimensional, heavy-tailed data.
- They are applied in robust covariance estimation, regression, and graphical models, with techniques for empirical tail index estimation playing a key role.
Sub-Weibull distributions constitute a parametrized family of probability distributions whose tails decay at least as fast as a stretched exponential, that is, at least as fast as the tail of a Weibull law. They generalize both the sub-Gaussian and sub-exponential classes, interpolating between light-tailed and heavier, but still stretched-exponential, regimes. The framework of sub-Weibull random variables and vectors enables the development of non-asymptotic moment inequalities, sharp concentration results, and robust statistical tools for high-dimensional scenarios with heavy-tailed or non-sub-Gaussian data.
1. Formal Definitions and Characterizations
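Sub-Weibull distributions can be defined and characterized via several equivalent formulations encompassing tail behavior, moment growth, Orlicz norms, and moment generating functions. For a real random variable $X$ and tail index parameter $\theta > 0$, $X$ is called sub-Weibull of order $\theta$ (notation: $X \sim \mathrm{subW}(\theta)$) if any of the following conditions, which are equivalent up to constants, holds (Vladimirova et al., 2019, He, 19 Dec 2025, Zhang et al., 2021):
- Tail Bound: $\exists K_1 > 0$ such that $\mathbb{P}(|X| \ge x) \le 2\exp\big(-(x/K_1)^{1/\theta}\big)$ for all $x \ge 0$.
- Moment Growth: $\exists K_2 > 0$ s.t. $\|X\|_k := (\mathbb{E}|X|^k)^{1/k} \le K_2\, k^{\theta}$ for all $k \ge 1$.
- Orlicz Norm ($\psi_\alpha$): $\|X\|_{\psi_\alpha} := \inf\{C > 0 : \mathbb{E}\,\psi_\alpha(|X|/C) \le 1\} < \infty$,
with $\psi_\alpha(x) = e^{x^\alpha} - 1$ and $\alpha = 1/\theta$.
- MGF-Type (Orlicz) Condition: $\exists K_3 > 0$ so that $\mathbb{E}\exp\big((|X|/K_3)^{1/\theta}\big) \le 2$.
The sub-Weibull property is often indexed as $\mathrm{subW}(\theta)$, or, equivalently, as the class $\psi_\alpha$ with $\alpha = 1/\theta$ via the Orlicz norm.
A random vector $X \in \mathbb{R}^d$ is called sub-Weibull($\theta$) if for all $u \in \mathbb{R}^d$, $u^\top X$ is sub-Weibull of order $\theta$.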
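The Orlicz-norm characterization can be made concrete with a small numerical sketch. The function below is a hypothetical helper, not taken from the cited works: it replaces the expectation by a sample mean and finds, by bisection, the smallest $C$ with $\frac{1}{n}\sum_i \exp((|x_i|/C)^\alpha) \le 2$, assuming $\alpha = 1/\theta$.

```python
import numpy as np

def empirical_orlicz_norm(x, alpha, tol=1e-8):
    """Smallest C with mean(exp((|x|/C)**alpha)) <= 2, found by bisection."""
    a = np.abs(np.asarray(x, dtype=float))
    if not np.any(a > 0):
        return 0.0

    def excess(c):
        # Positive when c is too small to satisfy the Orlicz condition.
        with np.errstate(over="ignore"):
            return np.mean(np.exp((a / c) ** alpha)) - 2.0

    lo, hi = 0.0, a.max()
    while excess(hi) > 0:      # grow the bracket until the condition holds
        hi *= 2.0
    while hi - lo > tol * hi:  # the condition is monotone in c, so bisect
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return hi
```

For a sample whose absolute values all equal 1 and $\alpha = 2$, the condition reduces to $\exp(1/C^2) \le 2$, so the function should return approximately $1/\sqrt{\log 2}$. The empirical value is only a plug-in proxy: for heavy tails it can be unstable in small samples.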
2. Relation to Sub-Gaussian and Sub-Exponential Regimes
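The sub-Weibull hierarchy recovers classical tail behaviors for special values of $\theta$ ($\alpha = 1/\theta$), establishing a continuous spectrum of tail decay and moment growth:
| $\theta$ ($\alpha = 1/\theta$) | Recovery | Tail Behavior | Moment Growth |
|---|---|---|---|
| $\theta = 1/2$ ($\alpha = 2$) | sub-Gaussian | $\exp(-c\,x^{2})$ | $\|X\|_k \lesssim \sqrt{k}$ |
| $\theta = 1$ ($\alpha = 1$) | sub-Exponential | $\exp(-c\,x)$ | $\|X\|_k \lesssim k$ |
| $\theta > 1$ ($\alpha < 1$) | heavier-tailed sub-Weibull (stretched exp) | $\exp(-c\,x^{1/\theta})$ | $\|X\|_k \lesssim k^{\theta}$ |
The sub-Weibull family interpolates: as $\theta$ increases, distributions accommodate heavier tails. All moments nevertheless remain finite and grow polynomially in the moment order $k$, so tails remain lighter than power laws (Vladimirova et al., 2019, He, 19 Dec 2025, Zhang et al., 2021).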
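The moment-growth column can be illustrated with exact moments. The sketch below is an illustration rather than anything from the cited works: it uses the closed forms $\mathbb{E}|Z|^k = 2^{k/2}\Gamma((k+1)/2)/\sqrt{\pi}$ for $Z \sim N(0,1)$ and $\mathbb{E}X^k = \Gamma(1 + k\theta)$ for $X = E^\theta$ with $E \sim \mathrm{Exp}(1)$, then fits the slope of $\log\|X\|_k$ against $\log k$. The helper names and the range of $k$ are arbitrary choices.

```python
import math
import numpy as np

def log_moment_norm_weibull(k, theta):
    # X = E**theta with E ~ Exp(1): E[X**k] = Gamma(1 + k*theta).
    return math.lgamma(1.0 + k * theta) / k

def log_moment_norm_gaussian(k):
    # E|Z|**k = 2**(k/2) * Gamma((k+1)/2) / sqrt(pi) for Z ~ N(0, 1).
    return (0.5 * k * math.log(2.0) + math.lgamma(0.5 * (k + 1))
            - 0.5 * math.log(math.pi)) / k

def growth_exponent(log_norm, ks):
    # Least-squares slope of log ||X||_k against log k.
    y = [log_norm(k) for k in ks]
    slope, _ = np.polyfit(np.log(ks), y, 1)
    return float(slope)

ks = np.arange(50, 401)
exponents = {
    "gaussian": growth_exponent(log_moment_norm_gaussian, ks),
    "exponential": growth_exponent(lambda k: log_moment_norm_weibull(k, 1.0), ks),
    "weibull_theta_2": growth_exponent(lambda k: log_moment_norm_weibull(k, 2.0), ks),
}
```

The fitted slopes should land close to $1/2$, $1$, and $2$ respectively. They are not exact, because the polynomial growth rate is only reached asymptotically in $k$.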
3. Concentration Inequalities and Tail Bounds
Non-asymptotic concentration phenomena for sums of (possibly weighted) independent sub-Weibull random variables mirror classical Bernstein or Rosenthal bounds, but must also accommodate the heaviest allowed (stretched-exponential) deviations (Bong et al., 2023, Zhang et al., 2021):
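- Hoeffding/Bernstein-Type Tail: If $X_1, \dots, X_n$ are independent, mean-zero, sub-Weibull($\theta$) with common norm bound $\|X_i\|_{\psi_{1/\theta}} \le K$ and $\theta \ge 1$, then for a suitable constant $c = c(\theta) > 0$ and all $t \ge 0$, $$\mathbb{P}\Big(\Big|\frac{1}{n}\sum_{i=1}^n X_i\Big| \ge t\Big) \le 2\exp\Big(-c\,\min\Big\{\frac{n t^2}{K^2},\ \Big(\frac{n t}{K}\Big)^{1/\theta}\Big\}\Big).$$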
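- Two-Regime (GBO) Inequality: For the weighted sum $S = \sum_{i=1}^n a_i X_i$ of independent, mean-zero variables with $\|X_i\|_{\psi_{1/\theta}} \le K$ and $\theta \ge 1$, the probability satisfies for all $t \ge 0$ $$\mathbb{P}(|S| \ge t) \le 2\exp\Big(-c\,\min\Big\{\frac{t^2}{K^2\|a\|_2^2},\ \Big(\frac{t}{K\|a\|_\infty}\Big)^{1/\theta}\Big\}\Big),$$
where the quadratic regime dominates for small $t$ and the stretched exponential for large $t$ (Bong et al., 2023, Zhang et al., 2021).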
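- Moment and Tail Equivalences: The following equivalence holds, up to constants depending only on $\theta$: $$\mathbb{P}(|X| \ge t) \le 2\exp\big(-(t/K_1)^{1/\theta}\big)\ \ \forall t \ge 0 \iff \|X\|_k \le K_2\,k^{\theta}\ \ \forall k \ge 1 \iff \|X\|_{\psi_{1/\theta}} \le K_3.$$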
This allows the deployment of uniform high-probability tail bounds and error analysis for sums, projections, or quadratic forms involving heavy-tailed (but sub-Weibull) random variables.
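To see how the two regimes trade off, the sketch below evaluates a bound of the assumed form $2\exp(-c\min\{t^2/(K^2\|a\|_2^2),\ (t/(K\|a\|_\infty))^{1/\theta}\})$ for a weighted sum with weight vector $a$. This form is an assumption for illustration, the constant `c` is an unspecified placeholder (the theory only guarantees that some $c > 0$ depending on $\theta$ exists), and the function names are hypothetical.

```python
import numpy as np

def two_regime_exponent(t, a, K, theta, c=1.0):
    """c * min(t^2 / (K^2 ||a||_2^2), (t / (K ||a||_inf))^(1/theta))."""
    a = np.asarray(a, dtype=float)
    gaussian_part = t**2 / (K**2 * np.sum(a**2))
    weibull_part = (t / (K * np.max(np.abs(a)))) ** (1.0 / theta)
    return c * min(gaussian_part, weibull_part)

def tail_bound(t, a, K, theta, c=1.0):
    # Upper bound 2 * exp(-exponent), capped at 1 since it bounds a probability.
    return min(1.0, 2.0 * np.exp(-two_regime_exponent(t, a, K, theta, c)))

def crossover(a, K, theta):
    """Deviation t* at which the two regimes coincide (requires theta > 1/2)."""
    a = np.asarray(a, dtype=float)
    l2sq, linf = np.sum(a**2), np.max(np.abs(a))
    # Solve t^2 / (K^2 l2sq) = (t / (K linf))^(1/theta) for t > 0.
    return K * (l2sq / linf ** (1.0 / theta)) ** (theta / (2.0 * theta - 1.0))
```

For $n = 100$ unit weights, $K = 1$ and $\theta = 2$, the crossover sits at $t^* = 100^{2/3}$: below it the quadratic term is the smaller one, above it the stretched-exponential term takes over.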
4. Estimation of the Tail Index and Empirical Techniques
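For practical data analysis, estimation of the tail parameter $\theta$ (or $\alpha = 1/\theta$) is essential. For a random variable with a Weibull-type tail, the $p$-quantile takes the form $q(p) \approx K\,(-\log(1-p))^{\theta}$, yielding a log-quantile plot linear in $\log(-\log(1-p))$ with slope $\theta$ (Vladimirova et al., 2019):
- Linear Regression Estimator: Order data points as $X_{(1)} \le \dots \le X_{(n)}$, select the $m$ largest, and regress $\log X_{(n-i+1)}$ on $\log\log\frac{n+1}{i}$ for $i = 1, \dots, m$.
Then, estimate $\theta$ as the slope of this linear regression.
- Moment Estimators: Compute empirical norms such as $$\widehat{\|X\|}_{\psi_\alpha} = \inf\Big\{C > 0 : \frac{1}{n}\sum_{i=1}^n \exp\big((|X_i|/C)^{\alpha}\big) \le 2\Big\}$$
as proxies for Orlicz- or moment-based sub-Weibull norms (Zhang et al., 2021).
- Cross-validation for $\theta$: If $\theta$ is unknown, tuning it via cross-validation enables empirical model selection in subsequent applications.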
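The log-quantile regression estimator can be sketched as follows. This is a minimal illustration, not the cited authors' code. It assumes the plotting positions $\log\log((n+1)/i)$ for the $i$-th largest observation and a user-chosen number `m` of upper order statistics. The function name and the simulated example are hypothetical.

```python
import numpy as np

def estimate_tail_index(x, m):
    """Slope of log X_(n-i+1) on log log((n+1)/i) over the m largest |x|."""
    a = np.sort(np.abs(np.asarray(x, dtype=float)))
    n = a.size
    if not 2 <= m <= n:
        raise ValueError("m must satisfy 2 <= m <= len(x)")
    i = np.arange(1, m + 1)
    top = a[n - i]                      # X_(n), X_(n-1), ..., X_(n-m+1)
    if top[-1] <= 0:
        raise ValueError("the m largest absolute values must be positive")
    u = np.log(np.log((n + 1) / i))     # theoretical log-log quantile levels
    slope, _ = np.polyfit(u, np.log(top), 1)
    return float(slope)

rng = np.random.default_rng(0)
theta_true = 2.0
# rng.weibull(k) has survival function exp(-x**k), so k = 1/theta gives subW(theta).
sample = rng.weibull(1.0 / theta_true, size=200_000)
theta_hat = estimate_tail_index(sample, m=2_000)
```

On the simulated Weibull sample the estimate should fall near the true value $\theta = 2$. On real data the result depends on `m`: too few order statistics give a noisy slope, too many pull in the bulk of the distribution where the Weibull-type approximation may not hold.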
5. Closure Properties and Algebraic Operations
The sub-Weibull classes enjoy several algebraic closure and order properties (Vladimirova et al., 2019, Zhang et al., 2021):
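- Inclusion: For $0 < \theta_1 \le \theta_2$, $\mathrm{subW}(\theta_1) \subseteq \mathrm{subW}(\theta_2)$.
Heavier-tailed classes contain the lighter-tailed ones, and the inclusion is proper when $\theta_1 < \theta_2$.
- Sum/Addition: If $X \sim \mathrm{subW}(\theta_1)$ and $Y \sim \mathrm{subW}(\theta_2)$, then $X + Y \sim \mathrm{subW}(\max(\theta_1, \theta_2))$.
- Product: For $X \sim \mathrm{subW}(\theta_1)$, $Y \sim \mathrm{subW}(\theta_2)$, $XY \sim \mathrm{subW}(\theta_1 + \theta_2)$.
- Powers: If $X \sim \mathrm{subW}(\theta)$, then $|X|^r \sim \mathrm{subW}(r\theta)$ for $r > 0$, and $\big\||X|^r\big\|_{\psi_{\alpha/r}} = \|X\|_{\psi_\alpha}^{r}$ with $\alpha = 1/\theta$ (Zhang et al., 2021).
- Optimal Tail Index: If $\|X\|_k \asymp k^{\theta}$ as $k \to \infty$, then $\theta$ is the minimal parameter for which $X \sim \mathrm{subW}(\theta)$.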
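The sum and product rules, namely that $X + Y$ has index $\max(\theta_1, \theta_2)$ and $XY$ has index $\theta_1 + \theta_2$, can be checked directly from the moment-growth characterization. The short derivation below is a sketch under the assumption that $\|X\|_k \le K_1 k^{\theta_1}$ and $\|Y\|_k \le K_2 k^{\theta_2}$ for all $k \ge 1$. The constants $K_1, K_2$ are illustrative, and no independence between $X$ and $Y$ is assumed.

By Minkowski's inequality, and since $k^{\theta_i} \le k^{\max(\theta_1, \theta_2)}$ for $k \ge 1$,

$$\|X + Y\|_k \le \|X\|_k + \|Y\|_k \le K_1 k^{\theta_1} + K_2 k^{\theta_2} \le (K_1 + K_2)\, k^{\max(\theta_1, \theta_2)}.$$

By the Cauchy-Schwarz inequality applied to $\mathbb{E}|XY|^k$,

$$\|XY\|_k \le \|X\|_{2k}\, \|Y\|_{2k} \le K_1 (2k)^{\theta_1}\, K_2 (2k)^{\theta_2} = 2^{\theta_1 + \theta_2} K_1 K_2\, k^{\theta_1 + \theta_2}.$$

For instance, the product of two sub-Gaussian variables ($\theta_1 = \theta_2 = 1/2$) has moments growing at most linearly in $k$, so it is sub-exponential ($\theta = 1$).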
6. Statistical Applications and Examples
The sub-Weibull formalism underpins robust statistical inference for high-dimensional and heavy-tailed data (Zhang et al., 2021, Bong et al., 2023, He, 19 Dec 2025):
- Covariance Estimation: For i.i.d. $X_1, \dots, X_n \in \mathbb{R}^d$ with sub-Weibull($\theta$) tails, norm-/spectrally-truncated estimators achieve an estimation error that preserves the sub-Gaussian rate even under significantly heavier marginal distributions (He, 19 Dec 2025).
- Negative Binomial Regression: For sub-Weibull covariate vectors, the $\ell_2$-error of maximum-likelihood or Z-estimator coefficients admits sharp non-asymptotic risk bounds
expressed in terms of $p$ and $n$, where $p$ is the dimension and $n$ the sample size (Zhang et al., 2021).
- Random Matrix Theory: For isotropic matrices with sub-Weibull($\theta$) rows, Bai–Yin-type spectral norm bounds and eigenvalue location properties extend, with deviations controlled by two-regime functions (Zhang et al., 2021).
- Graphical Models: Estimation of multiple precision/covariance matrices from high-dimensional data with sub-Weibull marginals attains sample complexity improvements: the required sample size $n$ scales nearly linearly in the number of models and the dimension, in contrast to the quadratic dependence for truly heavy-tailed errors (Bong et al., 2023).
- Bayesian Neural Networks: Deep units composed of Gaussian-weighted layers induce sub-Weibull marginals with tail index directly determined by network depth, with empirical confirmation by slope-of-log-quantile regression (Vladimirova et al., 2019).
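As an illustration of what a norm-truncated covariance estimator can look like, the sketch below shrinks each observation onto the Euclidean ball of radius `tau` before averaging outer products. This is one common form of norm truncation and is an assumption here: the estimator in the cited work may differ. The data are assumed mean-zero, and the threshold `tau` is a hypothetical tuning parameter whose choice is not specified.

```python
import numpy as np

def norm_truncated_covariance(X, tau):
    """Second-moment matrix of rows shrunk to Euclidean norm at most tau."""
    if not (np.isfinite(tau) and tau > 0):
        raise ValueError("tau must be a positive finite number")
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1)
    # Scale factor min(1, tau / ||x_i||); rows inside the ball are unchanged.
    scale = tau / np.maximum(norms, tau)
    Xt = X * scale[:, None]
    return Xt.T @ Xt / X.shape[0]

rng = np.random.default_rng(1)
# Symmetrized Weibull entries with shape 1/2, i.e. sub-Weibull with theta = 2.
signs = rng.choice([-1.0, 1.0], size=(500, 3))
data = signs * rng.weibull(0.5, size=(500, 3))
sigma_hat = norm_truncated_covariance(data, tau=5.0)
```

Because every truncated row has norm at most `tau`, the trace of the result never exceeds `tau**2`. This caps the influence of any single extreme observation, at the cost of a bias that shrinks as `tau` grows.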
7. Extensions, Limitations, and Ongoing Research
Research continues on optimal constants in tail and moment inequalities and on concentration for dependent sub-Weibull processes (martingales, mixing arrays). Data-driven selection of $\theta$ and its impact on the robustness of estimators remain partly open. Extensions to handle non-i.i.d., non-isotropic, or composite tails (e.g., COM-negative binomial) are under active investigation (Zhang et al., 2021).
The sub-Weibull class forms a natural one-parameter generalization for expressing and controlling stretched-exponential tail behavior in empirical processes, reinforcing its importance in the theoretical and applied statistics literature (Vladimirova et al., 2019, Bong et al., 2023, He, 19 Dec 2025, Zhang et al., 2021).