Generalized Distance Covariance Overview
- Generalized distance covariance is a framework that measures dependence between random elements, enabling tests of independence under minimal moment conditions.
- It employs negative definite functions derived from symmetric Lévy measures to generalize classical distance covariance while accommodating various distance structures.
- Its versatile methodology unifies distance and Gaussian covariance concepts, offering consistent and efficient estimation for multivariate and functional data analysis.
Generalized distance covariance is a framework for measuring dependence between random elements—typically random vectors, but extendable to broad classes of metric or semimetric spaces—where traditional Euclidean distances and α-stable weights are replaced by negative definite functions derived from symmetric Lévy measures. This generalization preserves the core properties of distance covariance, such as characterizing independence and enabling efficient empirical estimation, while accommodating less restrictive moment conditions and a wide variety of alternative distance structures in finite or infinite dimensions. The most significant outcomes include broadened applicability, natural integration of process-based and metric-based dependence, explicit connections to Gaussian (Brownian) covariance, and foundational implications for distance multivariance and tests of mutual independence (Böttcher et al., 2017).
1. Formal Definition and General Framework
Let $X \in \mathbb{R}^m$ and $Y \in \mathbb{R}^n$ be random vectors with joint characteristic function $f_{(X,Y)}$ and marginals $f_X$, $f_Y$. The generalized distance covariance formally arises by selecting symmetric Lévy measures $\rho$ (on $\mathbb{R}^m$) and $\mu$ (on $\mathbb{R}^n$), forming the product measure $\rho \otimes \mu$ on $\mathbb{R}^m \times \mathbb{R}^n$, and constructing the associated real-valued continuous negative definite functions ("cndfs")

$$\psi(s) = \int_{\mathbb{R}^m} \big(1 - \cos\langle s, x\rangle\big)\, \rho(dx), \qquad \phi(t) = \int_{\mathbb{R}^n} \big(1 - \cos\langle t, y\rangle\big)\, \mu(dy).$$

The generalized distance covariance is then

$$V^2(X, Y) = \int_{\mathbb{R}^m \times \mathbb{R}^n} \big| f_{(X,Y)}(s, t) - f_X(s)\, f_Y(t) \big|^2 \, \rho(ds)\, \mu(dt),$$

or, equivalently, for i.i.d. copies $(X_1, Y_1), (X_2, Y_2), (X_3, Y_3)$ of $(X, Y)$,

$$V^2(X, Y) = \mathbb{E}\big[\psi(X_1 - X_2)\,\phi(Y_1 - Y_2)\big] + \mathbb{E}\big[\psi(X_1 - X_2)\big]\,\mathbb{E}\big[\phi(Y_1 - Y_2)\big] - 2\,\mathbb{E}\big[\psi(X_1 - X_2)\,\phi(Y_1 - Y_3)\big].$$
The classical Székely–Rizzo–Bakirov distance covariance is recovered as a special case by taking $\rho$ and $\mu$ to be (possibly isotropic) symmetric $\alpha$-stable Lévy measures, in which case $\psi(s) = c_{m,\alpha} |s|^{\alpha}$ and $\phi(t) = c_{n,\alpha} |t|^{\alpha}$ for $\alpha \in (0, 2)$ (Böttcher et al., 2017).
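To make the stable special case concrete, the following sketch (an illustration, not code from the cited paper) numerically checks in one dimension that the cndf built from the symmetric $\alpha$-stable Lévy measure $|x|^{-1-\alpha}\,dx$ is proportional to $|s|^{\alpha}$; for $\alpha = 1$ the closed form is $\psi(s) = \pi |s|$. The helper name `psi_stable` is illustrative.

```python
import numpy as np
from scipy.integrate import quad

def psi_stable(s, alpha=1.0):
    """cndf from the symmetric alpha-stable Levy measure |x|^(-1-alpha) dx on R,
    computed as 2 * integral_0^inf (1 - cos(s x)) x^(-1-alpha) dx."""
    # Split at 1: the integrand is smooth on (0, 1]; on [1, inf) handle the
    # oscillatory cosine part with a Fourier (weight='cos') quadrature.
    near = quad(lambda x: (1 - np.cos(s * x)) / x ** (1 + alpha), 0, 1)[0]
    far = quad(lambda x: x ** -(1 + alpha), 1, np.inf)[0] \
        - quad(lambda x: x ** -(1 + alpha), 1, np.inf, weight="cos", wvar=s)[0]
    return 2 * (near + far)

print(psi_stable(1.0), np.pi)               # approximately equal (alpha = 1)
print(psi_stable(2.0) / psi_stable(1.0))    # scaling ratio 2^alpha = 2
```

The same routine with other $\alpha \in (0, 2)$ exhibits the $|s|^{\alpha}$ scaling of the general stable case.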
2. Theoretical Properties and Characterization of Independence
By construction, $V^2(X, Y) \geq 0$ and

$$V^2(X, Y) = 0 \iff X \text{ and } Y \text{ are independent},$$

i.e., generalized distance covariance is zero if and only if $X$ and $Y$ are independent. The moment conditions required for finiteness of $V^2(X, Y)$ relax those of the standard framework: it suffices that $\mathbb{E}[\psi(X)] < \infty$ and $\mathbb{E}[\phi(Y)] < \infty$, compared to the original requirement that $\mathbb{E}|X|^{\alpha} < \infty$ and $\mathbb{E}|Y|^{\alpha} < \infty$ for the $\alpha$-stable weights, $\alpha \in (0, 2)$. In particular, bounded cndfs (such as those arising from finite Lévy measures) impose no moment condition at all, so substantially weaker moment assumptions are admissible in the Lévy-based generalization (Böttcher et al., 2017).
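The weaker moment requirement can be illustrated with a short Monte Carlo sketch (illustrative, not from the paper): for a bounded cndf such as the Cauchy-type $\psi(s) = 1 - e^{-|s|}$, the expectation $\mathbb{E}[\psi(X)]$ is finite for every distribution, including heavy-tailed laws for which the classical $\alpha$-moment diverges.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_cauchy(100_000)      # Cauchy samples: E|X| (alpha = 1 moment) is infinite

psi_bounded = lambda u: 1.0 - np.exp(-np.abs(u))   # bounded Cauchy-type cndf

print(psi_bounded(x).mean())   # finite and stable: always lies in (0, 1)
print(np.abs(x).mean())        # sample alpha = 1 "moment": does not stabilize as n grows
```

The first average converges as the sample grows; the second drifts, reflecting the nonexistent first moment of the Cauchy law.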
3. Alternative Distance Structures and Negative Definite Functions
By varying the underlying Lévy measures, one synthesizes alternative cndfs and “metric” structures compatible with generalized distance covariance. Examples include:
- Minkowski $\alpha$-distance for $0 < \alpha \leq 2$:

$$\psi(s) = \sum_{j=1}^{m} |s_j|^{\alpha},$$

arising from Lévy processes with independent one-dimensional $\alpha$-stable coordinates.
- Cauchy-type distance: $\psi(s) = 1 - e^{-|s|}$, corresponding to isotropic Cauchy (probability) measures $\rho$.
- Truncated and compound Poisson distances: $\psi(s) = \int \big(1 - \cos\langle s, x\rangle\big)\, \rho(dx)$ for finite measures $\rho$, yielding bounded cndfs.
This flexibility enables practitioners to tailor the dependence measure to the specific probabilistic structure or application requirements, and was motivated explicitly by the aim of handling more general types of data and stochastic processes (Böttcher et al., 2017).
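The three structures above can be sketched as plain functions of a difference vector; this is an illustration with hypothetical helper names, not an API from any of the cited works. Each candidate cndf vanishes at the origin, is symmetric, and is nonnegative.

```python
import numpy as np

def minkowski_alpha(s, alpha=1.5):
    """Coordinatewise alpha-stable cndf: psi(s) = sum_j |s_j|^alpha, 0 < alpha <= 2."""
    return np.sum(np.abs(s) ** alpha, axis=-1)

def cauchy_type(s):
    """Bounded cndf psi(s) = 1 - exp(-|s|), from an isotropic Cauchy (finite) Levy measure."""
    return 1.0 - np.exp(-np.linalg.norm(s, axis=-1))

def compound_poisson(s, atoms):
    """cndf of a compound Poisson process: average of (1 - cos<s, x_i>) over the
    jump atoms x_1, ..., x_k (rows of `atoms`) of a finite, discrete Levy measure."""
    return np.mean(1.0 - np.cos(s @ atoms.T), axis=-1)

# basic sanity checks on an example argument
z = np.zeros(3)
s = np.array([0.7, -1.2, 0.4])
atoms = np.array([[1.0, 0.0, 0.0], [0.0, -2.0, 1.0]])
print(minkowski_alpha(s), cauchy_type(s), compound_poisson(s, atoms))
```

Note that `cauchy_type` and `compound_poisson` are bounded, which is exactly what removes the moment conditions discussed in Section 2.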
4. Computation, Empirical Estimation, and Consistency
Empirical generalized distance covariance is a V-statistic based on a four-fold kernel whose (unsymmetrized) form mirrors the expectation representation:

$$h\big((x_1, y_1), \dots, (x_4, y_4)\big) = \psi(x_1 - x_2)\,\phi(y_1 - y_2) + \psi(x_1 - x_2)\,\phi(y_3 - y_4) - 2\,\psi(x_1 - x_2)\,\phi(y_1 - y_3).$$

The construction proceeds via distance matrices $A_{kl} = \psi(x_k - x_l)$ and $B_{kl} = \phi(y_k - y_l)$, which are double-centered to obtain $\widehat{A} = C A C$ and $\widehat{B} = C B C$ (using the centering matrix $C = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top}$). The estimator

$$\widehat{V}^2(X, Y) = \frac{1}{n^2} \sum_{k,l=1}^{n} \widehat{A}_{kl}\, \widehat{B}_{kl}$$

is strongly consistent under minimal moment assumptions. When $X$ and $Y$ are independent, a central limit theorem holds for the scaled statistic $n\,\widehat{V}^2(X, Y)$, whose limit is the squared $L^2(\rho \otimes \mu)$-norm of a centered Gaussian field corresponding to the chosen Lévy measures (Böttcher et al., 2017).
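The double-centering construction above can be sketched in a few lines of NumPy; this is a minimal illustration assuming cndfs that act on arrays of difference vectors (the name `gdcov_sq` is illustrative, not from any cited implementation).

```python
import numpy as np

def gdcov_sq(x, y, psi, phi):
    """Empirical generalized distance covariance (V-statistic form).
    x: (n, m) sample, y: (n, q) sample; psi, phi: cndfs applied to difference vectors."""
    n = x.shape[0]
    A = psi(x[:, None, :] - x[None, :, :])   # n x n matrix of psi-"distances"
    B = phi(y[:, None, :] - y[None, :, :])
    C = np.eye(n) - np.ones((n, n)) / n      # centering matrix C = I - (1/n) 1 1^T
    return ((C @ A @ C) * (C @ B @ C)).sum() / n**2

# classical alpha = 1 special case: psi = phi = Euclidean norm
euclid = lambda d: np.linalg.norm(d, axis=-1)
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
v_dep = gdcov_sq(x, x, euclid, euclid)                          # perfect dependence
v_ind = gdcov_sq(x, rng.normal(size=(200, 2)), euclid, euclid)  # independent sample
print(v_dep, v_ind)   # v_dep is markedly larger than v_ind
```

Swapping `euclid` for any other cndf (e.g., a bounded Cauchy-type one) changes only the distance matrices, not the centering or aggregation steps.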
5. Relationship to Gaussian Covariance and Brownian Covariance
Generalized distance covariance unifies distance-based dependence with covariance structures defined via Gaussian fields. For a mean-zero Gaussian field $G = (G(x))_{x \in \mathbb{R}^m}$ with covariance $\mathbb{E}[G(x)G(x')] = \psi(x) + \psi(x') - \psi(x - x')$, define the centered variable $X_G = G(X) - \mathbb{E}[G(X) \mid G]$, and similarly $Y_{G'}$ for an independent Gaussian field $G'$ on $\mathbb{R}^n$ built from $\phi$. Then, for an i.i.d. copy $(X', Y')$ of $(X, Y)$,

$$V^2(X, Y) = \mathbb{E}\big[X_G\, X'_G\, Y_{G'}\, Y'_{G'}\big],$$

establishing exact agreement between generalized distance covariance and "Gaussian covariance," with Brownian covariance corresponding to the special case $\psi(x) = |x|$, i.e., $G$ a Brownian field (Böttcher et al., 2017, Székely et al., 2010).
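A one-line computation (standard Gaussian-field algebra, not specific to the cited paper) shows why such a field encodes the cndf as a squared increment distance: using $\psi(0) = 0$, the stated covariance gives

$$\mathbb{E}\big[(G(x) - G(x'))^2\big] = 2\psi(x) + 2\psi(x') - 2\big(\psi(x) + \psi(x') - \psi(x - x')\big) = 2\,\psi(x - x'),$$

so increments of $G$ reproduce the cndf-induced distance, exactly as Brownian motion reproduces $|x - x'|$ in the Brownian covariance case.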
6. Extension to Multivariate, Functional, and General Metric Space Settings
Generalized distance covariance serves as the foundation for "distance multivariance," a measure of mutual dependence among $n$ random vectors, and has been characterized in a fully abstract metric space framework. Lyons (Lyons, 2011) demonstrated that the key requirement for the independence characterization is that the metrics have strong negative type, encompassing all separable Hilbert spaces (and hence infinite-dimensional settings, functional data, and stochastic processes (Matsui et al., 2017, Janson, 2019)). The Hilbert space embedding connects generalized distance covariance to Hilbert–Schmidt operators, revealing that for the squared-distance semimetric $d(x, x') = |x - x'|^2$, the measure specializes to four times the squared Hilbert–Schmidt norm of the cross-covariance operator (Janson, 2019).
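At the sample level the squared-distance specialization is an exact algebraic identity, which the following sketch (illustrative) verifies: double-centering a squared-distance matrix kills the additive row/column terms and leaves (twice) the centered Gram matrix, so the V-statistic collapses to a Frobenius norm of the empirical cross-covariance.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, q = 300, 3, 2
x = rng.normal(size=(n, m))
y = x[:, :q] + 0.5 * rng.normal(size=(n, q))     # dependent on x

sq = lambda d: (d ** 2).sum(axis=-1)             # squared-distance semimetric
A = sq(x[:, None, :] - x[None, :, :])
B = sq(y[:, None, :] - y[None, :, :])
C = np.eye(n) - np.ones((n, n)) / n
v = ((C @ A @ C) * (C @ B @ C)).sum() / n**2     # sample dcov^2 under |.|^2

xc, yc = C @ x, C @ y                            # column-centered data matrices
cross_cov = xc.T @ yc / n                        # empirical cross-covariance operator
print(v, 4 * np.linalg.norm(cross_cov, "fro") ** 2)   # identical up to rounding
```

The identity holds for any data set, not just this random example, because $C A C = -2\,\tilde{X}\tilde{X}^{\top}$ when $A$ collects squared Euclidean distances of the rows of $X$.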
7. Implications, Practical Use, and Limitations
Generalized distance covariance delivers a unified and versatile dependence measure with the following high-level features:
- Precisely characterizes independence for a broad class of random elements under minimal moments.
- Admits versions tailored to alternative distances, general kernels, or semimetrics (including kernel distance covariance and its Hilbert–Schmidt generalizations (Han et al., 2021)).
- Enables efficient, consistent estimation and permutation-based inference.
- Extends to mutual independence testing and independent component analysis (Jin et al., 2017, Matteson et al., 2013).
- Proves critical in high-dimensional and functional data regimes, and demonstrates power universality and minimax optimality among tests relying on pairwise distances or kernels (Han et al., 2021).

A key limitation is the dependence of the definitions, and of the equivalence of the several representations, on moment conditions: the integral expressions may be infinite or undefined if the tails of $X$ and $Y$ are too heavy and the required moments fail to exist (Janson, 2019). For practical implementation, practitioners must verify that the chosen Lévy-based negative definite functions yield finite expectations.
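The permutation-based inference mentioned above can be sketched as follows (an illustration with the classical $\alpha = 1$ cndf; the function names are hypothetical): permuting one sample breaks the pairing, so the permuted statistics approximate the null distribution under independence.

```python
import numpy as np

def gdcov_sq(x, y, psi):
    """Sample generalized distance covariance with a common cndf psi."""
    n = x.shape[0]
    A = psi(x[:, None, :] - x[None, :, :])
    B = psi(y[:, None, :] - y[None, :, :])
    C = np.eye(n) - np.ones((n, n)) / n
    return ((C @ A @ C) * (C @ B @ C)).sum() / n**2

rng = np.random.default_rng(2)
n = 100
x = rng.normal(size=(n, 1))
y = x ** 2 + 0.1 * rng.normal(size=(n, 1))   # nonlinear dependence, zero correlation

euclid = lambda d: np.linalg.norm(d, axis=-1)
obs = gdcov_sq(x, y, euclid)
null = [gdcov_sq(x, y[rng.permutation(n)], euclid) for _ in range(199)]
p_value = (1 + sum(v >= obs for v in null)) / (1 + len(null))
print(p_value)   # small p-value: independence is rejected despite zero correlation
```

Because $y = x^2$ is uncorrelated with $x$, a correlation test would miss this dependence; the distance-covariance permutation test detects it.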
References:
- "Detecting independence of random vectors: generalized distance covariance and Gaussian covariance" (Böttcher et al., 2017)
- "On distance covariance in metric and Hilbert spaces" (Janson, 2019)
- "Distance covariance in metric spaces" (Lyons, 2011)
- "Brownian distance covariance" (Székely et al., 2010)
- "Generalized kernel distance covariance in high dimensions: non-null CLTs and power universality" (Han et al., 2021)
- "Generalizing Distance Covariance to Measure and Test Multivariate Mutual Dependence" (Jin et al., 2017)