
Azadkia-Chatterjee Coefficient

Updated 9 December 2025
  • The Azadkia–Chatterjee coefficient is a nonparametric, rank-based measure defined via conditional probability variance that ranges from 0 (independence) to 1 (functional dependence).
  • It employs graph-based estimators using nearest neighbor ranks to achieve strong consistency, parametric rates, and established asymptotic normality in both marginal and conditional settings.
  • Its extensions include multivariate responses and scale-invariant variants, making it central for independence testing, graphical models, and model-free variable selection.

The Azadkia–Chatterjee coefficient is a nonparametric, rank-based measure of directed dependence between a vector-valued predictor and a univariate or multivariate response, defined at the population level via the variance of conditional probabilities and estimated using nearest-neighbor graphs. It features an interpretable scale—zero under independence and one under functional dependence—and a graph-based empirical estimator that admits parametric rates, strong consistency, bandwidth-free implementation, and central limit theorems in both marginal and conditional versions. Multivariate extensions, scale-invariant variants, and connections to broader classes of geometric graph and kernel-based dependence measures position the coefficient as a central object for independence testing, graphical models, and model-free variable selection.

1. Definition and Fundamental Properties

Let $(X,Y)$ be jointly distributed random elements with $X\in\mathbb{R}^d$ and $Y$ either univariate or a vector in $\mathbb{R}^q$. The Azadkia–Chatterjee (AC) coefficient of $Y$ on $X$ is defined by

$$\xi(Y,X) \;=\; \frac{\int_{\mathbb{R}} \operatorname{Var}\big(P(Y\ge y\mid X)\big)\, dP^Y(y)}{\int_{\mathbb{R}} \operatorname{Var}\big(\mathbf{1}\{Y\ge y\}\big)\, dP^Y(y)} \;\in\; [0,1].$$

An equivalent form holds for continuous $F_Y$, since the denominator is then the constant $1/6$: $\xi(Y,X) = 6\int_{\mathbb{R}} \operatorname{Var}\big(P(Y\ge y\mid X)\big)\, dP^Y(y)$, or equivalently $\xi(Y,X) = 6\int_{\mathbb{R}} \mathbb{E}\big[P(Y\ge y\mid X)^2\big]\, dP^Y(y) - 2$. Characterizing properties:

  • $\xi(Y,X)=0$ if and only if $Y$ and $X$ are independent.
  • $\xi(Y,X)=1$ if and only if $Y$ is almost surely a measurable function of $X$.
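The equivalent form for continuous $F_Y$ can be verified directly, using that $F_Y(Y)$ is uniform on $[0,1]$:

```latex
% Denominator: for continuous F_Y, Var(1{Y >= y}) = F_Y(y)(1 - F_Y(y)), so
\int_{\mathbb{R}} \operatorname{Var}\big(\mathbf{1}\{Y\ge y\}\big)\, dP^Y(y)
  = \int_0^1 u(1-u)\, du = \tfrac12 - \tfrac13 = \tfrac16 .
% Numerator: expand the variance with E[P(Y >= y | X)] = 1 - F_Y(y):
\operatorname{Var}\big(P(Y\ge y\mid X)\big)
  = \mathbb{E}\big[P(Y\ge y\mid X)^2\big] - \big(1-F_Y(y)\big)^2,
\qquad \int_0^1 (1-u)^2\, du = \tfrac13 .
% Combining the two displays:
\xi(Y,X)
  = 6\int_{\mathbb{R}} \operatorname{Var}\big(P(Y\ge y\mid X)\big)\, dP^Y(y)
  = 6\int_{\mathbb{R}} \mathbb{E}\big[P(Y\ge y\mid X)^2\big]\, dP^Y(y) - 2 .
```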

The definition is directional and scale-invariant: strictly increasing transformations of $Y$ and bijections of $X$ preserve $\xi(Y,X)$ (Ansari et al., 14 Mar 2025, Ansari et al., 2022). For conditional dependence, consider $(X,Y,Z)$ jointly distributed and set

$$\xi(Y,Z\mid X) \;=\; \frac{\int_{\mathbb{R}} \mathbb{E}\big[\operatorname{Var}\big(P(Y\ge y\mid Z,X)\,\big|\,X\big)\big]\, dP^Y(y)}{\int_{\mathbb{R}} \mathbb{E}\big[\operatorname{Var}\big(\mathbf{1}\{Y\ge y\}\,\big|\,X\big)\big]\, dP^Y(y)}.$$

Then $\xi(Y,Z\mid X)=0$ if and only if $Y$ and $Z$ are conditionally independent given $X$, and $\xi(Y,Z\mid X)=1$ if and only if $Y$ is almost surely a function of $Z$ given $X$ (Shi et al., 2021, Huang et al., 2020).

2. Graph-Based and Rank-Based Estimator Construction

For i.i.d. data $(X_1,Y_1),\dots,(X_n,Y_n)$, construct the following graph-based estimator:

  • Compute the univariate ranks $R_i = \#\{j : Y_j \le Y_i\}$ and $L_i = \#\{j : Y_j \ge Y_i\}$.
  • Let $N(i)$ denote the index of the nearest neighbor of $X_i$ among $\{X_j : j\ne i\}$, with ties broken at random.
  • The empirical AC coefficient is

$$\xi_n(Y,X) \;=\; \frac{\sum_{i=1}^n \big(n\min\{R_i, R_{N(i)}\} - L_i^2\big)}{\sum_{i=1}^n L_i\,(n - L_i)}.$$

This estimator generalizes Chatterjee's original proposal to multivariate covariates $X\in\mathbb{R}^d$ by utilizing nearest-neighbor graphs in $\mathbb{R}^d$ (Lin et al., 2022).
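As a concrete illustration, the construction above admits a short brute-force implementation; the following is a minimal sketch (the function name `ac_coefficient` is ours, not from the cited papers), assuming a tie-free continuous response:

```python
import numpy as np

def ac_coefficient(X, Y):
    """Brute-force sketch of the empirical Azadkia-Chatterjee coefficient.

    X: (n, d) covariates; Y: (n,) response with a continuous (tie-free) law.
    Uses R_i = #{j: Y_j <= Y_i}, L_i = #{j: Y_j >= Y_i}, and the Euclidean
    1-nearest neighbor N(i) of X_i among the other sample points.
    """
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    X = np.asarray(X, dtype=float).reshape(n, -1)
    R = (Y[:, None] >= Y[None, :]).sum(axis=1)   # R_i = #{j: Y_j <= Y_i}
    L = (Y[:, None] <= Y[None, :]).sum(axis=1)   # L_i = #{j: Y_j >= Y_i}
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(D, np.inf)                  # exclude self from NN search
    N = D.argmin(axis=1)                         # nearest-neighbor index N(i)
    num = (n * np.minimum(R, R[N]) - L ** 2).sum()
    den = (L * (n - L)).sum()
    return num / den
```

On simulated data the estimate is close to 1 for a noiseless functional relationship and close to 0 for independent draws, matching the population characterization.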

Multivariate response: For $Y=(Y_1,\dots,Y_q)\in\mathbb{R}^q$, a "chain rule" or copula-based construction is used (Ansari et al., 2022, Huang et al., 8 Dec 2025): the vector-response coefficient aggregates univariate coefficients of the form $\xi\big(Y_k, (X, Y_1,\dots,Y_{k-1})\big)$ across coordinates $k=1,\dots,q$, reduces to $\xi(Y,X)$ for $q=1$, and can be strongly consistently estimated using graph-based estimators for each univariate constituent.

Scale invariance: The standard estimator is not invariant to affine changes of $X$; a fully scale-invariant version applies coordinatewise rank transforms to $X$ before constructing the nearest-neighbor graph (Tran et al., 2024).
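The coordinatewise rank transform underlying the scale-invariant variant can be sketched as follows (assuming no ties within a coordinate):

```python
import numpy as np

def coordinatewise_ranks(X):
    # Map each coordinate of X to its normalized ranks in [0, 1]; a
    # nearest-neighbor graph built on the output is then invariant to
    # strictly increasing transformations of the individual coordinates.
    X = np.asarray(X, dtype=float)
    order = np.argsort(np.argsort(X, axis=0), axis=0)  # ranks 0..n-1 per column
    return order / (len(X) - 1)
```

Strictly increasing marginal maps (rescaling, exponentiation) leave the output unchanged, which is exactly the invariance the variant targets.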

3. Distributional Properties and Limit Theory

Asymptotic Normality and Variance Bounds

The central limit theorem holds under broad conditions. For i.i.d. draws from a continuous law, $\sqrt{n}\,\big(\xi_n - \mathbb{E}[\xi_n]\big)$ is asymptotically normal whenever $Y$ is not almost surely a measurable function of $X$ (Lin et al., 2022). The asymptotic variance $\sigma^2$ satisfies universal upper and lower bounds and, under absolute continuity of $(X,Y)$, a sharper bound involving explicit dimension-dependent constants.

When $X$ and $Y$ are independent, $\sqrt{n}\,\xi_n \to N(0,\sigma_0^2)$ with $\sigma_0^2$ linked to the geometry of the nearest-neighbor graph in $\mathbb{R}^d$ (Lin et al., 2022, Han et al., 2022). Under manifold support, the limiting variance depends solely on the intrinsic dimension.

A consistent explicit estimator of the variance is available, allowing for valid inference (Lin et al., 2022).

Symmetric and Conditional Extensions

A symmetrized version, taking $\max\{\xi_n(Y,X), \xi_n(X,Y)\}$, allows construction of two-sided tests; its limit law under independence is skew-normal with explicit variance (Zhang, 2022).

The conditional AC coefficient admits an empirical estimator with parallel asymptotics; under conditional independence, the standardized statistic is asymptotically normal with variance determined by the dimensions of the variables and graph-count statistics (Shi et al., 2021).

Continuity Considerations

Unlike classical measures (Spearman's rho, Kendall's tau), $\xi$ is not weakly continuous under distributional convergence. Instead, it is continuous with respect to convergence of Markov products (pairs of conditionally i.i.d. copies), under additional marginal quantile convergence or specific copula convergence. Many practical families (elliptical and Archimedean models, additive-noise models) satisfy the required continuity, so stable large-sample inference is possible within these classes (Ansari et al., 14 Mar 2025).

4. Algorithmic and Computational Aspects

  • Nearest-neighbor graph construction costs $O(n^2 d)$ by brute force, which suffices for small $n$; kd-trees or approximate methods bring this to near $O(n\log n)$ for larger $n$ in moderate dimension.
  • Rank computations for $Y$ (and optionally for the coordinates of $X$ in the scale-invariant version) cost $O(n\log n)$ per coordinate.
  • Multivariate response: Efficient merge-sort or divide-and-conquer algorithms exist for blockwise rank counts, running in near-linear time up to polylogarithmic factors (Huang et al., 8 Dec 2025).
  • Overall, nearest-neighbor search and rank calculations admit nearly linear scaling in $n$, enabling use on large datasets.
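For instance, the nearest-neighbor indices needed by the estimator can be computed with a kd-tree instead of brute force; a sketch using SciPy's `cKDTree` (assuming distinct points, so the only zero-distance match is the point itself):

```python
import numpy as np
from scipy.spatial import cKDTree

def nearest_neighbor_indices(X):
    # kd-tree construction is O(n log n) and each query is roughly
    # O(log n) in low dimension, versus O(n^2 d) for brute force.
    pts = np.asarray(X, dtype=float)
    tree = cKDTree(pts)
    _, idx = tree.query(pts, k=2)  # k=2: each point itself plus its NN
    return idx[:, 1]               # drop the self-match
```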

5. Connections to Broader Dependence Measures

The AC coefficient is a specific instance within the family of graph–RKHS–OT dependency measures (Deb et al., 2020, Deb et al., 2024):

  • Population level: For sufficiently rich kernels (e.g., the min kernel $k(s,t)=\min\{s,t\}$, or the indicator-integral kernel), the corresponding normalized conditional MMD directly recovers $\xi(Y,X)$.
  • Sample level: The estimator is a geometric graph functional over empirical OT ranks.
  • Distribution-free: Under the null of independence, the law of the AC coefficient (when computed using empirical OT ranks and graph structure) is exactly permutation invariant, enabling finite-sample calibration for independence tests.

Multivariate extensions (both in predictors and responses) and conditional variants fit naturally into this graph–kernel framework, relating directly to kernel partial correlation (Huang et al., 2020), distance multivariance, and more general measures indexed by RKHS (Deb et al., 2024).
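The permutation invariance noted above supports exact finite-sample calibration. A generic sketch (the helper `permutation_pvalue` and the toy correlation statistic are our illustrations, not an API from the cited works):

```python
import numpy as np

def permutation_pvalue(stat, X, Y, n_perm=199, seed=0):
    # Under the null of independence, permuting Y against X leaves the
    # null law of any rank/graph statistic unchanged, so this p-value
    # is exactly valid at any finite sample size.
    rng = np.random.default_rng(seed)
    observed = stat(X, Y)
    exceed = sum(stat(X, rng.permutation(Y)) >= observed
                 for _ in range(n_perm))
    return (1 + exceed) / (1 + n_perm)

def abs_corr(X, Y):
    # Toy statistic standing in for the AC coefficient in this sketch.
    return abs(np.corrcoef(X[:, 0], Y)[0, 1])
```

Any rank- or graph-based statistic, including the AC estimator, can be plugged in as `stat`; for strongly dependent data the observed value exceeds every permuted value, giving the smallest attainable p-value $1/(1+n_{\text{perm}})$.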

6. Practical Application Domains

Independence and Conditional Independence Testing

The AC coefficient and its conditional extension are used for:

  • Testing independence in arbitrary dimensions (direct, distribution-free under the null, with consistent critical values).
  • Conditional independence testing, e.g., through graph-based statistics evaluated with (conditional) randomization tests (Shi et al., 2021). However, these are known to exhibit low local power against contiguous local alternatives unless the nearest-neighbor graph is appropriately generalized or replaced with $k$-NN approaches.

Graphical Model Structure Learning

Pairwise conditional AC coefficients are used as entries in adjacency matrices for learning undirected graphs representing conditional independence relationships in high dimensions, outperforming standard penalized Gaussian graphical model approaches in various regimes (Furmańczyk, 2023).
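As a sketch of this pipeline, given a matrix of pairwise (conditional) dependence scores, an undirected graph estimate can be read off by symmetrizing and thresholding (the function name and the threshold `tau` are our illustrative choices):

```python
import numpy as np

def adjacency_from_scores(S, tau):
    # Symmetrize the pairwise dependence scores (the coefficient is
    # directional) and keep an edge wherever either direction exceeds tau.
    S = np.asarray(S, dtype=float)
    A = (np.maximum(S, S.T) >= tau).astype(int)
    np.fill_diagonal(A, 0)  # no self-loops
    return A
```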

Model-Free Feature Selection and Network Analysis

The multivariate $T$ extension and its estimator enable model-free forward variable selection (in the spirit of the FOCI procedure of Azadkia and Chatterjee) and the screening of dependence structure in network data, without requiring a specified regression model.

7. Theoretical Limitations and Open Problems

  • Under local parametric or minimax-detection boundary alternatives, the standard 1-NN estimator is asymptotically powerless unless graph construction is strengthened (e.g., increasing the number of neighbors $k$ with $n$) (Shi et al., 2021).
  • Weak continuity of $\xi$ fails under convergence in law, but holds under stricter Markov-product and copula-derivative types of convergence, implying care is needed in statistical inference (Ansari et al., 14 Mar 2025).
  • In practical high-dimensional settings, the curse of dimensionality in nearest-neighbor search may be partially circumvented due to intrinsic dimension adaptivity, but further analysis on computational–statistical tradeoffs remains ongoing (Han et al., 2022).

