
Epsilon-Delta Analysis of Chatterjee's Rank Correlation

Updated 16 December 2025
  • The paper introduces an epsilon-delta framework that quantifies the stability and sensitivity of Chatterjee’s rank correlation under perturbations, providing tight contamination bounds (e.g., a 0.012 shift for 1% contamination).
  • The method utilizes asymptotic expected sensitivity functions and local L1 residuals to unify rank-based and moment-based dependence measures, thereby enhancing nonparametric inference.
  • The analysis establishes explicit ε–δ equivalences at independence and perfect dependence, ensuring robustness and continuity even under weak convergence and local distributional changes.

Chatterjee’s rank correlation, denoted $\xi(X, Y)$, is a nonparametric functional measuring the degree of association between random variables $X$ and $Y$. It attains 1 for perfect functional dependence, 0 under independence, and interpolates smoothly in between. The ε–δ interpretation of this coefficient offers a rigorous quantification of its stability, sensitivity, and continuity under perturbations of the joint distribution (whether by gross contamination, weak convergence, or local dependence), providing tight analytical bounds for inference and robustness.

1. Fundamental Definitions and Forms

Chatterjee’s sample rank correlation, for i.i.d. data $(X_i, Y_i)$ with ties in $X$ resolved appropriately, is defined as

$$\xi_n(X, Y) = 1 - \frac{3}{n^2-1}\sum_{i=1}^{n-1}|r_{i+1} - r_i|,$$

where $r_i$ is the rank of $Y_{(i)}$, the $Y$-value paired with the $i$-th smallest $X$ after sorting the data by $X$ (Chatterjee, 2019). This form assumes no ties among the $Y_i$. The population version, critical for asymptotic and contamination analysis, is
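The rank-difference formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not the source's implementation: the function name `chatterjee_xi` is hypothetical, ties in $X$ are broken by input order (an assumption of this sketch), and ties in $Y$ are assumed absent.

```python
import numpy as np


def chatterjee_xi(x, y):
    """Sample Chatterjee coefficient xi_n, assuming no ties among the y values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # Sort the pairs by X; a stable sort breaks X-ties by input order
    # (an assumption of this sketch, not necessarily the source's rule).
    y_sorted = y[np.argsort(x, kind="stable")]
    # r_i = rank (1..n) of Y_(i) among all Y values.
    ranks = np.empty(n, dtype=np.int64)
    ranks[np.argsort(y_sorted, kind="stable")] = np.arange(1, n + 1)
    # xi_n = 1 - 3 * sum_i |r_{i+1} - r_i| / (n^2 - 1)
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)


# Strictly increasing relationship: every adjacent rank difference equals 1.
x = np.arange(1000, dtype=float)
xi_monotone = chatterjee_xi(x, x**2)
print(xi_monotone)
```

When $Y$ is strictly increasing in $X$, the sum of adjacent rank differences is $n - 1$, so the formula reduces to $1 - 3/(n+1)$, which approaches 1 only as $n$ grows.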

XX1

where XX2 and XX3.

Chatterjee’s coefficient admits formulations via local $L^1$ residuals, Markov-product copulas, and conditional variances, which underlie the various ε–δ analyses (Sato, 13 Dec 2025; Ansari et al., 14 Mar 2025).

2. Sensitivity and Robustness via Asymptotic Expected Sensitivity Function

The primary device for ε–δ robustness is the Asymptotic Expected Sensitivity Function (AESF), defined for a functional XX9 as

YY0

where YY1 is the empirical plug-in estimator. For Chatterjee’s $\xi$, if the supremum YY3, then under $\varepsilon$-contamination

YY5

one obtains the first-order contamination bound

YY6

and more conservatively for all YY7,

YY8

when the functional is Lipschitz in total variation (Zhang, 2024). This bound is tight: the worst case occurs when YY9 concentrates mass at the point where ε\varepsilon0 is maximized.

For example, in a linear-Gaussian case with ε\varepsilon1, a numerical value ε\varepsilon2 yields, for $\varepsilon = 0.01$, a maximal shift of 0.012 under 1% contamination.
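A rough empirical counterpart to the contamination analysis is to compare $\xi_n$ on a clean linear-Gaussian sample with $\xi_n$ on a copy in which a fraction $\varepsilon$ of the pairs has been replaced by draws concentrated near a single point. The sketch below makes several assumptions that are not taken from the source: the correlation `rho = 0.8`, the contaminating location `(0, 4)`, the sample size, and the helper name `chatterjee_xi` are all hypothetical. It does not compute the AESF, so the printed shift need not match the figure quoted above.

```python
import numpy as np


def chatterjee_xi(x, y):
    """Sample Chatterjee coefficient xi_n, assuming no ties among the y values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    y_sorted = y[np.argsort(x, kind="stable")]
    ranks = np.empty(n, dtype=np.int64)
    ranks[np.argsort(y_sorted, kind="stable")] = np.arange(1, n + 1)
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)


rng = np.random.default_rng(0)
n, eps = 5000, 0.01
rho = 0.8  # hypothetical correlation for the clean linear-Gaussian model

# Clean sample: Y = rho * X + sqrt(1 - rho^2) * Z with X, Z standard normal.
x = rng.normal(size=n)
y = rho * x + np.sqrt(1.0 - rho**2) * rng.normal(size=n)

# Contaminated sample: replace a fraction eps of the pairs by draws tightly
# concentrated near the hypothetical point (0, 4); tiny jitter avoids ties.
m = int(eps * n)
idx = rng.choice(n, size=m, replace=False)
x_c, y_c = x.copy(), y.copy()
x_c[idx] = 0.0 + 1e-3 * rng.normal(size=m)
y_c[idx] = 4.0 + 1e-3 * rng.normal(size=m)

xi_clean = chatterjee_xi(x, y)
xi_contaminated = chatterjee_xi(x_c, y_c)
shift = xi_contaminated - xi_clean
print(f"clean={xi_clean:.4f} contaminated={xi_contaminated:.4f} shift={shift:+.4f}")
```

Replacing a fixed number of points is a finite-sample stand-in for the mixture form of contamination; varying `eps` and the contaminating location gives a feel for which placements move the coefficient most.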

3. ε–δ Structure for Functional Dependence and Independence

Chatterjee’s coefficient exhibits explicit ε–δ equivalences at the endpoints:

  • Functional dependence: If $Y = f(X)$ a.s. for some measurable $f$, then $\xi(X, Y) = 1$. Conversely, if $\xi(X, Y) = 1$, $Y$ is almost surely a function of $X$. Finite deviation from noiseless dependence, δ\delta4, yields δ\delta5 for δ\delta6, where

δ\delta7

(Chatterjee, 2019).

  • Independence: $\xi(X, Y) = 0$ if and only if $X$ and $Y$ are independent. For uniform distance from independence (Xi,Yi)(X_i, Y_i)1, one gets (Xi,Yi)(X_i, Y_i)2. Conversely, if (Xi,Yi)(X_i, Y_i)3, then (Xi,Yi)(X_i, Y_i)4.

These bounds justify the interpretation of $\xi$ as a calibrated, Lipschitz-quantified “distance” from both perfect dependence and independence, with explicit ε–δ parameters.
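The endpoint behaviour can be illustrated numerically. The sketch below is an illustration under assumed settings, not a verification of the bounds: the choice $f(x) = x^2$, the noise level 0.1, the sample size, and the helper name `chatterjee_xi` are all hypothetical. It evaluates $\xi_n$ for a noiseless non-monotone function, for a noisy version of it, and for independent data.

```python
import numpy as np


def chatterjee_xi(x, y):
    """Sample Chatterjee coefficient xi_n, assuming no ties among the y values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    y_sorted = y[np.argsort(x, kind="stable")]
    ranks = np.empty(n, dtype=np.int64)
    ranks[np.argsort(y_sorted, kind="stable")] = np.arange(1, n + 1)
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)


rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1.0, 1.0, size=n)

# Y = f(X) exactly, with a non-monotone f: xi_n should be close to 1.
xi_functional = chatterjee_xi(x, x**2)
# Same signal plus additive noise: xi_n should fall strictly between the endpoints.
xi_noisy = chatterjee_xi(x, x**2 + 0.1 * rng.normal(size=n))
# Y drawn independently of X: xi_n should be close to 0.
xi_independent = chatterjee_xi(x, rng.normal(size=n))

print(f"functional={xi_functional:.3f} noisy={xi_noisy:.3f} independent={xi_independent:.3f}")
```

The non-monotone choice of $f$ is deliberate: a monotone rank correlation would be near zero for $Y = X^2$ on a symmetric interval, whereas $\xi_n$ stays near 1.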

4. Continuity: Markov Products and Weak Convergence

Continuity properties of $\xi$ in the weak topology deviate from classical rank correlations. Chatterjee’s $\xi$ is not continuous with respect to weak convergence of joint laws, but instead with respect to the law of Markov products XX0 where XX1 is conditionally independent given XX2 and XX3 (Ansari et al., 14 Mar 2025).

Theorem (ε–δ continuity of $\xi$): For XX5 continuous and suitable range convergence,

XX6

where XX7 is the Prokhorov distance. Copula-based representations yield bounds such as XX8 when the uniform norm XX9 (Ansari et al., 14 Mar 2025).

This ensures that small perturbations in the conditional law of $Y$ given $X$, as measured in the appropriate metric (not simply the joint law), produce arbitrarily small effects on $\xi$, with explicit ε–δ quantification.

5. Primitive Local ε–δ Construction and Empirical Structure

A local ε–δ perspective frames Chatterjee’s $\xi$ as the limiting residual of a local averaging scheme:

  • For ξn(X,Y)=13n21i=1n1ri+1ri,\xi_n(X, Y) = 1 - \frac{3}{n^2-1}\sum_{i=1}^{n-1}|r_{i+1} - r_i|,9 with rir_i0, rir_i1 (probability-integral transforms), define for rir_i2 the empirical rir_i3-neighborhood of rir_i4 as rir_i5.
  • The local average of rir_i6 near rir_i7 is rir_i8.
  • The mean local $L^1$ residual is Y(i)Y_{(i)}0.
  • In the Y(i)Y_{(i)}1 limit, Y(i)Y_{(i)}2 converges to Y(i)Y_{(i)}3; Chatterjee’s correlation emerges as

Y(i)Y_{(i)}4

matching the original rank-difference formula (Sato, 13 Dec 2025).

All ε–δ operations (local sets, residuals) are invariant under monotone transformations; the probability-integral transform serves only to achieve distribution-freeness.

6. Moment-Based Analogues and Unified Framework

Replacement of the local $L^1$ residual with $L^2$ analogues links Chatterjee’s $\xi$ to familiar moment-based indices:

  • XX00
  • XX01

For jointly Gaussian $(X, Y)$, one recovers Pearson’s XX03 through this construction, showing the ε–δ approach unifies rank-based and moment-based dependence measures under a single limiting framework (Sato, 13 Dec 2025).

7. Assumptions, Limitations, and Practical Implications

Rigorous ε–δ control relies on continuity in XX08, regularity of XX09, and Hadamard differentiability of XX10. The main theoretical limits (tightness of the contamination bound and sharpness of independence/functionality bounds) are achieved under these hypotheses (Zhang, 2024; Chatterjee, 2019). In finite samples, contamination and sampling errors are additive, with the former scaling as XX11 and the latter as XX12.
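The sampling-error component can be examined by simulation in the one case where its size is known in closed form: under independence with continuous $Y$, Chatterjee (2019) shows that $\sqrt{n}\,\xi_n$ is asymptotically normal with variance $2/5$. The sketch below compares the Monte Carlo standard deviation of $\xi_n$ with $\sqrt{0.4/n}$. The sample sizes, replication count, and the names `chatterjee_xi` and `xi_sd_under_independence` are hypothetical choices for illustration.

```python
import numpy as np


def chatterjee_xi(x, y):
    """Sample Chatterjee coefficient xi_n, assuming no ties among the y values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    y_sorted = y[np.argsort(x, kind="stable")]
    ranks = np.empty(n, dtype=np.int64)
    ranks[np.argsort(y_sorted, kind="stable")] = np.arange(1, n + 1)
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)


def xi_sd_under_independence(n, reps, rng):
    """Monte Carlo standard deviation of xi_n for independent Gaussian X and Y."""
    vals = [
        chatterjee_xi(rng.normal(size=n), rng.normal(size=n))
        for _ in range(reps)
    ]
    return float(np.std(vals, ddof=1))


rng = np.random.default_rng(1)
for n in (100, 400, 1600):
    sd = xi_sd_under_independence(n, 300, rng)
    # Asymptotic reference value under independence: sqrt(2 / (5 n)).
    print(f"n={n:5d}  simulated sd={sd:.4f}  sqrt(0.4/n)={np.sqrt(0.4 / n):.4f}")
```

Quadrupling $n$ should roughly halve the simulated standard deviation, which is the $n^{-1/2}$ behaviour of the sampling term at independence; away from independence the constant differs and is not covered by this sketch.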

A plausible implication is that for statistical inference and robust estimation, Chatterjee’s $\xi$ offers explicit, interpretable robustness margins, with ε–δ quantification superior to earlier rank-based coefficients where such fine-grained control is unavailable or only asymptotically valid. The local ε–δ interpretation remains central for applications in dependence quantification, goodness-of-fit, and model diagnostics.
