Kernel Quantile Discrepancies (KQDs)
- Kernel Quantile Discrepancies (KQDs) are statistical distances that embed quantiles in reproducing kernel Hilbert spaces to provide a finer representation of probability distributions.
- They extend traditional kernel mean embedding methods by incorporating generalized quantiles, capturing higher-order distributional differences and subsuming sliced Wasserstein distances.
- Efficient estimation procedures with near-linear computational cost make KQDs a robust alternative for two-sample testing and high-dimensional generative model evaluation.
Kernel Quantile Discrepancies (KQDs) are a family of statistical distances between probability distributions, constructed via the kernel quantile embedding (KQE) operator in reproducing kernel Hilbert spaces (RKHS). Extending the classical kernel mean embedding paradigm exemplified by maximum mean discrepancy (MMD), KQDs incorporate generalized quantiles as distributional features, yielding probability metrics under weaker kernel conditions, admitting efficient estimation, and subsuming kernelized forms of sliced Wasserstein distances (Naslidnyk et al., 26 May 2025).
1. Kernel Quantile Embedding Operator
Let $\mathcal{X}$ denote a Hausdorff, separable, σ-compact Borel space and $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ a continuous, measurable, and separating kernel, with associated RKHS $\mathcal{H}_k$ and feature map $\phi(x) = k(\cdot, x)$. The unit sphere in $\mathcal{H}_k$, $\mathbb{S}_{\mathcal{H}_k} = \{h \in \mathcal{H}_k : \|h\|_{\mathcal{H}_k} = 1\}$, indexes the one-dimensional projections $x \mapsto \langle h, \phi(x)\rangle_{\mathcal{H}_k}$ of the embedded data.
For a direction $h \in \mathbb{S}_{\mathcal{H}_k}$, define the one-dimensional pushforward measure $P_h$ of $P$ by $P_h = T_{h\#}P$, where $T_h(x) = \langle h, \phi(x)\rangle_{\mathcal{H}_k}$. The usual quantile of $P_h$ at level $\tau \in (0,1)$ is
$$Q(P_h; \tau) = \inf\{t \in \mathbb{R} : P_h((-\infty, t]) \geq \tau\}.$$
The kernel quantile embedding of $P$ along direction $h$ at level $\tau$ is then
$$Q_P(h, \tau) := Q(P_h; \tau).$$
Equivalently, this is the $\tau$-quantile of the real-valued random variable $h(X)$ for $X \sim P$, leveraging the reproducing property $\langle h, \phi(x)\rangle_{\mathcal{H}_k} = h(x)$.
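As a concrete illustration of this definition, the empirical KQE can be computed directly from kernel evaluations, since $\langle h, \phi(x)\rangle_{\mathcal{H}_k} = h(x)$ for $h = \sum_j w_j k(\cdot, z_j)$. A minimal NumPy sketch, assuming a Gaussian kernel and arbitrarily chosen anchor points $z_j$ and weights $w_j$ (the helper names are our own, not from the paper):

```python
import numpy as np

def gauss_kernel(A, B, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix k(a, b) = exp(-||a - b||^2 / (2 * ls^2))."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * lengthscale ** 2))

def empirical_kqe(X, Z, w, taus, lengthscale=1.0):
    """Empirical kernel quantile embedding of samples X along the direction
    h = sum_j w_j k(., z_j), normalised to the unit sphere of the RKHS.

    Projections use the reproducing property <h, phi(x)> = h(x)."""
    K_zz = gauss_kernel(Z, Z, lengthscale)
    norm = np.sqrt(w @ K_zz @ w)                       # ||h||_H = sqrt(w' K_zz w)
    proj = gauss_kernel(X, Z, lengthscale) @ w / norm  # h(x_i)/||h||: sample from P_h
    return np.quantile(proj, taus)                     # tau-quantiles of P_h

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))      # samples from P
Z = rng.normal(size=(5, 2))        # anchor points (arbitrary illustrative choice)
w = rng.normal(size=5)             # direction weights (arbitrary illustrative choice)
q = empirical_kqe(X, Z, w, taus=[0.25, 0.5, 0.75])
print(q)  # three empirical quantiles of the one-dimensional pushforward
```

The returned quantiles are nondecreasing in the level, as the population definition requires.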
2. Definitions and Properties of Kernel Quantile Discrepancies
For a probability measure $\gamma$ with full support on $(0,1)$ (quantile levels) and $\eta$ with full support on $\mathbb{S}_{\mathcal{H}_k}$ (RKHS directions), define the directional quantile distance of order $p \geq 1$ between distributions $P$ and $Q$ as
$$d_{h,p}(P, Q) = \left(\int_0^1 \big|Q_P(h, \tau) - Q_Q(h, \tau)\big|^p \, d\gamma(\tau)\right)^{1/p}.$$
There are two aggregations over directions:
- Expected KQD (e-KQD): $\mathrm{eKQD}_p(P, Q) = \big(\int_{\mathbb{S}_{\mathcal{H}_k}} d_{h,p}(P, Q)^p \, d\eta(h)\big)^{1/p}$
- Supremum KQD (sup-KQD): $\mathrm{supKQD}_p(P, Q) = \sup_{h \in \mathbb{S}_{\mathcal{H}_k}} d_{h,p}(P, Q)$
Provided the kernel assumptions above hold and $\gamma$, $\eta$ have full support, both e-KQD and sup-KQD are metrics on the space of probability measures over $\mathcal{X}$. The injectivity of the kernel quantile embedding is established by an RKHS Cramér–Wold theorem, ensuring KQDs are distinguishing (zero KQD implies equality in law).
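The two aggregations can be sketched numerically: given directional quantile differences on a finite grid of sampled directions and quantile levels (Monte Carlo approximations of the integrals over $\eta$ and $\gamma$), e-KQD averages over directions while sup-KQD takes the best single direction. A hypothetical sketch on toy quantile tables:

```python
import numpy as np

def aggregate_kqd(Q_P, Q_Q, p=2):
    """Aggregate directional quantile differences into e-KQD and sup-KQD.

    Q_P, Q_Q: (M, T) arrays of empirical quantiles Q_P(h_i, tau_j) for
    M sampled directions and T sampled quantile levels."""
    d_p = (np.abs(Q_P - Q_Q) ** p).mean(axis=1)  # per-direction integral over tau
    e_kqd = d_p.mean() ** (1.0 / p)              # average over directions, then root
    sup_kqd = d_p.max() ** (1.0 / p)             # best sampled direction
    return e_kqd, sup_kqd

rng = np.random.default_rng(1)
# Toy directional quantile tables for two distributions (illustrative only).
Q_P = rng.normal(size=(50, 20))
Q_Q = Q_P + 0.5                                  # constant shift in every direction
e_kqd, sup_kqd = aggregate_kqd(Q_P, Q_Q, p=2)
print(e_kqd, sup_kqd)                            # both equal 0.5 for a pure shift
```

Note that the maximum over finitely many sampled directions only lower-bounds the true supremum over the sphere; practical sup-KQD estimators must account for this.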
3. Relationships to MMD and Sliced Wasserstein Distances
The classical maximum mean discrepancy is defined as
$$\mathrm{MMD}(P, Q) = \|\mu_P - \mu_Q\|_{\mathcal{H}_k},$$
with $\mu_P = \int_{\mathcal{X}} \phi(x)\, dP(x)$ the kernel mean embedding. Since the mean of the pushforward $P_h$ is $\langle h, \mu_P\rangle_{\mathcal{H}_k}$, MMD is sensitive only to differences in the first moments of the one-dimensional projections $P_h$.
The KQD framework extends this: any mean-characteristic kernel is also quantile-characteristic, so KQDs distinguish all pairs that MMD can, and more. Notably, writing $\bar{Q}_P(h, \tau) = Q_P(h, \tau) - \langle h, \mu_P\rangle_{\mathcal{H}_k}$ for the centered (mean-subtracted) KQE, the difference of quantile functions decomposes as
$$Q_P(h, \tau) - Q_Q(h, \tau) = \langle h, \mu_P - \mu_Q\rangle_{\mathcal{H}_k} + \big(\bar{Q}_P(h, \tau) - \bar{Q}_Q(h, \tau)\big),$$
positioning the centered KQD as combining the classical MMD (the mean-shift term) with a kernelized quantile-Wasserstein term.
If $\gamma$ is the uniform measure on $(0,1)$, then e-KQD coincides with a kernelized expected $p$-sliced Wasserstein distance; for the linear kernel, $\mathcal{X} = \mathbb{R}^d$, and $\eta$ uniform on the sphere, this recovers the standard expected sliced Wasserstein distance. Similarly, sup-KQD under these conditions is a kernelized max-sliced Wasserstein distance.
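The linear-kernel special case can be checked directly: RKHS directions reduce to Euclidean directions $\theta \in \mathbb{S}^{d-1}$, and for equal-size one-dimensional samples the $p$-Wasserstein distance reduces to sorted differences (the quantile coupling), which is exactly the e-KQD integrand. A minimal sketch (function name and defaults are our own):

```python
import numpy as np

def sliced_wasserstein(X, Y, n_dirs=200, p=2, seed=0):
    """Expected sliced p-Wasserstein distance between equal-size samples.

    For 1-D samples of equal size, W_p is computable from sorted values
    (the quantile coupling); this is e-KQD with a linear kernel."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    theta = rng.normal(size=(n_dirs, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # uniform on the sphere
    px = np.sort(X @ theta.T, axis=0)   # sorted projections = empirical quantiles
    py = np.sort(Y @ theta.T, axis=0)
    w_p = (np.abs(px - py) ** p).mean(axis=0)  # per-direction W_p^p
    return w_p.mean() ** (1.0 / p)             # average over directions, then root

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 3))
Y = rng.normal(size=(1000, 3)) + np.array([2.0, 0.0, 0.0])  # mean shift
same = sliced_wasserstein(X, rng.normal(size=(1000, 3)))
diff = sliced_wasserstein(X, Y)
print(same, diff)  # the shifted pair is much farther apart
```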
4. Estimation Procedures and Computational Complexity
For empirical KQEs based on i.i.d. samples $x_1, \dots, x_n \sim P$, the order-statistic estimator
$$\hat{Q}_{P,n}(h, \tau) = h(x)_{(\lceil \tau n \rceil)},$$
where $h(x)_{(1)} \leq \dots \leq h(x)_{(n)}$ are the sorted projections $h(x_i)$, satisfies $|\hat{Q}_{P,n}(h, \tau) - Q_P(h, \tau)| = O(n^{-1/2})$ with high probability for fixed $h$ and $\tau$, under standard regularity of $P_h$ near its $\tau$-quantile.
Empirical e-KQD is estimated by a finite Monte Carlo approximation of the integrals over $\gamma$ and $\eta$, using sampled quantile levels and sampled directions $h_1, \dots, h_M \sim \eta$. For fixed $p$ and $M$, with probability at least $1 - \delta$ the estimation error is of order $n^{-1/2}$, up to the Monte Carlo error incurred by the finite number of sampled directions.
A scalable estimator is implemented using "Gaussian directions": $\eta$ is taken as the projection onto the unit sphere of a centered Gaussian measure on $\mathcal{H}_k$, with sampled functions of the form $h = \sum_{j=1}^{m} w_j k(\cdot, z_j)$ for Gaussian weights $w \in \mathbb{R}^m$ and anchor points $z_1, \dots, z_m$. Sampled functions are propagated to the data, and for each of the $M$ directions the main computational costs are:
- $O(nm)$ for evaluating $h$ on $n$ points,
- $O(m^2)$ for computing $\|h\|_{\mathcal{H}_k}$,
- $O(n \log n)$ for sorting.
With $m = O(1)$, the total cost of e-KQD estimation is $O(M n \log n)$, near-linear in the sample size.
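The steps above can be combined into an end-to-end sketch. The code below samples directions $h = \sum_j w_j k(\cdot, z_j)$ with Gaussian weights over a small anchor set, so per-direction cost is dominated by the $O(nm)$ kernel evaluations and the $O(n \log n)$ sort; the anchor-selection heuristic and all names are our own assumptions, not the authors' implementation:

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """Gaussian (RBF) kernel matrix."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * ls ** 2))

def e_kqd(X, Y, n_dirs=100, n_anchors=10, p=2, ls=1.0, seed=0):
    """Near-linear e-KQD estimate between equal-size samples X and Y.

    Directions h = sum_j w_j k(., z_j) with Gaussian weights; projections are
    sorted so matched order statistics approximate the quantile integral."""
    rng = np.random.default_rng(seed)
    XY = np.concatenate([X, Y])
    Z = XY[rng.choice(len(XY), n_anchors, replace=False)]  # anchors (our heuristic)
    K_zz = rbf(Z, Z, ls)
    K_xz, K_yz = rbf(X, Z, ls), rbf(Y, Z, ls)  # O(n m) kernel evaluations
    acc = 0.0
    for _ in range(n_dirs):
        w = rng.normal(size=n_anchors)
        w /= np.sqrt(w @ K_zz @ w)             # normalise so ||h||_H = 1, O(m^2)
        qx = np.sort(K_xz @ w)                 # empirical quantiles, O(n log n)
        qy = np.sort(K_yz @ w)
        acc += (np.abs(qx - qy) ** p).mean()   # integral over tau via order stats
    return (acc / n_dirs) ** (1.0 / p)

rng = np.random.default_rng(3)
X = rng.normal(size=(800, 2))
d_null = e_kqd(X, rng.normal(size=(800, 2)))           # P = Q
d_alt = e_kqd(X, rng.normal(size=(800, 2)) + 1.5)      # shifted alternative
print(d_null, d_alt)  # the shifted pair scores higher
```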
5. Empirical Evaluation and Comparative Performance
KQD-based distances were empirically assessed in nonparametric two-sample testing with permutation thresholds. Baselines were classical MMD with quadratic complexity, fast MMD approximations (MMD-lin, MMD-Multi), and kernel max-sliced Wasserstein.
Key findings across several datasets include:
- In a "power-decay" setting (multivariate Gaussian shift with increasing dimension), near-linear e-KQD retains power longer than MMD-Multi.
- In one-dimensional tests (Laplace vs. Gaussian, same first two moments, polynomial kernel not mean-characteristic), MMD fails to distinguish; KQDs succeed.
- On high-dimensional image data (Galaxy MNIST, CIFAR-10 vs. CIFAR-10.1), e-KQD and sup-KQD outperform fast MMD at comparable computational cost; the centered KQD matches full MMD performance.
- Type I error is maintained at the 5% level by permutation.
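The permutation protocol referenced above is the standard one and can be sketched generically (this is a generic recipe with a toy one-dimensional quantile statistic standing in for a KQD, not the authors' exact test):

```python
import numpy as np

def stat(X, Y):
    """Toy stand-in for a KQD statistic: 1-D order-statistic quantile distance."""
    return np.abs(np.sort(X) - np.sort(Y)).mean()

def perm_test(X, Y, n_perm=500, alpha=0.05, seed=0):
    """Permutation two-sample test: reject if the observed statistic exceeds
    the (1 - alpha) quantile of its permutation null distribution."""
    rng = np.random.default_rng(seed)
    t_obs = stat(X, Y)
    pooled = np.concatenate([X, Y])
    n = len(X)
    null = np.empty(n_perm)
    for b in range(n_perm):
        perm = rng.permutation(pooled)       # relabel under the null P = Q
        null[b] = stat(perm[:n], perm[n:])
    p_val = (1 + (null >= t_obs).sum()) / (1 + n_perm)
    return p_val, p_val <= alpha

rng = np.random.default_rng(7)
X = rng.laplace(size=400) / np.sqrt(2)       # Laplace, rescaled to unit variance
Y = rng.normal(size=400)                     # Gaussian with the same first two moments
p_val, reject = perm_test(X, Y)
print(p_val, reject)
```

Calibrating the threshold by permutation is what keeps the Type I error at the nominal 5% level regardless of the statistic used.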
This suggests that KQDs, especially the efficient e-KQD estimator, offer a competitive and in several cases superior alternative to both classical MMD and fast MMD approximations, with rigorous metric and convergence guarantees (Naslidnyk et al., 26 May 2025).
6. Significance and Theoretical Implications
KQDs demonstrate that quantiles in RKHS can serve as a finer-grained representation of distributions than the mean function, circumventing the requirement for mean-characteristic kernels for metric properties. By unifying the kernel MMD and sliced Wasserstein paradigms, KQDs provide a general family of probability metrics with both theoretical rigor and practical efficiency. A plausible implication is that KQD-based methodologies may serve as the foundation for new nonparametric tests and generative model benchmarks where first-moment distinctions are insufficient.
7. Summary Table of Key Properties
| Property | KQDs (e-KQD, sup-KQD) | MMD |
|---|---|---|
| Representation | RKHS quantiles along directions | RKHS mean |
| Metric under kernel conditions | Weaker (quantile-characteristic suffices) | Stronger (mean-characteristic required) |
| Recovers kernelized slices | Yes (sliced Wasserstein limits) | No |
| Time complexity (fastest) | $O(M n \log n)$ (near-linear) | $O(n^2)$ (U-stat) / $O(n)$ (MMD-lin) |
| Empirical distinguishing | Higher in several regimes | Fails for some non-mean differences |
These results position KQDs as a versatile and computationally attractive class of probability metrics, offering both theoretical generality and empirical power beyond MMD in a range of challenging regimes (Naslidnyk et al., 26 May 2025).