
Kernel Quantile Discrepancies (KQDs)

Updated 4 December 2025
  • Kernel Quantile Discrepancies (KQDs) are statistical distances that embed quantiles in reproducing kernel Hilbert spaces to provide a finer representation of probability distributions.
  • They extend traditional kernel mean embedding methods by incorporating generalized quantiles, capturing higher-order distributional differences and subsuming sliced Wasserstein distances.
  • Efficient estimation procedures with near-linear computational cost make KQDs a robust alternative for two-sample testing and high-dimensional generative model evaluation.

Kernel Quantile Discrepancies (KQDs) are a family of statistical distances between probability distributions, constructed via the kernel quantile embedding (KQE) operator in reproducing kernel Hilbert spaces (RKHS). Extending the classical kernel mean embedding paradigm exemplified by maximum mean discrepancy (MMD), KQDs incorporate generalized quantiles as distributional features, yielding probability metrics under weaker kernel conditions, admitting efficient estimation, and subsuming kernelized forms of sliced Wasserstein distances (Naslidnyk et al., 26 May 2025).

1. Kernel Quantile Embedding Operator

Let $X$ denote a Hausdorff, separable, σ-compact Borel space and $k \colon X \times X \rightarrow \mathbb{R}$ a continuous, measurable, and separating kernel, with associated RKHS $\mathcal{H}$ and feature map $\psi(x) = k(x, \cdot)$. The unit sphere in $\mathcal{H}$, $S_\mathcal{H} = \{u \in \mathcal{H} : \|u\|_\mathcal{H} = 1\}$, indexes the projections of $\psi(X)$ onto one-dimensional subspaces.

For a direction $u \in S_\mathcal{H}$, define the one-dimensional pushforward measure $u\#P$ by $(u\#P)(B) = P(\{x : u(x) \in B\})$. The usual quantile of level $\alpha \in [0,1]$ is

$$\rho_{u\#P}^{\,\alpha} = \inf\{z \in \mathbb{R} : (u\#P)((-\infty, z]) \geq \alpha\}.$$

The kernel quantile embedding of $P$ along direction $u$ and level $\alpha$ is then

$$\rho_P^{\alpha,u} := \rho_{u\#P}^{\,\alpha}\, u \in \mathcal{H}.$$

Equivalently, this is the $\alpha$-quantile of the pushforward $\psi\#P$ in $\mathcal{H}$ along $u$, leveraging the canonical representation $u(x) = \langle u, \psi(x) \rangle_\mathcal{H}$.
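In practice, the embedding along a single direction reduces to an ordinary one-dimensional quantile of the projected sample. A minimal sketch of this reduction, assuming (as illustrative choices) an RBF kernel and a direction of the form $u = \psi(z)/\|\psi(z)\|_\mathcal{H}$:

```python
import numpy as np

def directional_quantile(proj, alpha):
    """Order-statistic estimator [u(x_{1:n})]_{ceil(alpha * n)} of the
    alpha-quantile of the projected sample u(x_1), ..., u(x_n)."""
    s = np.sort(np.asarray(proj))
    idx = max(int(np.ceil(alpha * len(s))) - 1, 0)  # 1-based order statistic
    return s[idx]

# Illustrative setup (not fixed by the source): RBF kernel on R^2 and the
# direction u = psi(z) / ||psi(z)||_H; here ||psi(z)||_H = sqrt(k(z, z)) = 1.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 2))
z = np.zeros(2)
u_vals = np.exp(-0.5 * np.sum((x - z) ** 2, axis=1))  # u(x_i) = k(x_i, z)
median_coeff = directional_quantile(u_vals, 0.5)  # scalar coefficient of rho_P^{0.5,u}
```

The returned scalar multiplies the unit direction $u$ to give the embedding element in $\mathcal{H}$.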

2. Definitions and Properties of Kernel Quantile Discrepancies

For a probability measure $\nu$ with full support on $[0,1]$ (quantile levels) and $\gamma$ with full support on $S_\mathcal{H}$ (RKHS directions), define the directional quantile distance of order $p \geq 1$ between distributions $P, Q$ as

$$\tau_p(P, Q; \nu, u) = \left( \int_0^1 \left\| \rho_P^{\alpha,u} - \rho_Q^{\alpha,u} \right\|_\mathcal{H}^p \, \nu(d\alpha) \right)^{1/p}.$$

There are two aggregations:

  • Expected KQD (e-KQD$_p$):

$$e\text{-KQD}_p(P, Q; \nu, \gamma) = \left( \mathbb{E}_{u \sim \gamma}\left[ \tau_p(P, Q; \nu, u)^p \right] \right)^{1/p}.$$

  • Supremum KQD (sup-KQD$_p$):

$$\sup\text{-KQD}_p(P, Q; \nu) = \left( \sup_{u \in S_\mathcal{H}} \tau_p(P, Q; \nu, u)^p \right)^{1/p}.$$

Provided the kernel assumptions and the support conditions on $\nu$ and $\gamma$ hold, both e-KQD$_p$ and sup-KQD$_p$ are metrics on the space of probability measures over $X$. The injectivity of $P \mapsto \{\rho_P^{\alpha,u}\}$ is established by an RKHS Cramér–Wold theorem, ensuring KQDs are distinguishing (zero KQD implies equality in law).
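Since $\rho_P^{\alpha,u} - \rho_Q^{\alpha,u} = (\rho_{u\#P}^{\,\alpha} - \rho_{u\#Q}^{\,\alpha})\,u$ with $\|u\|_\mathcal{H} = 1$, the RKHS norm inside $\tau_p$ reduces to a scalar quantile difference. A plain Monte Carlo sketch of e-KQD$_p^p$ from precomputed projections (the grid approximation of $\nu$ and the function name are illustrative assumptions):

```python
import numpy as np

def e_kqd_p(proj_x, proj_y, p=2, alphas=None):
    """Monte Carlo estimate of e-KQD_p^p from precomputed projections.

    proj_x, proj_y: arrays of shape (n_directions, n_samples) holding the
    projected samples u_j(x_i) and u_j(y_i) for unit-norm directions u_j.
    nu is taken uniform on [0, 1], approximated on a grid of levels."""
    if alphas is None:
        alphas = np.linspace(0.05, 0.95, 19)
    qx = np.quantile(proj_x, alphas, axis=1)  # directional quantiles per level
    qy = np.quantile(proj_y, alphas, axis=1)
    # ||rho_P^{a,u} - rho_Q^{a,u}||_H = |quantile difference| since ||u||_H = 1
    return float(np.mean(np.abs(qx - qy) ** p))
```

The averages over directions and levels jointly approximate $\mathbb{E}_{u\sim\gamma}\,\tau_p^p$.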

3. Relationships to MMD and Sliced Wasserstein Distances

The classical maximum mean discrepancy is defined as

$$\mathrm{MMD}(P, Q) = \|\mu_P - \mu_Q\|_\mathcal{H},$$

with $\mu_P$ the kernel mean embedding. MMD is sensitive only to differences in first moments $\mathbb{E}_P[u] - \mathbb{E}_Q[u]$.

The KQD framework extends this, as any mean-characteristic kernel is also quantile-characteristic, so KQDs distinguish all pairs that MMD can, and more. Notably, with centered (mean-subtracted) KQEs and $p = 2$,

$$\widetilde{e\text{-KQD}}{}_2^2(P, Q) = \mathrm{MMD}^2(P, Q) + e\text{-KQD}_2^2(P, Q) - \mathbb{E}_{u \sim \gamma}\left[ \left( \mathbb{E}_P[u] - \mathbb{E}_Q[u] \right)^2 \right],$$

positioning the centered KQD as a sum of the classical MMD and a kernelized quantile-Wasserstein term.

If $\nu$ is the uniform measure on $[0,1]$, then e-KQD$_p$ coincides with a kernelized expected $p$-sliced Wasserstein distance; for $X = \mathbb{R}^d$, $k(x, y) = x^\top y$, and $\gamma$ uniform on the sphere, this recovers the standard expected sliced Wasserstein distance. Similarly, sup-KQD$_p$ under these conditions is a kernelized max-sliced Wasserstein distance.
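Under the linear-kernel specialization, computing the expected sliced distance amounts to averaging one-dimensional $p$-Wasserstein distances over random unit vectors; for equal sample sizes, sorting the projections realizes the 1D optimal coupling. A sketch under those assumptions (function name and direction count are illustrative):

```python
import numpy as np

def expected_sliced_wp(x, y, n_dirs=100, p=2, seed=0):
    """Expected p-sliced Wasserstein^p between equal-size empirical samples:
    the linear-kernel instance of e-KQD_p^p with gamma uniform on the sphere."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=(n_dirs, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # uniform directions
    px, py = x @ theta.T, y @ theta.T   # projected samples, shape (n, n_dirs)
    px.sort(axis=0)
    py.sort(axis=0)                     # sorted projections = 1D quantiles
    return float(np.mean(np.abs(px - py) ** p))
```

For unequal sample sizes one would interpolate quantile functions instead of matching sorted values directly.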

4. Estimation Procedures and Computational Complexity

For empirical KQEs, the order-statistic estimator

$$\rho_{u\#P_n}^{\,\alpha} = \left[ u(x_{1:n}) \right]_{\lceil \alpha n \rceil}$$

satisfies $\left\| \rho_{P_n}^{\alpha,u} - \rho_P^{\alpha,u} \right\|_\mathcal{H} = O(n^{-1/2})$ with high probability for fixed $\alpha, u$.

Empirical e-KQD$_p$ is estimated by a finite Monte Carlo approximation of $\gamma$ with directions $u_1, \ldots, u_\ell$. For $p = 1$, with probability $1 - \delta$:

$$\left| e\text{-KQD}_1(P_n, Q_n; \nu, \gamma_\ell) - e\text{-KQD}_1(P, Q; \nu, \gamma) \right| \leq C(\delta)\left( \ell^{-1/2} + n^{-1/2} \right).$$

A scalable estimator is implemented using "Gaussian directions": $\gamma$ is taken as the projection onto $S_\mathcal{H}$ of a centered Gaussian measure $N(0, C)$ in $\mathcal{H}$, with $C[f](x) = \int_X k(x, y) f(y)\, \xi(dy)$ and landmarks $z_j \sim \xi$. Sampled functions $f$ are normalized to $u = f / \|f\|_\mathcal{H}$, and for $\ell$ directions with $m$ landmarks each, the main computational costs per direction are:

  • $\mathcal{O}(nm)$ for evaluating $f$ on $n$ points,
  • $\mathcal{O}(m^2)$ for computing $\|f\|_\mathcal{H}$,
  • $\mathcal{O}(n \log n)$ for sorting.

With $\ell = m = \log n$, the total cost of e-KQD estimation is $\mathcal{O}(n \log^2 n)$, near-linear in the sample size.
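Putting the pieces together, the following is a sketch of the Gaussian-directions estimator. Several choices here are illustrative assumptions rather than details fixed by the source: the RBF kernel, the landmark measure $\xi = N(0, I)$, the grid approximation of $\nu$, and approximating a draw $f \sim N(0, C)$ by $f = m^{-1/2}\sum_j \varepsilon_j k(\cdot, z_j)$ with $\varepsilon_j \sim N(0,1)$ (whose covariance matches $C$ in expectation):

```python
import numpy as np

def rbf(a, b, ls=1.0):
    """RBF kernel matrix k(a_i, b_j) (illustrative kernel choice)."""
    d2 = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-d2 / (2.0 * ls**2))

def e_kqd_gaussian_dirs(x, y, n_dirs=8, m=8, p=2, alphas=None, seed=0):
    """Near-linear sketch of e-KQD_p^p with Gaussian directions."""
    rng = np.random.default_rng(seed)
    if alphas is None:
        alphas = np.linspace(0.05, 0.95, 19)
    total = 0.0
    for _ in range(n_dirs):
        z = rng.normal(size=(m, x.shape[1]))      # landmarks z_j ~ xi
        eps = rng.normal(size=m) / np.sqrt(m)     # Gaussian coefficients
        norm_f = np.sqrt(eps @ rbf(z, z) @ eps)   # ||f||_H, O(m^2)
        ux = rbf(x, z) @ eps / norm_f             # u(x_i), O(nm)
        uy = rbf(y, z) @ eps / norm_f
        qx = np.quantile(ux, alphas)              # O(n log n) via sorting
        qy = np.quantile(uy, alphas)
        total += np.mean(np.abs(qx - qy) ** p)
    return float(total / n_dirs)
```

With $\ell = m = \log n$ the per-direction costs above reproduce the stated $\mathcal{O}(n \log^2 n)$ total.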

5. Empirical Evaluation and Comparative Performance

KQD-based distances were empirically assessed in nonparametric two-sample testing with permutation thresholds. Baselines were classical MMD with quadratic complexity, fast MMD approximations (MMD-lin, MMD-Multi), and kernel max-sliced Wasserstein.

Key findings across several datasets include:

  • In a "power-decay" setting (multivariate Gaussian shift with increasing dimension), near-linear e-KQD retains power longer than MMD-Multi.
  • In one-dimensional tests (Laplace vs. Gaussian, same first two moments, polynomial kernel not mean-characteristic), MMD fails to distinguish; KQDs succeed.
  • On high-dimensional image data (Galaxy MNIST, CIFAR-10 vs. CIFAR-10.1), e-KQD$_2$ and sup-KQD outperform fast MMD at comparable computational cost; the centered $O(n^2)$ KQD matches full MMD performance.
  • Type I error is maintained at the 5% level by permutation.

This suggests that KQDs, especially the efficient e-KQD estimator, offer a competitive and in several cases superior alternative to both classical MMD and fast MMD approximations, with rigorous metric and convergence guarantees (Naslidnyk et al., 26 May 2025).
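The permutation calibration used in these tests can be sketched independently of the chosen statistic; `permutation_pvalue` and `mean_gap` below are illustrative names, and any two-sample discrepancy estimate (such as an e-KQD estimator) can be plugged in as `stat`:

```python
import numpy as np

def permutation_pvalue(x, y, stat, n_perm=200, seed=0):
    """Permutation p-value for a two-sample statistic: pool the samples,
    reshuffle the group labels, and compare permuted statistics with the
    observed one."""
    rng = np.random.default_rng(seed)
    n = len(x)
    pooled = np.concatenate([x, y])
    observed = stat(x, y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if stat(pooled[idx[:n]], pooled[idx[n:]]) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)  # reject H0 at level 0.05 if <= 0.05

# Illustrative statistic: absolute difference of sample means.
mean_gap = lambda a, b: abs(a.mean() - b.mean())
```

Because the threshold is computed from the permuted null distribution, Type I error is controlled at the nominal level regardless of which discrepancy is used.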

6. Significance and Theoretical Implications

KQDs demonstrate that quantiles in RKHS can serve as a finer-grained representation of distributions than the mean function, circumventing the requirement for mean-characteristic kernels for metric properties. By unifying the kernel MMD and sliced Wasserstein paradigms, KQDs provide a general family of probability metrics with both theoretical rigor and practical efficiency. A plausible implication is that KQD-based methodologies may serve as the foundation for new nonparametric tests and generative model benchmarks where first-moment distinctions are insufficient.

7. Summary Table of Key Properties

| Property | KQDs (e-KQD, sup-KQD) | MMD |
|---|---|---|
| Representation | RKHS quantiles along directions | RKHS mean |
| Metric under kernel conditions | Weaker (quantile-characteristic suffices) | Stronger (mean-characteristic required) |
| Recovers kernelized slices | Yes (sliced Wasserstein limits) | No |
| Time complexity (fastest) | $\mathcal{O}(n \log^2 n)$ | $\mathcal{O}(n^2)$ (U-statistic) / $\mathcal{O}(n)$ (MMD-lin) |
| Empirical distinguishing power | Higher in several regimes | Fails for some non-mean differences |

These results position KQDs as a versatile and computationally attractive class of probability metrics, offering both theoretical generality and empirical power beyond MMD in a range of challenging regimes (Naslidnyk et al., 26 May 2025).

