Monge–Kantorovich Metric

Updated 29 January 2026

The Monge–Kantorovich metric is a distance measure between probability measures, defined by the minimal transportation cost via both map-based (Monge) and plan-based (Kantorovich) formulations.
It relies on duality theory and, in the discrete case, linear programming to compute optimal transport plans, with applications in analysis, PDEs, and geometry.
Generalizations extend the metric to noncommutative, vector-valued, and distributional settings, and come with their own duality results and computational schemes.

The Monge–Kantorovich metric, frequently called the Wasserstein distance, is a fundamental tool in optimal transport theory. It metrizes the space of probability measures on a metric space through the minimal transportation cost required to match one measure to another with respect to an underlying cost function, typically a distance. Originally posed by Monge in the 18th century as the problem of optimally moving mass, the modern Monge–Kantorovich framework treats both the map-based (“Monge”) and plan-based (“Kantorovich”) approaches, underpinned by duality theory and connections to analysis, PDEs, geometry, probability, and mathematical physics. The metric admits generalizations to vector-valued, noncommutative, and distributional settings, and serves as a foundation for both theoretical advances and computational algorithms across pure and applied mathematics.

1. Primal and Dual Formulations

Let $(X,d)$ be a complete separable metric space and let $\mu, \nu$ be Borel probability measures on $X$ with finite first moments, so that the quantities below are finite. The Monge–Kantorovich distance of order 1 (also known as the Wasserstein-1 or earth mover’s distance) is defined in the Kantorovich (plan) formulation as

$W_1(\mu,\nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int_{X \times X} d(x, y) \, d\pi(x, y),$

where $\Pi(\mu, \nu)$ is the set of couplings (transport plans) with marginals $\mu$ and $\nu$ (Kobayashi, 2019). For general costs $c(x, y)$, the classical Kantorovich problem is

$W_c(\mu, \nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int_{X \times X} c(x, y)\, d\pi(x, y),$

with existence of optimal plans guaranteed by compactness and lower semicontinuity arguments (Frungillo, 2024).

The Monge formulation seeks a measurable map $T: X \to X$ with $T_\#\mu = \nu$ that minimizes the cost

$I_M(T) = \int_X d(x, T(x))\, d\mu(x).$

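In general, $W_1(\mu, \nu) \leq \inf_{T_\#\mu = \nu} I_M(T)$, since every admissible map $T$ induces the coupling $(\mathrm{id}, T)_\#\mu$. The infimum on the right is attained with equality exactly when some optimal plan is concentrated on the graph of a measurable map $T$ (i.e., the “plan splits no mass”) (Kobayashi, 2019).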
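A standard example, added here for illustration, shows that the inequality can be strict. Take $X = \mathbb{R}$, $\mu = \delta_0$ and $\nu = \tfrac12\delta_{-1} + \tfrac12\delta_{1}$. Any map $T$ pushes $\delta_0$ forward to the single Dirac mass $\delta_{T(0)}$, so no admissible map exists and the Monge infimum is $+\infty$ (with the usual convention for an infimum over the empty set). The only coupling is the product $\pi = \mu \otimes \nu$, which splits the mass at $0$ in two and gives

$W_1(\mu,\nu) = \tfrac12\,|0-(-1)| + \tfrac12\,|0-1| = 1.$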

The dual (Kantorovich–Rubinstein) formulation is

$W_1(\mu, \nu) = \sup_{f \in \mathrm{Lip}_1(X)} \left( \int f\, d\mu - \int f\, d\nu \right),$

where $\mathrm{Lip}_1(X)$ denotes the set of all real-valued functions on $X$ with Lipschitz constant at most 1 (Martinetti, 2012, Rigo, 2019).
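On a finite subset of the real line the supremum above is a small linear program, which makes the duality easy to check numerically. The sketch below is only an illustration under stated assumptions: the function name `kantorovich_rubinstein_dual` and the three-point example are hypothetical, the measures are assumed to have equal total mass, and SciPy's `linprog` stands in for whatever solver one prefers.

```python
import numpy as np
from scipy.optimize import linprog


def kantorovich_rubinstein_dual(points, mu, nu):
    """Maximise sum_i f_i (mu_i - nu_i) over 1-Lipschitz f on finitely many points of R."""
    x = np.asarray(points, dtype=float)
    n = x.size  # assumes n >= 2
    dist = np.abs(x[:, None] - x[None, :])

    # One inequality f_i - f_j <= d(x_i, x_j) per ordered pair (i, j), i != j.
    rows, rhs = [], []
    for i in range(n):
        for j in range(n):
            if i != j:
                row = np.zeros(n)
                row[i], row[j] = 1.0, -1.0
                rows.append(row)
                rhs.append(dist[i, j])

    # linprog minimises, so the objective is negated.
    objective = -(np.asarray(mu, dtype=float) - np.asarray(nu, dtype=float))
    # Adding a constant to f does not change the objective when the masses agree,
    # so f_0 is pinned to 0 to keep the problem bounded.
    bounds = [(0.0, 0.0)] + [(None, None)] * (n - 1)
    res = linprog(objective, A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=bounds, method="highs")
    if not res.success:
        raise ValueError(res.message)
    return -res.fun, res.x


# mu = delta_0, nu = (delta_1 + delta_3) / 2 on the points {0, 1, 3}
value, potential = kantorovich_rubinstein_dual(
    [0.0, 1.0, 3.0], [1.0, 0.0, 0.0], [0.0, 0.5, 0.5]
)
```

For this example the optimal potential is $f(x) = -x$ on the three points, and the dual value agrees with the transport cost $\tfrac12 \cdot 1 + \tfrac12 \cdot 3 = 2$.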

On finite spaces or with a finite cost matrix $C=(c_{ij})$, the problem becomes a linear program (LP), where the primal variables are transportation tables and dual variables correspond to Kantorovich potentials (Bloch et al., 2025, Pistone et al., 2020).
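The finite primal problem can be written down directly as an LP over the entries of the transportation table. The following is a minimal sketch, not a production solver: the helper name `transport_lp` is hypothetical, the table is flattened row by row, and SciPy's HiGHS backend is assumed to be available.

```python
import numpy as np
from scipy.optimize import linprog


def transport_lp(mu, nu, cost):
    """Minimise sum_ij cost_ij * pi_ij over tables pi >= 0 with row sums mu and column sums nu."""
    mu = np.asarray(mu, dtype=float)
    nu = np.asarray(nu, dtype=float)
    cost = np.asarray(cost, dtype=float)
    m, n = cost.shape

    # The table is flattened row-major, so entry (i, j) sits at index i * n + j.
    row_sums = np.kron(np.eye(m), np.ones((1, n)))  # sum_j pi_ij = mu_i
    col_sums = np.kron(np.ones((1, m)), np.eye(n))  # sum_i pi_ij = nu_j
    res = linprog(cost.ravel(),
                  A_eq=np.vstack([row_sums, col_sums]),
                  b_eq=np.concatenate([mu, nu]),
                  bounds=(0, None), method="highs")
    if not res.success:
        raise ValueError(res.message)
    return res.fun, res.x.reshape(m, n)


# mu = delta_0, nu = (delta_1 + delta_3) / 2 with cost |x_i - x_j| on {0, 1, 3}
x = np.array([0.0, 1.0, 3.0])
w1_value, plan = transport_lp([1.0, 0.0, 0.0], [0.0, 0.5, 0.5],
                              np.abs(x[:, None] - x[None, :]))
```

Here the only feasible table sends half of the mass at $0$ to each of $1$ and $3$, so the optimal value is $2$. The dense constraint matrix has $mn$ columns, which is fine for small examples but is the reason large problems call for specialised solvers.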

2. Metric Properties and Banach Space Extensions

The Monge–Kantorovich metric satisfies

Positivity: $W_1(\mu, \nu) \geq 0$ with equality iff $\mu = \nu$.
Symmetry: $W_1(\mu, \nu) = W_1(\nu, \mu)$.
Triangle inequality: $W_1(\mu, \rho) \leq W_1(\mu, \nu) + W_1(\nu, \rho)$ (Frungillo, 2024, Pistone et al., 2020).
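These axioms can be spot-checked numerically on the real line. The snippet below is a sanity check under assumptions, not a proof: it draws three random probability vectors on a hypothetical six-point grid and uses `scipy.stats.wasserstein_distance`, which computes $W_1$ for one-dimensional distributions.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
support = np.linspace(0.0, 1.0, 6)  # common support of all three measures


def random_measure():
    """Random probability vector on the support points."""
    weights = rng.random(support.size)
    return weights / weights.sum()


def w1(p, q):
    """W_1 between two weight vectors on the shared support."""
    return wasserstein_distance(support, support, u_weights=p, v_weights=q)


mu, nu, rho = random_measure(), random_measure(), random_measure()
checks = {
    "positivity": w1(mu, nu) >= 0.0 and np.isclose(w1(mu, mu), 0.0),
    "symmetry": np.isclose(w1(mu, nu), w1(nu, mu)),
    "triangle": w1(mu, rho) <= w1(mu, nu) + w1(nu, rho) + 1e-12,
}
```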

The MK metric metrizes weak convergence of probability measures on bounded metric spaces (in general, weak convergence together with convergence of first moments) and can be generalized to a norm (the Monge–Kantorovich norm) on the space of finite signed Radon measures, using either the Kantorovich–Rubinstein dual or transport-plan primal representations (Terjék, 2021). Specifically, for a signed measure $\mu$,

$\|\mu\|_{MK} = \sup\left\{ \int_X f\, d\mu : \mathrm{Lip}(f) \leq 1,\ \|f\|_\infty \leq 1 \right\},$

with a corresponding transport-plan minimization, for $\mu$ of total mass zero, via the Jordan decomposition $\mu = \mu^+ - \mu^-$,

$\|\mu\|_{MK} = \inf_{\pi \in \Pi(\mu^+, \mu^-)} \int d(x, y)\, d\pi(x, y).$

The MK norm is equivalent to other classical norms such as the Hanin norm and fits into a Banach-space, convex-analytic framework (Terjék, 2021).
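On a finite metric space the dual expression for $\|\mu\|_{MK}$ is again a linear program, now with the extra box constraint $|f| \leq 1$. The sketch below is illustrative only; `mk_norm` is a hypothetical helper that takes a distance matrix and a vector of signed weights.

```python
import numpy as np
from scipy.optimize import linprog


def mk_norm(dist, signed_mu):
    """sup { sum_i f_i mu_i : |f_i - f_j| <= dist_ij, |f_i| <= 1 } on a finite space."""
    dist = np.asarray(dist, dtype=float)
    weights = np.asarray(signed_mu, dtype=float)
    n = weights.size  # assumes n >= 2

    # Lipschitz constraints f_i - f_j <= dist_ij for every ordered pair.
    rows, rhs = [], []
    for i in range(n):
        for j in range(n):
            if i != j:
                row = np.zeros(n)
                row[i], row[j] = 1.0, -1.0
                rows.append(row)
                rhs.append(dist[i, j])

    # Sup-norm constraint enters as the variable bounds -1 <= f_i <= 1.
    res = linprog(-weights, A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=(-1.0, 1.0), method="highs")
    if not res.success:
        raise ValueError(res.message)
    return -res.fun


def dipole_norm(d):
    """MK norm of delta_a - delta_b for two points at distance d."""
    return mk_norm([[0.0, d], [d, 0.0]], [1.0, -1.0])


near, far = dipole_norm(0.5), dipole_norm(5.0)
```

For the dipole $\delta_a - \delta_b$ the value is $\min(d(a,b), 2)$: the Lipschitz constraint is binding for nearby points, while the sup-norm bound caps the value once the points are more than $2$ apart.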

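On spaces of vector-valued measures (e.g., Hilbert space-valued measures), the Monge–Kantorovich and related norms have been developed. For countably additive measures $\mu$ of bounded variation on a compact metric space $T$ with values in a Hilbert space $X$ (here $X$ denotes the Hilbert space, not the metric space of Section 1), the MK norm is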

$\|\mu\|_{MK} = \sup\left\{ \left|\int_T (f(t) \mid d\mu(t)) \right| : f \in L_1(T, X), \|f\|_{BL} \leq 1 \right\},$

where $\|f\|_{BL}$ is the sum of the sup norm and the Lipschitz constant of $f$ (Chitescu et al., 2014). These norms are central in the study of vector measure convergence and generalized metrics on compact spaces.

3. Existence and Uniqueness of Optimal Maps

The existence of an optimal Monge map is closely tied to regularity assumptions:

If the source measure $\mu_1$ is absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^n$, and the underlying space $\Omega$ is a bounded convex $G_\delta$-subset of $\mathbb{R}^n$ with a metric $\rho$ such that (i) the $\rho$-topology coincides with the Euclidean one, (ii) Euclidean segments are $\rho$-geodesics, and (iii) $\mathcal{L}^n$ is locally doubling for $\rho$, then there exists a $\rho$-optimal transport map $T$ with $T_\#\mu_1 = \mu_2$ and

$\int_\Omega \rho(x, T(x))\, d\mu_1(x) = W_1(\mu_1, \mu_2)$

(Kobayashi, 2019).

For quadratic cost in geodesic metric measure spaces, further regularity (strong nonbranching, doubling, entropy bounds) yields the existence and uniqueness of optimal maps (Brenier-type theorems). The crucial step is showing that for an optimal geodesic plan $\pi$ and Kantorovich potential $\varphi$, the "slope" of $\varphi$ along geodesics equals the displacement cost; uniqueness of transport rays ensures that optimal plans are induced by maps (Ambrosio et al., 2011).

Even in spaces with highly branching geodesics (e.g., Hilbert geometries), the combination of doubling, sufficient Euclidean segments, and regularization leads to map solutions, simplifying numerical computation of $W_1$ (Kobayashi, 2019).
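The simplest setting in which an optimal map can be written down explicitly is the real line with the Euclidean distance, which is far more restrictive than the spaces treated in the results above. For two empirical measures with the same number of equally weighted atoms, matching the sorted samples (the monotone rearrangement) gives a map whose cost equals $W_1$. The sketch below assumes exactly that setting; `monotone_map` is a hypothetical helper name.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def monotone_map(source, target):
    """Send the k-th smallest source atom to the k-th smallest target atom."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    if source.shape != target.shape:
        raise ValueError("both samples must have the same number of atoms")

    order_s = np.argsort(source, kind="stable")
    order_t = np.argsort(target, kind="stable")
    image = np.empty_like(source)
    image[order_s] = target[order_t]  # image[i] is T(source[i])

    cost = float(np.mean(np.abs(source - image)))  # each atom carries mass 1/n
    return image, cost


source = [0.3, 0.1, 0.9]
target = [2.0, 1.0, 1.5]
image, map_cost = monotone_map(source, target)
reference = wasserstein_distance(source, target)  # W_1 of the two empirical measures
```

The comparison with `scipy.stats.wasserstein_distance` is a consistency check: in this setting the Monge cost of the sorted matching and the Kantorovich value coincide.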

4. Advanced Generalizations: Noncommutative, Distributional, and Matrix-Valued Metrics

Noncommutative MK Metric

The Monge–Kantorovich paradigm is extended to noncommutative settings via Connes’ spectral distance: for a spectral triple $(\mathcal{A}, \mathcal{H}, D)$ , the spectral distance is

$d_D(\varphi, \psi) = \sup_{\|[D, \pi(a)]\| \leq 1} |\varphi(a) - \psi(a)|$

for states $\varphi, \psi$ on $\mathcal{A}$. A Monge–Kantorovich–like distance $W_D$ is defined on the state space by duality with "Lipschitz elements" of $\mathcal{A}$ (Martinetti, 2012). For certain examples (e.g., $M_2(\mathbb{C})$, two-sheet models), $W_D$ coincides with $d_D$, and the construction provides an optimal-transport interpretation of metric features in noncommutative geometry, including physical interpretations (e.g., the Higgs field as a transport cost).
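The formula can be evaluated by hand in the standard two-point space, given here as an illustration under the stated assumptions and not as a computation from the cited paper. Take $\mathcal{A} = \mathbb{C}^2$ acting diagonally on $\mathcal{H} = \mathbb{C}^2$, and

$D = \begin{pmatrix} 0 & m \\ \bar m & 0 \end{pmatrix}, \qquad m \in \mathbb{C} \setminus \{0\}.$

For $a = (a_1, a_2) \in \mathcal{A}$, a direct computation gives

$[D, \pi(a)] = (a_2 - a_1) \begin{pmatrix} 0 & m \\ -\bar m & 0 \end{pmatrix}, \qquad \|[D, \pi(a)]\| = |m|\,|a_1 - a_2|.$

With the pure states $\varphi_i(a) = a_i$, the supremum is therefore

$d_D(\varphi_1, \varphi_2) = \sup\{ |a_1 - a_2| : |m|\,|a_1 - a_2| \leq 1 \} = \frac{1}{|m|},$

so the off-diagonal entry of $D$ plays the role of an inverse distance between the two points.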

Matrix Monge–Kantorovich metric

On the space of positive-definite density matrices, a noncommutative Wasserstein metric is constructed by imposing a quantum continuity equation with Lindblad operators and an action functional—yielding a convex optimization problem exhibiting strong duality and metric properties paralleling the classical case. This includes generalizations of Poincaré–Wirtinger inequalities and dynamic Hamilton–Jacobi–type results (Chen et al., 2017).

Distributions and Generalized Distances

The transshipment problem extends the MK metric from differences of measures to source terms $f$ in suitable Banach spaces of distributions (e.g., $X_0(\Omega)$: distributions of order one with zero average). The dual formulation is

$w_1(f) = \sup\left\{ \langle f, u \rangle : \|Du\|_\infty \leq 1 \right\},$

and a primal minimization over divergence-constrained vector measures recovers the desired metric structure and existence of optimal transport densities even for infinite dipole sums (Bouchitté et al., 2013).
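In one dimension the divergence-constrained primal problem can be solved by inspection, which gives a quick check of the duality. On an interval, with $f = \mu - \nu$ of zero total mass and no flux through the endpoints, the constraint $-\sigma' = f$ (sign conventions vary) determines $\sigma$ up to sign as the running sum of $f$, and the cost is $\int |\sigma|$. The sketch below assumes a sorted grid of atoms; `w1_flux_1d` is a hypothetical helper.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def w1_flux_1d(grid, f):
    """Cost of the unique flux on a sorted 1D grid that balances the source term f."""
    grid = np.asarray(grid, dtype=float)
    f = np.asarray(f, dtype=float)
    if np.any(np.diff(grid) < 0):
        raise ValueError("grid must be sorted in increasing order")
    if not np.isclose(f.sum(), 0.0):
        raise ValueError("f must have zero total mass")

    # Net mass that has to cross each gap between neighbouring grid points.
    flux = np.cumsum(f)[:-1]
    return float(np.sum(np.abs(flux) * np.diff(grid)))


grid = np.array([0.0, 1.0, 3.0])
mu = np.array([1.0, 0.0, 0.0])
nu = np.array([0.0, 0.5, 0.5])
flux_cost = w1_flux_1d(grid, mu - nu)
reference = wasserstein_distance(grid, grid, u_weights=mu, v_weights=nu)
```

In the example the flux is $1$ across the gap $[0,1]$ and $\tfrac12$ across $[1,3]$, for a total cost of $1 \cdot 1 + \tfrac12 \cdot 2 = 2$, matching the value returned by SciPy.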

5. Computational Aspects and Recent Extensions

Computational methods for MK distances commonly rely on linear programming or convex optimization. On finite spaces, the problem reduces exactly to an LP, and stochastic (MCMC) solvers exploit table moves that connect any pair of feasible couplings, which ensures ergodicity for annealing algorithms (Pistone et al., 2020). Fully discrete schemes for arbitrary compact metric spaces provably approximate both the optimal value and optimal transport maps, with explicit error bounds governed by the modulus of continuity of the cost function and obtained via barycentric or geometric-median projections (Frungillo, 2024).
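The table-move idea can be sketched for integer-valued transportation tables: a basic move adds one unit at $(i,j)$ and $(k,l)$ and removes one unit at $(i,l)$ and $(k,j)$, which leaves every row and column sum unchanged. The code below is a toy simulated-annealing loop built on such moves. It is an assumption-laden illustration, not the algorithm of the cited paper: the function name, the cooling schedule `temp0 / (1 + step)`, and the Metropolis acceptance rule are all choices made here.

```python
import numpy as np


def anneal_transport(table, cost, steps=20000, temp0=1.0, seed=0):
    """Simulated annealing over integer tables with fixed margins (needs >= 2 rows and columns)."""
    rng = np.random.default_rng(seed)
    table = np.array(table, dtype=int)
    cost = np.asarray(cost, dtype=float)
    m, n = table.shape

    current = float((table * cost).sum())
    best, best_cost = table.copy(), current
    for step in range(steps):
        temp = temp0 / (1 + step)  # hypothetical cooling schedule
        i, k = rng.choice(m, size=2, replace=False)
        j, l = rng.choice(n, size=2, replace=False)
        # The move removes a unit from (i, l) and (k, j); skip it if that would go negative.
        if table[i, l] == 0 or table[k, j] == 0:
            continue
        delta = cost[i, j] + cost[k, l] - cost[i, l] - cost[k, j]
        # Metropolis rule: always accept improvements, sometimes accept uphill moves.
        if delta <= 0 or rng.random() < np.exp(-delta / temp):
            table[i, j] += 1
            table[k, l] += 1
            table[i, l] -= 1
            table[k, j] -= 1
            current += delta
            if current < best_cost:
                best, best_cost = table.copy(), current
    return best, best_cost


# Start from the worst table for this cost; both margins are (2, 2).
start = np.array([[0, 2], [2, 0]])
cost = np.array([[0.0, 1.0], [1.0, 0.0]])
best_table, best_cost = anneal_transport(start, cost, steps=2000)
```

Because every accepted move preserves the margins, the chain never leaves the feasible set; in this two-by-two example two improving moves already reach the diagonal table with cost zero.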

Generalizations include multi-marginal barycenter problems in classical and fibered/parametrized settings. The disintegrated Monge–Kantorovich metrics $D_{p,q}$ on metric fiber bundles (parameterized over a base $\Omega$ with fiber $Y$) extend the classical $W_p$ by considering fiber-wise metrics and aggregating via $L^q$ norms. These metrics unify the classical, linearized, and sliced Wasserstein distances, enjoy completeness and geodesic properties, possess strong duality, and deliver existence and uniqueness of barycenters under minimal geometric assumptions (notably on arbitrary complete Riemannian manifolds, without curvature or injectivity-radius restrictions) (Kitagawa et al., 2024).
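The "fiber-wise distance, then $L^q$ aggregation" recipe can be made concrete in a toy discretization. The sketch below assumes a finite base with given weights, fibers equal to the real line, and each fiber measure represented by equally many equally weighted samples. It follows the verbal description above and is not the definition from the cited paper; both function names are hypothetical.

```python
import numpy as np


def wp_1d(x, y, p):
    """W_p between two empirical measures on R with equally many, equally weighted atoms."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("both samples must have the same number of atoms")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))


def disintegrated_distance(fibers_mu, fibers_nu, base_weights, p=2, q=2):
    """Weighted L^q norm over the base of the fiber-wise W_p distances."""
    weights = np.asarray(base_weights, dtype=float)
    fiber_dists = np.array([wp_1d(a, b, p) for a, b in zip(fibers_mu, fibers_nu)])
    return float(np.sum(weights * fiber_dists ** q) ** (1.0 / q))


# Two base points with weight 1/2 each; fiber-wise W_2 distances are 1 and 3.
fibers_mu = [[0.0, 1.0], [0.0, 0.0]]
fibers_nu = [[1.0, 2.0], [3.0, 3.0]]
d22 = disintegrated_distance(fibers_mu, fibers_nu, [0.5, 0.5], p=2, q=2)
```

In the example the aggregate is $\sqrt{\tfrac12 \cdot 1^2 + \tfrac12 \cdot 3^2} = \sqrt{5}$.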

6. Connections to Discrete, Polytope, and Majorization Structures

In discrete settings, the Monge–Kantorovich metric is intimately connected to the geometry of doubly stochastic matrices, permutohedra, and majorization theory. The LP duality exposes structural links to the Schur–Horn and Birkhoff–von Neumann theorems: for uniform marginals on $n$ points, couplings correspond (up to the factor $1/n$) to doubly stochastic matrices, which are convex combinations of permutation matrices (i.e., points of the Birkhoff polytope), and a linear cost attains its minimum at an extreme point, that is, at a permutation matrix; permutohedra and majorization enter through the action of these matrices on vectors (Bloch et al., 2025). This correspondence extends to infinite settings through rearrangement theory and doubly stochastic operators on $L^1([0,1])$, providing a robust convex-analytic backbone to optimal transport on general measure spaces.
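The Birkhoff–von Neumann connection has a directly checkable consequence: for uniform marginals on $n$ points, the transport LP and the assignment problem have the same optimal value, because a linear cost over the set of doubly stochastic matrices is minimized at a permutation matrix. The sketch below compares the two with SciPy; the function name and the $3 \times 3$ cost matrix are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment, linprog


def compare_assignment_and_lp(cost):
    """Return (best permutation cost / n, LP value with uniform marginals, permutation)."""
    cost = np.asarray(cost, dtype=float)
    n = cost.shape[0]

    # Optimal permutation: row i is matched to column perm[i].
    rows, perm = linear_sum_assignment(cost)
    perm_value = cost[rows, perm].sum() / n

    # Transport LP with marginals (1/n, ..., 1/n) on both sides, table flattened row-major.
    row_sums = np.kron(np.eye(n), np.ones((1, n)))
    col_sums = np.kron(np.ones((1, n)), np.eye(n))
    res = linprog(cost.ravel(),
                  A_eq=np.vstack([row_sums, col_sums]),
                  b_eq=np.full(2 * n, 1.0 / n),
                  bounds=(0, None), method="highs")
    if not res.success:
        raise ValueError(res.message)
    return perm_value, res.fun, perm


cost = np.array([[4.0, 1.0, 3.0],
                 [2.0, 0.0, 5.0],
                 [3.0, 2.0, 2.0]])
perm_value, lp_value, perm = compare_assignment_and_lp(cost)
```

For this matrix the best permutation has total cost $1 + 2 + 2 = 5$, so both values equal $5/3$.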

7. Significance and Outlook

The Monge–Kantorovich metric combines analytic, geometric, and convex-optimization features that underpin a wide range of theoretical developments and computational applications. Its power derives from its duality structure, its topological properties, and its capacity for generalization, accommodating vector, noncommutative, and distributional settings, as well as fibered parametrizations and barycentric operations. The metric’s flexibility and analytic structure support both high-level theoretical advances (e.g., in geometric analysis and mathematical physics) and robust, scalable computation. Recent work eliminates restrictive geometric hypotheses for barycenter uniqueness, harnesses disintegration and fiber bundle techniques, and extends optimal transport duality to fundamentally new settings (Kitagawa et al., 2024, Kobayashi, 2019, Chen et al., 2017, Martinetti, 2012, Bouchitté et al., 2013, Chitescu et al., 2014, Terjék, 2021).