Sharp local sparsity of regularized optimal transport

Published 1 Apr 2026 in math.AP, math.NA, math.PR, and math.ST | (2604.00843v1)

Abstract: In recent years, the use of entropy-regularized optimal transport with $L^p$-type entropies has become increasingly popular. In this setting, the solutions are sparse, in the sense that the support of the regularized optimal coupling, $\mathrm{supp}(\pi_\varepsilon)$, shrinks to the support of the original optimal transport problem as $\varepsilon \to 0$. The main open question concerns the rate of this convergence. In this paper, we obtain sharp local results away from the boundary. We prove that the supports $\mathrm{supp}(\pi_\varepsilon(\cdot \mid x))$ of the conditional measures, $\pi_\varepsilon(\cdot \mid x)$, behave like balls of radius $\varepsilon^{\frac{1}{d(p-1)+2}}$. This allows us to show that the regularized potentials are uniformly strongly convex and to derive the rate of convergence of these potentials toward their unregularized limit. Our results generalize the results of (González-Sanz and Nutz, SIAM J. Math. Anal.) and (Wiesel and Xu, Ibid.) to the multivariate case and beyond the case of self-transport.


Authors (3)

Summary

The paper establishes sharp local sparsity rates for conditionals of regularized optimal transport, with the support shrinking at the rate $\varepsilon^{\frac{1}{d(p-1)+2}}$.
The authors employ convex-analytic and PDE-based techniques, leveraging Legendre duality and interior regularity to prove uniform strong convexity and convergence of ROT potentials.
The findings offer actionable insights for improving algorithmic stability and error analysis in high-dimensional regimes where regularized OT is applied.

Sharp Local Sparsity of Regularized Optimal Transport

Problem Setting and Motivation

This paper analyzes sharp local sparsity properties of regularized optimal transport (ROT) problems with $L^p$-type entropy regularization for $p \in (1,2]$. The problem considered is

${\rm ROT}_{\varepsilon,p} := \inf_{\pi\in \Pi(\lambda,\mu), \; \pi \ll \lambda\otimes \mu} \int \tfrac{1}{2} \|x-y\|^2 \,d\pi + \varepsilon \int h_p \left(\frac{d\pi}{d(\lambda \otimes \mu)} \right) d(\lambda \otimes \mu)$

where $h_p(z) = \tfrac{|z|^p -1}{p-1}$, $\lambda$ and $\mu$ are probability measures with smooth densities on compact supports in $\mathbb{R}^d$, and $\Pi(\lambda,\mu)$ denotes the set of couplings with marginals $\lambda$ and $\mu$. $L^p$-type regularization includes quadratic regularization ($p = 2$) and interpolates between classical OT and entropic OT.
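To make the objective concrete, the following sketch evaluates a discrete analogue of ${\rm ROT}_{\varepsilon,p}$ for a given coupling on finite point sets. It is an illustration under stated assumptions, not code from the paper: the names `h_p` and `rot_objective`, the one-dimensional points, and the uniform weights are hypothetical choices, and the coupling is taken as given rather than optimized.

```python
def h_p(z, p):
    """L^p-type entropy h_p(z) = (|z|^p - 1) / (p - 1), for p > 1."""
    return (abs(z) ** p - 1.0) / (p - 1.0)


def rot_objective(xs, ys, lam, mu, pi, eps, p):
    """Discrete ROT objective for a coupling matrix pi[i][j].

    Assumes 1-D points xs, ys with strictly positive weights lam, mu,
    and that pi has marginals lam and mu (this is not checked here).
    """
    transport = 0.0
    penalty = 0.0
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            ref = lam[i] * mu[j]        # reference product measure
            density = pi[i][j] / ref    # d(pi) / d(lam x mu)
            transport += 0.5 * (x - y) ** 2 * pi[i][j]
            penalty += h_p(density, p) * ref
    return transport + eps * penalty


# Hypothetical toy data: two points on each side, uniform weights.
xs, ys = [0.0, 1.0], [0.0, 1.0]
lam, mu = [0.5, 0.5], [0.5, 0.5]
independent = [[0.25, 0.25], [0.25, 0.25]]  # product coupling, density 1
diagonal = [[0.5, 0.0], [0.0, 0.5]]         # coupling concentrated on x = y

value_independent = rot_objective(xs, ys, lam, mu, independent, eps=0.1, p=2.0)
value_diagonal = rot_objective(xs, ys, lam, mu, diagonal, eps=0.1, p=2.0)
```

For the product coupling the density is identically 1, so the penalty vanishes because $h_p(1) = 0$ and only the transport cost remains. The diagonal coupling pays no transport cost but a positive penalty, which is the trade-off that $\varepsilon$ controls.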

Unlike the entropic (EOT) case, where optimal plans have full support, ROT plans are inherently sparse: for small $\varepsilon$, the support of the ROT plan $\pi_\varepsilon$ contracts toward the support of the OT plan. This sparsification is exploited in various computational and statistical applications, as it enables reduced sample complexity and tractable computations with high-dimensional data.
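Why ROT plans are sparse can be seen from a formal first-order computation. The following is a sketch rather than the paper's own derivation: the dual potentials $f$ and $g$ are notation assumed here, and the constants follow the normalization of $h_p$ given above. Writing $z = \frac{d\pi}{d(\lambda \otimes \mu)}$ and $c(x,y) = \tfrac{1}{2}\|x-y\|^2$, stationarity of the Lagrangian on the set where $z > 0$ reads

$c(x,y) + \varepsilon\, h_p'(z(x,y)) = f(x) + g(y), \qquad h_p'(z) = \tfrac{p}{p-1}\, z^{p-1} \quad (z > 0),$

which gives

$z(x,y) = \left( \tfrac{p-1}{p\,\varepsilon} \big( f(x) + g(y) - c(x,y) \big) \right)_+^{\frac{1}{p-1}}.$

The density vanishes wherever $f(x) + g(y) \le c(x,y)$, so the plan is supported on a strict subset of the support of $\lambda \otimes \mu$. In the entropic case the analogous formula is an exponential of $(f(x) + g(y) - c(x,y))/\varepsilon$, which is strictly positive everywhere.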

One of the main open questions is to determine, locally and globally, sharp rates at which this contraction of support manifests as $\varepsilon \to 0$, especially in the interior of the support, and to establish corresponding quantitative regularity properties for the associated convex dual potentials.

Sharp Local Sparsity Rates

The core result provides precise asymptotics for the contraction of the support of regularized optimal couplings, in particular for the fibers $\mathrm{supp}(\pi_\varepsilon(\cdot \mid x))$. The paper proves that, for measures on $\mathbb{R}^d$ with sufficiently regular densities and for $x$ in the interior of the support, the fiber behaves like a ball of radius

$C\,\varepsilon^{\frac{1}{d(p-1)+2}},$

where $C$ is controlled by local geometry. This means each conditional support contracts, around the point determined by the dual solution, like a Euclidean ball with radius scale $\varepsilon^{\frac{1}{d(p-1)+2}}$ as $\varepsilon \to 0$. The rate is shown to be sharp by explicit computations in the case of self-transport on tori, and it remains valid for all $p \in (1,2]$ and every dimension $d$.
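The dependence of the rate on $d$ and $p$ is easy to tabulate. The helper below is a minimal sketch with hypothetical function names; it only evaluates the exponent $\frac{1}{d(p-1)+2}$ stated in the abstract and ignores the geometry-dependent constant.

```python
def sparsity_exponent(d, p):
    """Exponent a in the radius scale eps**a, with a = 1 / (d*(p-1) + 2)."""
    if d < 1 or not (1.0 < p <= 2.0):
        raise ValueError("expected d >= 1 and p in (1, 2]")
    return 1.0 / (d * (p - 1.0) + 2.0)


def radius_scale(eps, d, p):
    """Radius scale eps**(1/(d(p-1)+2)), without the geometric constant."""
    return eps ** sparsity_exponent(d, p)


# Quadratic regularization (p = 2): the exponent is 1/(d+2).
exponents_p2 = {d: sparsity_exponent(d, 2.0) for d in (1, 2, 3, 10)}

# As p approaches 1 the exponent approaches 1/2 in any fixed dimension.
near_entropic = sparsity_exponent(10, 1.001)
```

For fixed $p$ the exponent decreases with the dimension, so the support contracts more slowly in $\varepsilon$ as $d$ grows.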

An important technical innovation is the localized analysis: the authors establish these bounds away from the boundary of the support, with explicit dependence on the distance to the boundary. These quantitative estimates generalize and extend the one-dimensional rates in [Wiesel & Xu, SIAM J. Math. Anal., 2025] and previous work [González-Sanz & Nutz, SIAM J. Math. Anal., forthcoming] to fully multivariate settings and general marginals.

Uniform Convexity and Convergence of Dual Potentials

A second principal contribution is the proof of uniform strong convexity (in the sense of lower bounds on the Hessian) for the regularized dual potential in interior regions. Specifically, for any compact set $K$ in the interior, the Hessian of the regularized potential is bounded below by a positive constant for all $x \in K$ and all sufficiently small $\varepsilon$, with the constant depending on the distance to the boundary. The proof relies on sharp a priori regularity estimates for the ROT potential; the restriction to the interior is dictated by the boundary-layer structure of the ROT problem.

The authors further provide optimal convergence rates for the ROT transport maps toward their unregularized limit, which is determined by the unique potential for the classical OT problem; the distances are taken over points in the interior. The analysis combines convex-geometric arguments with careful duality theory and differentiability properties of the regularized potentials.

Explicit Model and Sharpness

The main theoretical predictions are validated via explicit solutions in the case of self-transport with Lebesgue marginals on the torus $\mathbb{T}^d$, where the problem is exactly solvable. In this setting, the dual optimizer is constant, so the structure of $\pi_\varepsilon$ is readily analyzed. The asymptotics of the support diameter exactly match $\varepsilon^{\frac{1}{d(p-1)+2}}$, demonstrating that the sharpness of the derived rates extends beyond abstract upper and lower bounds to concrete examples.
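The torus example lends itself to a quick numerical check of the exponent. The sketch below rests on assumptions not spelled out in this summary: the torus is $\mathbb{R}^d/\mathbb{Z}^d$ with cost $\tfrac{1}{2}\,\mathrm{dist}(x,y)^2$, the sum of the dual potentials is a constant $a$, the plan density has the positive-part form $\big(\tfrac{p-1}{p\,\varepsilon}(a - c)\big)_+^{1/(p-1)}$ that follows from formal first-order optimality, and the support ball is small enough not to wrap around the torus. Function names are hypothetical. Imposing unit mass on each fiber and rescaling to the unit ball gives $r^{d+2q} \big(\tfrac{p-1}{2p\,\varepsilon}\big)^q I_{d,q} = 1$ with $q = \tfrac{1}{p-1}$ and $I_{d,q} = \int_{|v|<1} (1-|v|^2)^q \, dv$.

```python
import math


def unit_ball_integral(d, q):
    """Integral of (1 - |v|^2)^q over the unit ball of R^d.

    Closed form: pi^(d/2) * Gamma(q + 1) / Gamma(d/2 + q + 1).
    """
    return math.pi ** (d / 2) * math.gamma(q + 1) / math.gamma(d / 2 + q + 1)


def torus_support_radius(eps, d, p):
    """Radius r of the conditional support for self-transport on the flat torus.

    Solves r^(d + 2q) * ((p - 1) / (2 p eps))^q * I(d, q) = 1 with q = 1/(p-1).
    Only meaningful while r < 1/2, i.e. while the ball does not wrap around.
    """
    q = 1.0 / (p - 1.0)
    mass_at_unit_radius = ((p - 1.0) / (2.0 * p * eps)) ** q * unit_ball_integral(d, q)
    r = mass_at_unit_radius ** (-1.0 / (d + 2.0 * q))
    if r >= 0.5:
        raise ValueError("eps too large: support would wrap around the torus")
    return r


def loglog_slope(d, p, eps_small=1e-8, eps_large=1e-6):
    """Empirical exponent of r(eps) between two small values of eps."""
    r_small = torus_support_radius(eps_small, d, p)
    r_large = torus_support_radius(eps_large, d, p)
    return math.log(r_large / r_small) / math.log(eps_large / eps_small)
```

Under these assumptions $r(\varepsilon)$ is an exact power law, so `loglog_slope(d, p)` reproduces $\frac{1}{d(p-1)+2}$ up to rounding; for $d = 1$, $p = 2$ the formula reduces to $r = (3\varepsilon)^{1/3}$. For $p$ very close to 1 the exponent $q$ becomes large and the intermediate powers can overflow in floating point.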

Comparison with Prior Work

The results strictly strengthen prior local and global support contraction bounds for quadratic ROT ($p = 2$) to arbitrary $L^p$-ROT and general dimensions, providing effective control in the multivariate and non-symmetric, non-self-transport settings. This unifies, sharpens, and extends earlier foundational studies on ROT sample complexity [González-Sanz, del Barrio, and Nutz, 12 Nov 2025] and sparsity structure [Wiesel & Xu, SIAM J. Math. Anal., 2025; Zhang et al., 2023].

A key distinction from the entropic regime ($p \to 1$) is emphasized: in contrast to EOT, where plans have full support for every $\varepsilon$, ROT produces couplings whose support collapses to that of the OT solution at a quantifiable rate as the regularization vanishes.
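The structural difference can be seen by comparing the two density formulas on a single fiber. This is a hedged illustration: the exponential form for EOT and the positive-part form for $L^p$ regularization are the standard shapes obtained from first-order optimality, but the constant `a` below merely stands in for the sum of dual potentials and is not an actual optimizer; the grid and function names are hypothetical.

```python
import math


def eot_density(cost, eps, potential_sum=0.0):
    """Entropic plan density exp((f + g - c) / eps): strictly positive."""
    return math.exp((potential_sum - cost) / eps)


def rot_density(cost, eps, p, potential_sum):
    """L^p-regularized density ((p-1)/(p eps) * (f + g - c))_+^(1/(p-1))."""
    slack = potential_sum - cost
    if slack <= 0.0:
        return 0.0
    return ((p - 1.0) / (p * eps) * slack) ** (1.0 / (p - 1.0))


# Hypothetical 1-D fiber: x = 0, y on a grid in [-1, 1], cost = |x - y|^2 / 2.
ys = [k / 10.0 for k in range(-10, 11)]
costs = [0.5 * y * y for y in ys]
eps, p, a = 0.05, 2.0, 0.03  # 'a' stands in for f(x) + g(y); not an optimizer

eot_values = [eot_density(c, eps) for c in costs]
rot_values = [rot_density(c, eps, p, a) for c in costs]

eot_zeros = sum(1 for v in eot_values if v == 0.0)
rot_support = [y for y, v in zip(ys, rot_values) if v > 0.0]
```

On this grid the entropic density is positive at every point, while the positive-part density is nonzero only for the grid points with $|y| < \sqrt{2a}$.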

Implications and Future Directions

The established rates have direct implications for:

Algorithmic design: The sharp local contraction rate can be used to localize computations, resulting in scalable algorithms for high-dimensional regularized OT, exploiting support sparsity for computational efficiency.
Statistical analysis: Explicit rates yield near-optimal bounds for the sample complexity of ROT estimators, crucial in empirical applications such as domain adaptation, sample-based estimation, and generative modeling.
Theory of regularized variational problems: The methods extend to other divergence-regularized OT models, potentially yielding parallel results for more general cost functions and regularizers.

The uniform convexity results may inform new approaches in understanding regularization-induced smoothing in variational inference and PDE-based transport models, possibly connecting to the qualitative analysis of the porous medium equation as referenced in related work.

Potential extensions include:

Quantitative boundary layer estimates bridging interior and global results.
Broadening the analysis to non-Euclidean geometries and singular measures.
Application to structured machine learning tasks where induced sparsity of $\pi_\varepsilon$ is desirable.

Conclusion

This work delivers a mathematically rigorous characterization of the local geometric structure of regularized optimal transport with $L^p$-type entropies, providing precise, dimension- and regularization-dependent rates for support contraction, uniform strong convexity of dual potentials, and convergence to classic OT solutions. These results establish definitive benchmarks for both theoretical understanding and practical deployment of ROT in high-dimensional transport and statistical learning settings (2604.00843).