Structured Transforms Across Spaces with Cost-Regularized Optimal Transport
Abstract: Matching a source to a target probability measure is often solved by instantiating a linear optimal transport (OT) problem, parameterized by a ground cost function that quantifies the discrepancy between points. When these measures live in the same metric space, the ground cost often defaults to that space's distance. When instantiated across two different spaces, however, choosing that cost in the absence of aligned data is a conundrum. As a result, practitioners often resort to solving instead a quadratic Gromov-Wasserstein (GW) problem. In this work, we exploit a parallel between GW and cost-regularized OT, the regularized minimization of a linear OT objective parameterized by a ground cost. We use this cost-regularized formulation to match measures across two different Euclidean spaces, where the cost is evaluated between transformed source points and target points. We show that several quadratic OT problems fall in this category, and consider enforcing structure in the linear transform (e.g., sparsity) by introducing structure-inducing regularizers. We provide a proximal algorithm to extract such transforms from unaligned data, and demonstrate its applicability to single-cell spatial transcriptomics/multiomics matching tasks.
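The abstract describes alternating between a linear OT problem, whose ground cost is evaluated between linearly transformed source points and target points, and a proximal update that promotes structure (e.g., sparsity) in the transform. The sketch below illustrates that general recipe, not the paper's exact algorithm: it alternates an entropic (Sinkhorn) coupling with a proximal gradient step on the transform `A`, using soft-thresholding as the proximal operator of an elementwise l1 regularizer. All function names, step sizes, and the choice of entropic regularization are assumptions made for illustration.

```python
import numpy as np

def sinkhorn(C, a, b, eps=0.1, iters=200):
    # Entropic OT: returns a coupling P with marginals approximately a, b.
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def soft_threshold(A, t):
    # Proximal operator of t * ||A||_1 (elementwise shrinkage).
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def fit_sparse_transform(X, Y, lam=0.05, eps=0.1, outer=30, step=0.1):
    # X: (n, d) source points, Y: (m, p) target points, uniform weights.
    # Alternates: (i) OT coupling for cost ||A x_i - y_j||^2,
    #             (ii) proximal gradient step on A with l1 penalty lam.
    n, d = X.shape
    m, p = Y.shape
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    A = np.zeros((p, d))                      # linear map R^d -> R^p
    P = np.outer(a, b)
    for _ in range(outer):
        Z = X @ A.T                           # transformed source points
        C = ((Z[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        P = sinkhorn(C / (C.mean() + 1e-9), a, b, eps)
        # Gradient of sum_ij P_ij ||A x_i - y_j||^2 with respect to A.
        G = 2.0 * (Z.T @ (P.sum(1)[:, None] * X)) - 2.0 * (Y.T @ P.T @ X)
        A = soft_threshold(A - step * G, step * lam)
    return A, P
```

The cost matrix is rescaled by its mean before Sinkhorn purely for numerical stability of the `exp`; the paper's actual solver and regularizers may differ.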