Structured Transforms Across Spaces with Cost-Regularized Optimal Transport

Published 9 Nov 2023 in cs.LG, math.OC, and stat.ML | (2311.05788v2)

Abstract: Matching a source to a target probability measure is often solved by instantiating a linear optimal transport (OT) problem, parameterized by a ground cost function that quantifies discrepancy between points. When these measures live in the same metric space, the ground cost often defaults to its distance. When instantiated across two different spaces, however, choosing that cost in the absence of aligned data is a conundrum. As a result, practitioners often resort to solving instead a quadratic Gromov-Wasserstein (GW) problem. We exploit in this work a parallel between GW and cost-regularized OT, the regularized minimization of a linear OT objective parameterized by a ground cost. We use this cost-regularized formulation to match measures across two different Euclidean spaces, where the cost is evaluated between transformed source points and target points. We show that several quadratic OT problems fall in this category, and consider enforcing structure in the linear transform (e.g., sparsity), by introducing structure-inducing regularizers. We provide a proximal algorithm to extract such transforms from unaligned data, and demonstrate its applicability to single-cell spatial transcriptomics/multiomics matching tasks.


Summary

  • The paper demonstrates equivalence between quadratic OT formulations like Gromov-Wasserstein and cost-regularized OT to impose structured transforms across high-dimensional spaces.
  • The paper introduces Prox-ROT, a proximal algorithm leveraging sparsity and low-rank regularizers to efficiently handle structured linear transforms in complex datasets.
  • The method extends Monge map theory to linear, cost-regularized settings and shows strong numerical improvements in single-cell and spatial transcriptomics data integration.

Structured Transforms Across Spaces with Cost-Regularized Optimal Transport

The paper "Structured Transforms Across Spaces with Cost-Regularized Optimal Transport" addresses a significant challenge in optimal transport (OT): matching probability distributions across heterogeneous high-dimensional spaces when no ground cost function is given. The authors propose an approach that leverages cost-regularized optimal transport to solve such problems, including those encountered in single-cell spatial transcriptomics and multi-omics integration.

Overview of Contributions

  1. Cost-Regularized OT Framework: The paper provides a theoretical foundation by demonstrating equivalence between quadratic OT formulations—like Gromov-Wasserstein (GW)—and cost-regularized OT problems. This equivalence makes it possible to impose structure on transport tasks across different Euclidean spaces.
  2. Structured Linear Transforms: The authors propose a proximal algorithm, Prox-ROT, that enforces structure on the linear transformation between spaces. This is achieved through structure-inducing regularizers, such as sparsity and low-rank penalties, which are particularly beneficial for high-dimensional data.
  3. Monge Map Existence and Extension: The work extends Monge map theory, traditionally developed for OT problems where both distributions live in the same space, to linear cost-regularized problems across spaces. This yields new theoretical insights into the existence and practical computation of Monge maps in these settings.
  4. Practical Applications and Numerical Results: The paper applies these theoretical insights to real-world datasets, showing improvements over existing methods in tasks like integrating multi-omics single-cell data and aligning spatial transcriptomics measurements.
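The general recipe described above—alternating an OT solve, with the cost evaluated between transformed source points and target points, against a proximal update on the linear transform—can be sketched as follows. This is a hedged NumPy illustration, not the authors' Prox-ROT implementation: the function names, the plain Sinkhorn solver, the step size, and the choice of an entrywise l1 (sparsity) regularizer are all assumptions made for the example.

```python
import numpy as np

def sinkhorn(C, a, b, eps=0.1, n_iter=200):
    """Entropic OT coupling for cost matrix C and marginals a, b."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def soft_threshold(A, tau):
    """Proximal operator of tau * ||A||_1 (entrywise soft-thresholding)."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def prox_rot_sketch(X, Y, lam=0.1, step=0.1, n_outer=50):
    """Alternate (i) an entropic OT solve with cost ||A x_i - y_j||^2 and
    (ii) a proximal-gradient step on A with the l1 penalty lam * ||A||_1."""
    n, d = X.shape
    m, p = Y.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    A = np.zeros((p, d))                    # linear transform R^d -> R^p
    for _ in range(n_outer):
        XA = X @ A.T                        # transformed source points, (n, p)
        C = ((XA[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        P = sinkhorn(C / max(C.mean(), 1e-12), a, b)  # normalize cost scale
        # gradient of sum_ij P_ij * ||A x_i - y_j||^2 with respect to A
        G = 2.0 * (((P.sum(1)[:, None] * XA) - P @ Y).T @ X)
        A = soft_threshold(A - step * G, step * lam)
    return A, P
```

The soft-thresholding step is what produces the sparse transform; swapping it for singular-value thresholding would instead encourage low rank, mirroring the other regularizer the summary mentions.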

Strong Numerical Results and Implications

The paper reports enhanced performance in single-cell data integration tasks measured by Label Transfer Accuracy and demonstrates efficient handling of high-dimensional spatial transcriptomics data through adaptive feature selection and low-rank solution spaces. These results suggest the method's practicality and efficiency in processing large datasets common in modern biological research.
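Label Transfer Accuracy is computed from an estimated coupling between an annotated source dataset and a target dataset. The sketch below shows a standard way to score it—predicting each target cell's label as the source label that sends it the most coupling mass—offered as an assumed recipe, not code from the paper.

```python
import numpy as np

def label_transfer_accuracy(P, source_labels, target_labels):
    """Predict each target point's label as the source label sending it the
    most coupling mass in P, then score against the ground-truth labels."""
    source_labels = np.asarray(source_labels)
    labels = np.unique(source_labels)
    # mass each label's source points send to every target point: (n_labels, m)
    mass = np.stack([P[source_labels == c].sum(axis=0) for c in labels])
    pred = labels[mass.argmax(axis=0)]
    return float((pred == np.asarray(target_labels)).mean())
```

For instance, a coupling that routes each annotated cell's mass to a target cell of the same type scores 1.0, while a coupling that mixes types scores proportionally lower.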

Implications for Future Work

This research opens pathways to further exploration of regularization techniques within the field of optimal transport, particularly for applications in bioinformatics. Future work may consider extending these techniques to neural network architectures or exploring different types of regularization approaches to tackle even more complex multimodal datasets.

In summary, the paper presents a methodologically sound approach to handling complex multivariate distribution matching problems by bridging gaps between linear and quadratic OT frameworks through cost-regularization. By offering theoretical results along with practical algorithmic solutions, this work significantly enriches the toolkit for researchers dealing with high-dimensional, heterogeneous data in fields like computational biology and beyond.