Riemannian Neural OT Maps
- Riemannian Neural OT maps are neural-network parameterizations of optimal transport on Riemannian manifolds that leverage the manifold’s geometry for continuous and equivariant solutions.
- They use c-concave potentials and neural embeddings to efficiently solve the Monge OT problem while ensuring symmetry and reduced computational complexity compared to discrete methods.
- Empirical studies show that RNOT maps deliver state-of-the-art performance in generative modeling, scientific simulations, and geometric operator learning with significant speed and memory gains.
Riemannian Neural Optimal Transport (RNOT) maps are neural-network-based parameterizations of optimal transport (OT) maps between probability measures on Riemannian manifolds. By leveraging both geometric structure and scalable machine learning architectures, RNOT maps enable tractable, continuous, and equivariant solutions to the Monge OT problem in non-Euclidean geometries—central for applications in generative modeling, scientific simulation, and geometric operator learning (Rezende et al., 2021, Micheli et al., 3 Feb 2026, Li et al., 26 Jul 2025). This article surveys mathematical foundations, neural architectures, numerical and statistical properties, empirical performance, and practical integration into geometric operator learning.
1. Mathematical Foundations of Riemannian Neural OT
The goal is to solve the quadratic-cost Monge OT problem on a compact Riemannian manifold $(\mathcal{M},g)$: $\inf_{T:\,T_\#\mu=\nu} \int_{\mathcal{M}} \frac{1}{2}d\bigl(x,T(x)\bigr)^2\,\mathrm{d}\mu(x)$ where $\mu$ and $\nu$ are probability measures absolutely continuous with respect to the Riemannian volume, and $d$ is the geodesic distance induced by $g$.
By McCann’s theorem, the unique optimal map is induced by a $c$-concave potential $\phi$, for the cost $c(x,y) = \frac{1}{2}d(x,y)^2$: $T(x) = \exp_x\bigl(-\nabla\phi(x)\bigr)$, with $\phi = (\phi^c)^c$, and
$\phi^c(x) = \inf_{y\in\mathcal{M}}\Bigl(\frac{1}{2}d(x,y)^2 - \phi(y)\Bigr)$
The Kantorovich (dual, or semi-dual) problem, maximized over $c$-concave potentials, characterizes the OT cost and the structure of optimal maps. In practice, this machinery underpins all Riemannian Neural OT constructions (Micheli et al., 3 Feb 2026, Rezende et al., 2021).
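The $c$-transform can be approximated numerically by replacing the infimum with a minimum over a sample grid. A minimal NumPy sketch on the circle $S^1$ (chosen here for concreteness, not taken from the source), where the geodesic distance is $d(x,y)=\min(|x-y|,\,2\pi-|x-y|)$:

```python
import numpy as np

def geodesic_dist_circle(x, y):
    """Pairwise geodesic distance on S^1; points are angles in [0, 2*pi)."""
    d = np.abs(x[:, None] - y[None, :])
    return np.minimum(d, 2 * np.pi - d)

def c_transform(phi_vals, ys, xs):
    """phi^c(x) = inf_y ( 0.5*d(x,y)^2 - phi(y) ), with the infimum
    approximated by a minimum over the grid ys."""
    cost = 0.5 * geodesic_dist_circle(xs, ys) ** 2   # shape (n_x, n_y)
    return (cost - phi_vals[None, :]).min(axis=1)

ys = np.linspace(0, 2 * np.pi, 512, endpoint=False)   # dense y-grid
xs = np.linspace(0, 2 * np.pi, 64, endpoint=False)    # evaluation points
phi = 0.1 * np.sin(ys)                                # a smooth test potential
phi_c = c_transform(phi, ys, xs)
```

By construction, $\phi^c(x) + \phi(y) \le \frac{1}{2}d(x,y)^2$ pointwise, which is a useful sanity check on any discrete $c$-transform implementation.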
2. Neural Parameterization and Architectural Design
RNOT frameworks parameterize the potential (or pre-potential) using neural networks. The construction proceeds by embedding $\mathcal{M}$ into Euclidean space using a continuous injective landmark map $\iota(x) = \bigl(d(x,z_1),\dots,d(x,z_m)\bigr)$, where $z_1,\dots,z_m$ are landmark points on $\mathcal{M}$ (e.g., via farthest-point sampling), ensuring $\iota$ is a topological embedding for large enough $m$.
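One concrete realization of such a landmark construction, sketched here as an illustration rather than the source's exact recipe, maps each point to its vector of geodesic distances to farthest-point-sampled landmarks on $\mathbb{S}^2$:

```python
import numpy as np

def sphere_geodesic(X, Y):
    """Pairwise geodesic distances between unit vectors: arccos(<x, y>)."""
    return np.arccos(np.clip(X @ Y.T, -1.0, 1.0))

def farthest_point_landmarks(points, m):
    """Greedy farthest-point sampling of m landmarks from `points`."""
    idx = [0]
    dmin = sphere_geodesic(points, points[idx]).min(axis=1)
    for _ in range(m - 1):
        idx.append(int(dmin.argmax()))
        dmin = np.minimum(dmin, sphere_geodesic(points, points[[idx[-1]]])[:, 0])
    return points[idx]

def landmark_embedding(x, landmarks):
    """iota(x) = (d(x, z_1), ..., d(x, z_m)): distances to the landmarks."""
    return sphere_geodesic(x, landmarks)

rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)      # samples on S^2
landmarks = farthest_point_landmarks(pts, m=10)
emb = landmark_embedding(pts, landmarks)               # (200, 10) features
```

The embedded coordinates `emb` can then be fed to a standard Euclidean MLP as the pre-potential's input.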
A neural network parameterizes a pre-potential $\psi_\theta$. The $c$-concave potential is then given by the $c$-transform: $\phi_\theta(x) = \inf_{y\in\mathcal{M}}\Big(\frac{1}{2}d(x,y)^2 - \psi_\theta(y)\Big)$ which is solved via Riemannian gradient descent on $\mathcal{M}$. The transport map is $T_\theta(x) = \exp_x\bigl(-\nabla\phi_\theta(x)\bigr)$.
By enforcing $G$-invariance in $\psi_\theta$ for a symmetry (isometry) group $G$ of $\mathcal{M}$, the resulting map is exactly $G$-equivariant. Explicitly, if $\psi_\theta(g\cdot y) = \psi_\theta(y)$ for all $g\in G$, then $T_\theta(g\cdot x) = g\cdot T_\theta(x)$ (Rezende et al., 2021, Micheli et al., 3 Feb 2026).
For surface-type manifolds, RNOT architectures utilize local charted coordinates, stacked multilayer perceptrons (MLPs), or vector field representations, always ensuring outputs remain within the manifold via the exponential map. “Symmetric MLP” architectures can enforce invariance under continuous symmetries by reducing input dimensionality accordingly (Rezende et al., 2021).
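The transport step $T(x)=\exp_x(-\nabla\phi(x))$ relies on the manifold exponential map, which is in closed form on the sphere: $\exp_x(v) = \cos(\lVert v\rVert)\,x + \sin(\lVert v\rVert)\,v/\lVert v\rVert$. A minimal sketch:

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on S^2: x a unit vector, v a tangent vector at x."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return x                           # zero step stays at x
    return np.cos(nv) * x + np.sin(nv) * (v / nv)

x = np.array([0.0, 0.0, 1.0])              # north pole
v = np.array([np.pi / 2, 0.0, 0.0])        # tangent vector at x (v . x = 0)
y = sphere_exp(x, v)                       # geodesic step of length pi/2
```

Because the output is always a unit vector, applying the exponential map to the (negative) Riemannian gradient of the potential keeps iterates exactly on the manifold, which is the property the architectures above exploit.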
3. Nested Optimization and Training
The core optimization in RNOT methods is over the neural network parameters $\theta$, targeting the semi-dual Kantorovich objective: $\mathcal{J}(\psi_\theta) = \mathbb{E}_{x \sim \mu}[\psi_\theta^c(x)] + \mathbb{E}_{y\sim\nu}[\psi_\theta(y)]$ Optimizing $\mathcal{J}$ involves, for each batch sample $x$, a minimization over $y\in\mathcal{M}$ to compute the $c$-transform. This is realized using Riemannian gradient descent initialized by a softmin (LogSumExp) over target samples, and the subgradient or envelope theorem provides efficient backpropagation without differentiating through the inner loop (Micheli et al., 3 Feb 2026).
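A toy Monte Carlo estimate of this semi-dual objective can be written in a few lines. This sketch (again on $S^1$, an assumption for concreteness) approximates the inner minimization by a hard min over target samples, i.e., only the discrete initialization step without the Riemannian gradient-descent refinement:

```python
import numpy as np

def circle_half_sq_dist(x, y):
    """0.5 * d(x, y)^2 for angles on S^1."""
    d = np.abs(x[:, None] - y[None, :])
    d = np.minimum(d, 2 * np.pi - d)
    return 0.5 * d ** 2

def semi_dual_objective(psi, xs, ys):
    """Monte Carlo estimate of J(psi) = E_mu[psi^c(x)] + E_nu[psi(y)],
    with psi given by its values on the target samples ys."""
    psi_c = (circle_half_sq_dist(xs, ys) - psi[None, :]).min(axis=1)
    return psi_c.mean() + psi.mean()

rng = np.random.default_rng(1)
ys = rng.uniform(0, 2 * np.pi, size=256)   # samples from nu
xs = ys.copy()                             # take mu = nu for a sanity check
J0 = semi_dual_objective(np.zeros_like(ys), xs, ys)   # equals 0 when mu = nu
```

Two useful invariants for testing such an implementation: the objective vanishes when $\mu=\nu$ and $\psi\equiv 0$, and it is unchanged by adding a constant to $\psi$ (the constant cancels between the two expectations).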
Stability and uniqueness of the inner minimization are ensured if the gradient and Hessian of the neural pre-potential satisfy certain bounds (e.g., $\|\nabla\psi_\theta\|$ and $\|\nabla^2\psi_\theta\|$ sufficiently small relative to the curvature), ensuring the minimization lands in a geodesic ball where standard gradient descent finds the unique minimum (Rezende et al., 2021).
Regularization can be applied via penalties on the integrated squared gradient or entropy of the push-forward, but in canonical settings pure manifold geometry and potential smoothness suffice (Li et al., 26 Jul 2025).
4. Statistical and Computational Guarantees
A foundational result is that any discretization-based OT map (e.g., barycentric projection in mesh-based solvers) suffers the curse of dimensionality: reaching a given RMSE $\varepsilon$ on a $d$-dimensional manifold requires a number of parameters that grows exponentially with $d$ (Micheli et al., 3 Feb 2026).
By contrast, under mild smoothness (e.g., $\psi_* \in C^{k,1}(\mathcal{M})$) and geometric conditions (compactness, bounded sectional curvature), RNOT maps can approximate the optimal transport map to uniform error $\varepsilon$ with neural networks whose width and depth scale sub-exponentially in the dimension. Error in the neural potential transfers to the OT map at a controlled rate almost everywhere (Micheli et al., 3 Feb 2026).
Computationally, each training iteration requires only a single inner minimization and one Riemannian exponential map, with no repeated ODE solutions or stochastic trace estimators, contrasting favorably with ODE-based flows (Rezende et al., 2021). Sample-complexity and wall-clock benchmarks show a 2–3× speedup over equivariant ODE-flows with comparable (or improved) KL divergence and effective sample size.
5. Empirical Performance and Applications
Comprehensive experiments have evaluated RNOT maps on both synthetic distributions and scientific applications:
- Synthetic density estimation: On $\mathbb{T}^2$ and $\mathbb{S}^2$, RNOT methods achieve high effective sample size (ESS) and forward KL divergence below $0.003$ nats, matching or outperforming baseline normalizing flows and Riemannian continuous normalizing flows (RCNF). Symmetry-adapted architectures further improve ESS, to roughly $97\%$ and above (Rezende et al., 2021).
- Continental drift on $\mathbb{S}^2$: Trained RNOT maps reconstruct paleocontinental mass transport over 150 million years, recovering major tectonic boundaries as OT flows on the sphere (Micheli et al., 3 Feb 2026).
- Geometric operator learning: Embedding complex surfaces into a uniform latent 2D domain via diffeomorphic RNOT maps, then applying neural operators (such as Fourier Neural Operators), yields state-of-the-art acceleration and memory efficiency on 3D flows (e.g., RANS for aerodynamics). Experiments on ShapeNet-Car and DrivAerNet-Car show up to 8× speedups and 7× memory savings compared to volumetric geometric operator baselines, without loss in PDE prediction accuracy (Li et al., 26 Jul 2025).
6. Integration into Geometric Operator Pipelines
The RNOT approach generalizes mesh-based geometry learning by embedding the data manifold into a canonical domain through instance-dependent OT, rather than aligned interpolation or shared deformation. Surface data is transformed through the OT map, processed by a latent-space operator, and mapped back, enabling efficient and measure-preserving neural operator evaluation.
Training involves a composite loss integrating data-fitting, OT regularization, and optionally cycle-consistency or smoothing. Differentiable entropic OT solvers (e.g., Sinkhorn) or Monge map approximations are used for the inner OT map computation, and all components are differentiable for gradient-based optimization (Li et al., 26 Jul 2025).
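The overall data flow can be sketched schematically. Everything below is a stand-in stub introduced for illustration (the real pipeline uses a learned, instance-dependent OT map and a neural operator such as an FNO); the point is the encode → latent operator → loss structure:

```python
import numpy as np

def ot_encode(surface_pts):
    """Stub for the OT map into a canonical 2D latent domain; here a fixed
    cylindrical projection of unit-sphere points (an assumption)."""
    x, y, z = surface_pts.T
    return np.stack([np.arctan2(y, x), z], axis=1)

def latent_operator(latent):
    """Stub for the latent-space neural operator."""
    return np.tanh(latent)

def composite_loss(pred, target, latent, lam=0.1):
    """Data-fitting term plus a simple (assumed) OT-style regularizer
    penalizing large latent displacements."""
    return np.mean((pred - target) ** 2) + lam * np.mean(latent ** 2)

rng = np.random.default_rng(2)
pts = rng.normal(size=(100, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)    # a toy "surface"
latent = ot_encode(pts)                              # surface -> canonical domain
pred = latent_operator(latent)                       # operator acts in latent space
loss = composite_loss(pred, np.zeros_like(pred), latent)
```

In the full method, `ot_encode` would itself be differentiable (e.g., an entropic Sinkhorn solver or a Monge map approximation), so the composite loss can be optimized end to end by gradient descent.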
7. Theoretical and Algorithmic Summary
Riemannian Neural OT maps constitute a principled, theoretically backed class of generative and geometric learning architectures that:
- Avoid the curse of dimensionality endemic to discretized OT methods by using continuous neural potentials
- Provide explicit polynomial bounds for approximation accuracy under mild geometric regularity
- Admit efficient, symmetry-preserving map construction through explicit neural architectures and gradient-based optimization on manifolds
- Enable direct integration into simulation and operator learning pipelines, showing empirical state-of-the-art performance in diverse geometric and scientific domains
A summary table of core design features and comparative statistics:
| Feature | RNOT (Neural OT) | Discrete RCPM (Mesh, Sinkhorn) |
|---|---|---|
| Parametrization | Neural potential via MLP/embedding | Mesh/grid; OT plan |
| Complexity (to error $\varepsilon$) | Sub-exponential in $d$ | Exponential in $d$ |
| Symmetry handling | Explicit $G$-equivariance | Mesh/interp. dependent |
| Loss/Training | Semi-dual OT (no Jacobian log) | Entropic OT with plan |
| Scalability | Scales to high-dimensional manifolds | Rapidly degrades with $d$ |
These capabilities and guarantees position Riemannian Neural OT as a central tool for scalable, structure-aware learning in the physical sciences, manifold-based generative modeling, and scientific computation (Rezende et al., 2021, Micheli et al., 3 Feb 2026, Li et al., 26 Jul 2025).