Stochastic Optimal Transport
- Stochastic Optimal Transport is a framework for optimizing mass transport under randomness by incorporating stochastic dynamics, random costs, and control mechanisms.
- It employs variational formulations, duality methods, and PDE strategies like Hamilton-Jacobi equations to tackle transport problems under uncertainty.
- Advanced numerical methods such as stochastic gradient, entropic regularization, and Bayesian inference enable scalable and robust solutions for stochastic transport challenges.
Stochastic optimal transport (SOT) extends classical optimal transport to settings in which transport is governed by stochastic dynamics, stochastic controls, or random costs, or in which the optimal maps themselves are necessarily stochastic. It covers variational problems whose cost or constraints are random, whether through randomness in the system, noise in the input distributions, or uncertainty in the cost function. SOT connects control, probability, and optimization, and provides analytical and computational tools for randomness-aware mass transport, mean-square stability of stochastic systems, stochastic feedback control, continuous-time transport under diffusion or jump processes, and statistical estimation of stochastic transport maps.
1. Stochastic Optimal Transport: Major Problem Classes and Definitions
SOT encompasses several formalizations, each generalizing the deterministic Monge-Kantorovich framework:
- Stochastic dynamics transport: The primal problem is formulated over path laws of continuous semimartingales $X = (X_t)_{t \in [0,1]}$, transporting an initial law $\mu_0$ to a terminal law $\mu_1$ via SDEs or Markov processes with a controlled drift and a diffusion term. The goal is to minimize a cost functional, typically an integral of a running cost in the drift (and possibly the diffusion), subject where needed to state or control constraints (Tan et al., 2013, Mikami, 2023, Mikami, 2020).
- Stochastic feedback control: Transport is governed by time-varying stochastic feedback laws (kernels) mapping each state to a distribution over control inputs. Here, the law of the process evolves under measure-valued (Markov) feedback, minimizing a cumulative or running cost over the probability measures on the state space (Elamvazhuthi et al., 2018).
- Stochastic cost or mapping: The cost function $c$, or the optimal transport plan, is itself random. SOT then seeks to minimize the expected cost, or to infer the distribution of the transport plan (e.g., via Bayesian inference with stochastic cost data) (Mallasto et al., 2020).
- Relaxed coupling and stochastic maps: Where deterministic Monge maps do not exist (e.g., support dimension mismatch or intrinsic branching), the transport is realized by a measurable Markov kernel $\kappa$, a stochastic map that splits mass. Evaluating transport quality then requires metrics that account for both excess cost and mismatch in marginals (Nietert et al., 10 Dec 2025).
- Mean-square stability via Wasserstein: SOT is used to analyze the contraction of PDFs under stochastic jump-linear systems, where mean-square stability is equivalently characterized by convergence in the $2$-Wasserstein distance (Lee et al., 2014).
- Variational formulations and viscosity solutions: The value function for SOT—minimizing expected dynamic transport cost for stochastic control problems—satisfies a Hamilton-Jacobi-Bellman (HJB) equation on the space of probability measures, with viscosity solution theory underpinning uniqueness and comparison principles (Bertucci, 2023).
2. Duality, Variational Principles, and Connection to PDEs
A central organizing principle of SOT is the extension of Kantorovich duality to stochastic process spaces:
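- Dual characterization: For convex running costs, the stochastic OT value admits a dual in the spirit of Kantorovich, formulated via solutions of a backward HJB PDE. If $L(t,x;u)$ is the running cost and $H(t,x;z) = \sup_u \{\langle z, u\rangle - L(t,x;u)\}$ its Hamiltonian, the dual for transporting $\mu_0$ to $\mu_1$ over $[0,1]$ under unit diffusion is (Tan et al., 2013, Mikami, 2020):
$$V(\mu_0,\mu_1) = \sup_{f} \Big\{ \int f\,d\mu_1 - \int \varphi(0,x)\,\mu_0(dx) \Big\},$$
where $\varphi$ solves $\partial_t\varphi + \tfrac12\Delta\varphi + H(t,x;D_x\varphi) = 0$ with terminal data $\varphi(1,\cdot) = f$.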
- Variational representations via Fokker-Planck equations: Stochastic transport cost can be written as an Eulerian variational principle over pairs $(a, b)$ (diffusion, drift) satisfying the Fokker-Planck PDE with prescribed marginal flow $(\mu_t)_{t\in[0,1]}$, minimizing the space-time integral of the running cost $L$, $\int_0^1\!\int L(t,x;b(t,x))\,\mu_t(dx)\,dt$ (Mikami, 2020).
- Superposition Principle: Given a Fokker-Planck solution, one can construct an admissible semimartingale process (the Nelson process) with the prescribed marginals and quadratic energy minimization property (Mikami, 2020, Mikami, 2023).
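- Viscosity solution framework: For value functions $U(t,\mu)$ on the space of measures, the stochastic HJB equation takes the schematic form
$$\partial_t U(t,\mu) + \mathcal{H}\big(\mu, D_\mu U(t,\mu)\big) = 0,$$
where $\mathcal{H}$ is the lifted Hamiltonian (its exact form, including any second-order terms, depends on the noise model). Existence, uniqueness, and comparison principles for viscosity solutions are established under mild continuity and growth conditions (Bertucci, 2023).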
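The inequality "dual value ≤ primal value" behind the Kantorovich-type duality can be sketched in a few lines. The notation here is an assumption made for illustration (unit diffusion, time horizon $[0,1]$, a smooth solution $\varphi$, and a stochastic integral that is a true martingale); it is not taken verbatim from the cited works.

```latex
% Assumed setting: dX_t = \beta_t\,dt + dW_t, X_0 \sim \mu_0, X_1 \sim \mu_1,
% running cost L(t,x;u), Hamiltonian H(t,x;z) = \sup_u \{ \langle z,u\rangle - L(t,x;u) \},
% and \varphi solving \partial_t\varphi + \tfrac12\Delta\varphi + H(t,x;D_x\varphi) = 0, \varphi(1,\cdot) = f.
\begin{align*}
\int f\,d\mu_1 - \int \varphi(0,x)\,\mu_0(dx)
  &= \mathbb{E}\big[\varphi(1,X_1) - \varphi(0,X_0)\big] \\
  &= \mathbb{E}\int_0^1 \Big( \partial_t\varphi + \tfrac12\Delta\varphi
       + \langle \beta_t, D_x\varphi\rangle \Big)(t,X_t)\,dt
       && \text{(It\^o's formula)} \\
  &= \mathbb{E}\int_0^1 \Big( \langle \beta_t, D_x\varphi(t,X_t)\rangle
       - H\big(t,X_t;D_x\varphi(t,X_t)\big) \Big)\,dt
       && \text{(HJB equation)} \\
  &\le \mathbb{E}\int_0^1 L(t,X_t;\beta_t)\,dt
       && \text{(Fenchel--Young)}.
\end{align*}
```

Taking the supremum over $f$ on the left and the infimum over admissible processes on the right gives weak duality; the absence of a duality gap is the substantive part of the cited results.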
3. Discrete, Semi-Discrete, and Continuous-Time Stochastic OT
The SOT theory supports a spectrum of settings:
- Discrete/Semi-discrete OT with stochastic or entropic regularization: Entropy regularization yields strictly convex problems with unique solutions and enables scalable stochastic optimization (e.g., Sinkhorn, stochastic gradient descent, Gauss-Newton, and variance-reduced primal-dual algorithms) (Abid et al., 2018, Bercu et al., 2021, Xie et al., 2022, Bercu et al., 2023, Aude et al., 2016).
- Optimal transport over stochastic feedback laws: For discrete-time nonlinear systems, the SOT problem can be formulated and solved as an infinite-dimensional linear program over occupation measures, with existence theorems leveraging stochastic controllability (Elamvazhuthi et al., 2018).
- Continuous-time SDE-driven SOT: In continuous time, the transport cost is minimized over controlled SDEs or semimartingales, possibly including drift and controlled diffusion. The dual problem involves backward HJB (Bellman) PDEs, and solution methods involve dynamic programming and explicit finite-difference schemes (Tan et al., 2013).
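As a concrete instance of the entropically regularized discrete setting, the following is a minimal Sinkhorn sketch in NumPy. The function name `sinkhorn`, the grids, the weights, and the value of `eps` are illustrative assumptions, not taken from the cited papers; the plain (non-log-domain) iteration shown here is only numerically safe when `C / eps` is moderate.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iter=2000):
    """Entropic OT plan between histograms a and b for cost matrix C.

    Alternately rescales the rows and columns of the Gibbs kernel
    K = exp(-C / eps) so that the plan matches the two marginals.
    """
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)      # enforce row marginals
        v = b / (K.T @ u)    # enforce column marginals
    return u[:, None] * K * v[None, :]

# Illustrative data (assumed): two histograms on grids in [0, 1].
x = np.linspace(0.0, 1.0, 5)
y = np.linspace(0.0, 1.0, 4)
a = np.full(5, 0.2)
b = np.array([0.1, 0.2, 0.3, 0.4])
C = (x[:, None] - y[None, :]) ** 2

P = sinkhorn(a, b, C)
# Row-normalizing the plan gives a Markov kernel: row i is the
# distribution over targets y_j for mass starting at x_i.
kernel = P / P.sum(axis=1, keepdims=True)
```

The row-normalized plan is a simple example of a stochastic map: unless the plan is concentrated on a graph, each source point sends mass to several targets.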
4. Metric Structure, Stability, and Error Quantification for Stochastic Maps
Direct estimation of stochastic transport maps (kernels) requires metrics that generalize the usual $L^p$ distance between maps:
- Kernel error metric: For a kernel $\kappa$, the error metric captures both cost excess and marginal mismatch, subsuming deterministic map error when an OT map exists, and quantifying risk for stochastic map estimation (Nietert et al., 10 Dec 2025).
- Statistical rates and smoothness: Near minimax-optimal finite-sample risk bounds can be realized for stochastic map estimators, with rates depending on dimension, regularity, and sample contamination; stability properties enable robustness to distributional perturbations (Nietert et al., 10 Dec 2025).
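To make "cost excess" and "marginal mismatch" concrete, here is a small sketch for discrete measures on the real line. The specific decomposition used (excess of the kernel's transport cost over $W_1$, plus $W_1$ between the pushforward and the target) is an assumption chosen for illustration and is not claimed to be the exact metric of the cited work; the helper names `w1_1d` and `kernel_error` are hypothetical.

```python
import numpy as np

def w1_1d(xs, p, ys, q):
    """W_1 between discrete measures on the line via the CDF formula."""
    grid = np.unique(np.concatenate([xs, ys]))
    F = np.array([p[xs <= g].sum() for g in grid])
    G = np.array([q[ys <= g].sum() for g in grid])
    return float(np.sum(np.abs(F - G)[:-1] * np.diff(grid)))

def kernel_error(xs, mu, ys, nu, kappa):
    """Return (excess cost, marginal mismatch) of a kernel.

    kappa[i, j] is the fraction of the mass at xs[i] sent to ys[j];
    each row of kappa sums to 1. The cost is |x - y| (assumed).
    """
    dist = np.abs(xs[:, None] - ys[None, :])
    cost = float(np.sum(mu[:, None] * kappa * dist))
    excess = max(cost - w1_1d(xs, mu, ys, nu), 0.0)
    pushforward = mu @ kappa
    mismatch = w1_1d(ys, pushforward, ys, nu)
    return excess, mismatch

# A point mass at 0 must be split to reach two atoms at -1 and +1,
# so no deterministic map transports mu to nu.
xs, mu = np.array([0.0]), np.array([1.0])
ys, nu = np.array([-1.0, 1.0]), np.array([0.5, 0.5])

split = np.array([[0.5, 0.5]])          # stochastic map: splits the mass
deterministic = np.array([[0.0, 1.0]])  # sends everything to +1

err_split = kernel_error(xs, mu, ys, nu, split)
err_det = kernel_error(xs, mu, ys, nu, deterministic)
```

In this toy case the splitting kernel has no excess cost and no mismatch, while the deterministic map pays the same transport cost but misses the target marginal by a $W_1$ distance of 1.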
5. Numerical Methods, Stochastic Optimization, and Algorithmic Advances
Numerical solution of SOT problems capitalizes on stochastic optimization strategies tailored to the stochastic nature of the objectives:
- Stochastic gradient, variance reduction, and Gauss-Newton methods: For entropically regularized discrete or semi-discrete OT, stochastic variants of Sinkhorn, Gauss-Newton, and variance-reduced primal-dual algorithms offer computational guarantees, improved scaling, and adaptivity to curvature (Abid et al., 2018, Bercu et al., 2021, Xie et al., 2022).
- Fourier and RKHS-based algorithms: For continuous and high-dimensional problems, parameterization over Fourier bases or kernel spaces enables efficient stochastic updates, accelerated by fast transforms, and the solution of Banach-space-valued dual problems (Bercu et al., 2023).
- Linear-programming and Eulerian formulations: SOT over controlled (nonlinear) systems is reformulated as an infinite-dimensional LP over measure flows, which can be discretized for low-dimensional systems and solved using standard convex optimization algorithms (Elamvazhuthi et al., 2018).
- Bayesian inference and MCMC: Stochastic cost settings motivate posterior inference for the transport plan via Hamiltonian Monte Carlo, yielding, in principle, the full posterior distribution over OT couplings conditioned on observed randomness (Mallasto et al., 2020).
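The stochastic-gradient approach to semi-discrete entropic OT can be sketched as follows: the dual variable `v` (one entry per target atom) is updated using only samples from the continuous source measure. This is a minimal sketch under stated assumptions, not the algorithm of any one cited paper: a quadratic cost in one dimension, a standard normal source, a `1/sqrt(k)` step size, mini-batches, and averaging over the second half of the iterates. The names `soft_assign` and `semidual_sgd` are hypothetical.

```python
import numpy as np

def soft_assign(x, y, nu, v, eps):
    """Soft assignment of samples x to atoms y (each row sums to 1)."""
    cost = 0.5 * (x[:, None] - y[None, :]) ** 2
    logits = (v[None, :] - cost) / eps + np.log(nu)[None, :]
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def semidual_sgd(sample_mu, y, nu, eps=0.1, n_steps=3000, batch=64, seed=0):
    """Averaged stochastic gradient ascent on the entropic semi-dual."""
    rng = np.random.default_rng(seed)
    v = np.zeros(len(y))
    v_avg = np.zeros(len(y))
    n_avg = 0
    for k in range(1, n_steps + 1):
        x = sample_mu(rng, batch)
        # Stochastic gradient: target weights minus soft-assigned mass.
        grad = nu - soft_assign(x, y, nu, v, eps).mean(axis=0)
        v = v + grad / np.sqrt(k)
        if k > n_steps // 2:                      # average the late iterates
            n_avg += 1
            v_avg += (v - v_avg) / n_avg
    return v_avg

# Illustrative problem (assumed): N(0, 1) source, three weighted atoms.
y = np.array([-1.0, 0.0, 1.0])
nu = np.array([0.2, 0.5, 0.3])
gaussian = lambda rng, n: rng.standard_normal(n)

v_hat = semidual_sgd(gaussian, y, nu)
x_test = np.random.default_rng(1).standard_normal(20000)
mass = soft_assign(x_test, y, nu, v_hat, 0.1).mean(axis=0)
```

At the optimum the gradient vanishes, so the average soft-assigned mass `mass` should be close to the target weights `nu`; this is a convenient convergence check because it needs only fresh samples.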
6. Applications and Broader Implications
SOT has broad impact across several fields:
- Stochastic control and mean-square stability: Wasserstein-based techniques unify mean-square stability conditions for jump-linear systems without restriction to Markovian or i.i.d. switching, with easy recovery of classical criteria through Kronecker algebra (Lee et al., 2014).
- Optimal feedback transport: SOT characterizes and realizes optimal steering of probability measures over nonlinear dynamical systems, under both deterministic and stochastic control inputs (Elamvazhuthi et al., 2018).
- Thermodynamics and entropy production: Stochastic optimal transport underlies the computation of minimal expected entropy production or work in thermodynamic transitions in Langevin systems, linking stochastic thermodynamics with classical (deterministic) OT via dynamic programming and the Burgers equation (Bo et al., 2013, Aurell et al., 2010).
- Quantile estimation and uncertainty quantification: SOT frameworks support the construction of regularized quantile maps and robust OT map estimators applicable in high-dimensional statistics, generative modeling, and adversarially contaminated data settings (Bercu et al., 2023, Nietert et al., 10 Dec 2025).
- Hamilton-Jacobi equations on measure spaces: By establishing well-posed HJB equations for SOT value functions on probability spaces, SOT builds nontrivial connections with mean-field games, stochastic PDEs on measure spaces, and the broader viscosity solution theory (Bertucci, 2023).
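The link between mean-square stability, Kronecker algebra, and the Wasserstein distance can be illustrated for the simplest case. The sketch below assumes a discrete-time jump-linear system $x_{k+1} = A_{\sigma_k} x_k$ with i.i.d. switching (a special case, chosen for brevity), and uses the identity $W_2(\mathrm{law}(x_k), \delta_0)^2 = \mathbb{E}\|x_k\|^2$. The function names and the example matrices are hypothetical.

```python
import numpy as np

def second_moment_operator(mats, probs):
    """Matrix K with vec(E[x_{k+1} x_{k+1}^T]) = K vec(E[x_k x_k^T])."""
    return sum(p * np.kron(A, A) for A, p in zip(mats, probs))

def mean_square_stable(mats, probs):
    """Classical criterion: spectral radius of K strictly below 1."""
    K = second_moment_operator(mats, probs)
    rho = float(np.max(np.abs(np.linalg.eigvals(K))))
    return rho < 1.0, rho

def w2_to_origin(M0, mats, probs, k):
    """W_2 distance between law(x_k) and the Dirac mass at 0.

    M0 is the initial second-moment matrix E[x_0 x_0^T]; the distance
    equals sqrt(trace(E[x_k x_k^T])).
    """
    K = second_moment_operator(mats, probs)
    m = np.linalg.matrix_power(K, k) @ M0.flatten()
    return float(np.sqrt(np.trace(m.reshape(M0.shape))))

# One mode is unstable on its own (gain 1.2), yet the switched system
# is mean-square stable because the average squared gain is below 1.
mats = [np.diag([1.2, 0.1]), np.diag([0.5, 0.9])]
probs = [0.5, 0.5]

stable, rho = mean_square_stable(mats, probs)
distances = [w2_to_origin(np.eye(2), mats, probs, k) for k in (0, 10, 50)]
```

Here `rho` is the largest average squared gain, and the sequence `distances` decays geometrically, which is the Wasserstein reading of mean-square stability.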
7. Classical Limits, Asymptotics, and Extensions
SOT recovers standard OT and Schrödinger bridge problems in suitable limits:
- Zero-noise/zero-mass limits: SOT with vanishing Brownian or random perturbation, or zero mass in Langevin settings, converges in the $\Gamma$-limit sense to deterministic Monge-Kantorovich transport (Mikami, 2023, Mikami, 15 Oct 2025).
- Entropic regularization and large deviation principles: The Schrödinger bridge, a central object in SOT, interpolates between stochastic and classical OT as the regularization (noise) parameter is tuned; duality and displacement convexity properties are established for these interpolated functionals (Mikami, 2023, Mikami, 2020).
- Free endpoint and stopping time extensions: The SOT paradigm accommodates random terminal horizons (free end-time), leading to duality/variational principles involving HJB inequalities with "barrier" hitting times, applicable in optimal stopping and variational inequality frameworks (Dweik et al., 2019).
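For the quadratic cost, the zero-noise limit can be written out explicitly. The following statement is a sketch under assumed notation (noise level $\varepsilon$, horizon $[0,1]$, marginals with finite second moments, and enough regularity for the limit to hold); precise hypotheses are in the cited works.

```latex
% Assumed notation: V_\varepsilon is the stochastic transport cost at noise level \varepsilon.
\begin{align*}
V_\varepsilon(\mu_0,\mu_1)
  &= \inf\Big\{ \mathbb{E}\int_0^1 \tfrac12 |b_t|^2\,dt \;:\;
       dX_t = b_t\,dt + \sqrt{\varepsilon}\,dW_t,\ X_0\sim\mu_0,\ X_1\sim\mu_1 \Big\}, \\
\lim_{\varepsilon\to 0} V_\varepsilon(\mu_0,\mu_1)
  &= \inf\Big\{ \int_0^1\!\!\int \tfrac12 |v_t(x)|^2\,\rho_t(dx)\,dt \;:\;
       \partial_t\rho_t + \nabla\!\cdot(\rho_t v_t) = 0,\ \rho_0=\mu_0,\ \rho_1=\mu_1 \Big\} \\
  &= \tfrac12\, W_2^2(\mu_0,\mu_1).
\end{align*}
```

The first line is the stochastic control form of the Schrödinger problem; the second is the Benamou-Brenier dynamic formulation of classical quadratic OT, in which the Fokker-Planck equation has lost its diffusion term.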
SOT is a versatile and rapidly evolving toolset connecting probability, analysis, control, and computation, and it brings several paradigms together as transport optimization under random costs, noisy data, or stochastic dynamics. Recent progress includes both theoretical characterization (extended dualities, PDE theory on spaces of measures, and variational representations) and algorithmic advances that enable practical estimation and control under stochastic dynamics.
Principal references: (Lee et al., 2014, Elamvazhuthi et al., 2018, Abid et al., 2018, Tan et al., 2013, Bo et al., 2013, Xie et al., 2022, Mikami, 15 Oct 2025, Mikami, 2023, Mikami, 2020, Dweik et al., 2019, Mallasto et al., 2020, Ballu et al., 2020, Aurell et al., 2010, Nietert et al., 10 Dec 2025, Bertucci, 2023, Bercu et al., 2021, Bercu et al., 2023, Aude et al., 2016, Pollard et al., 2021).