Variational Gradient-Flow Equations
- Variational gradient-flow equations provide a framework for steepest descent in metric spaces, driven by an energy functional and a dissipation metric, and applicable in infinite-dimensional settings.
- They employ diverse metric structures such as Wasserstein and Stein metrics, underpinning stable numerical schemes like the JKO and variational BDF2 methods.
- Applications include modeling nonlinear diffusion, Bayesian and Stein variational inference, and geometric flows, highlighting their significance in modern analysis and computational mathematics.
Variational gradient-flow equations describe the evolution of states in a metric or topological space as the steepest descent of an energy (or entropy) functional with respect to an underlying geometry. The framework generalizes the classical notion of gradient flows in finite-dimensional Euclidean spaces to infinite-dimensional settings (such as probability measures) and accommodates non-Euclidean metrics, most notably Wasserstein or Stein-type metrics. This variational structure underpins the analysis, discretization, and computation of nonlinear PDEs, stochastic processes, statistical inference procedures, and geometric flows, and is foundational in modern analysis and applied mathematics.
1. Variational Gradient Flow: Foundational Principles
A variational gradient-flow equation couples three elements: (1) a state space $X$ equipped with a metric (e.g., Hilbert, Banach, or Wasserstein), (2) a lower-semicontinuous energy functional $E : X \to (-\infty, +\infty]$, and (3) a dissipation metric or potential governing the "steepest descent" direction. The canonical form is
$$\dot u(t) = -\nabla_g E(u(t)),$$
where $\nabla_g$ denotes the (possibly generalized, metric) gradient.
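In the Euclidean special case this is just an ODE, and the simplest discretization is explicit Euler. A minimal sketch (the quadratic energy and step size are illustrative assumptions, not taken from the cited papers):

```python
import numpy as np

def gradient_flow(grad_E, x0, tau=0.05, steps=200):
    """Explicit Euler discretization of the ODE x'(t) = -grad E(x(t))."""
    x = np.asarray(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(steps):
        x = x - tau * grad_E(x)          # one steepest-descent step
        traj.append(x.copy())
    return traj

# Illustrative energy E(x) = |x|^2 / 2, so grad E(x) = x.
E = lambda x: 0.5 * np.dot(x, x)
traj = gradient_flow(lambda x: x, x0=[2.0, -1.0])
energies = [E(x) for x in traj]
```

Along the discrete flow the energy is nonincreasing, the basic structural property that the metric-space theory below generalizes.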
In Wasserstein spaces, for instance, the PDE
$$\partial_t \rho = \nabla \cdot \Big( \rho \, \nabla \frac{\delta E}{\delta \rho} \Big)$$
is the $W_2$-gradient flow of $E$ on the space $\mathcal{P}_2(\mathbb{R}^d)$ of probability measures with finite second moment (Gallouët et al., 2022, Erbar et al., 2024, Fan et al., 2021).
In the Hilbert-space and, more generally, metric-space context, the classical "minimizing-movement" (JKO) scheme iteratively constructs the flow as
$$u_{k+1} \in \operatorname*{arg\,min}_{u \in X} \Big\{ \frac{1}{2\tau} d(u, u_k)^2 + E(u) \Big\}$$
with time step $\tau > 0$. As $\tau \to 0$, the piecewise-constant interpolant converges to the solution of the continuous gradient-flow equation (Duong et al., 2019, Pietschmann et al., 2022).
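A minimal one-dimensional sketch of the minimizing-movement idea in the Euclidean case, with each proximal subproblem solved by inner gradient descent (the energy, step sizes, and iteration counts are illustrative assumptions):

```python
def minimizing_movement(grad_E, x0, tau=0.1, outer=100, inner=100, lr=0.01):
    """JKO-type chain x_{k+1} = argmin_x { |x - x_k|^2/(2*tau) + E(x) },
    each proximal subproblem solved approximately by gradient descent."""
    x = float(x0)
    for _ in range(outer):
        y = x                                   # warm-start at x_k
        for _ in range(inner):
            y -= lr * ((y - x) / tau + grad_E(y))
        x = y                                   # accept the proximal point
    return x

# E(x) = (x - 3)^2 / 2 has the unique minimizer x* = 3; the chain converges to it.
x_final = minimizing_movement(lambda x: x - 3.0, x0=0.0)
```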
Stein variational gradient flows replace the traditional Euclidean or Wasserstein geometry with a reproducing-kernel Hilbert space (RKHS) metric tailored to a target distribution $\pi$, leading to equations of the form
$$\dot x = \phi^*(x), \qquad \phi^*(\cdot) = \mathbb{E}_{x \sim \mu}\big[ k(x, \cdot) \, \nabla_x \log \pi(x) + \nabla_x k(x, \cdot) \big],$$
with $\phi^*$ the unique RKHS-optimal direction for decreasing the Kullback–Leibler divergence among the perturbations permitted by the Stein identity (Liu, 2017).
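A compact SVGD sketch for a one-dimensional Gaussian target (the kernel bandwidth, step size, and particular target are illustrative assumptions; the update rule is the standard kernelized direction $\phi^*$):

```python
import numpy as np

def svgd_step(x, grad_log_p, h=0.5, eps=0.1):
    """One SVGD update x_i <- x_i + eps * phi*(x_i) with an RBF kernel:
    an attraction term driven by grad log p plus a kernel repulsion term."""
    d = x[:, None] - x[None, :]              # pairwise differences x_i - x_j
    K = np.exp(-d**2 / (2 * h**2))           # RBF kernel matrix
    repulsion = (d / h**2 * K).sum(axis=1)   # sum_j d/dx_j k(x_j, x_i)
    phi = (K @ grad_log_p(x) + repulsion) / len(x)
    return x + eps * phi

rng = np.random.default_rng(0)
x = rng.normal(loc=-4.0, scale=0.5, size=100)    # particles far from target
grad_log_p = lambda x: -(x - 2.0)                # score of the target N(2, 1)
for _ in range(500):
    x = svgd_step(x, grad_log_p)
```

The particles drift toward the target mean while the repulsion term keeps them spread out, i.e., the flow decreases the KL divergence rather than collapsing onto the mode.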
2. Metric Structures and Dissipation Potentials
The choice of metric profoundly shapes both the form and properties of the gradient flow:
- Wasserstein Gradient Flow: For suitably convex energies $E$ (e.g., entropy and interaction energies), steepest descent with respect to the quadratic Wasserstein metric $W_2$ characterizes the dynamics. This setting enables the variational characterization of a wide range of nonlinear diffusions, aggregation-diffusion models, and mean-field equations (Yao et al., 2022, Fan et al., 2021, Kim et al., 2022).
- Stein Gradient Flow: The Stein operator and RKHS metric define a Riemannian-like geometry for probability measures, with the kernelized Stein discrepancy (KSD) metrizing weak convergence. SVGD and its accelerated variants are interpreted as gradient flows in this Stein–RKHS metric (Liu, 2017, Stein et al., 30 Mar 2025).
- Generalized/Non-Euclidean Metrics: In problems with boundary mass exchange (e.g., Dirichlet boundary data) or irreversible (GENERIC) evolution, the basic Wasserstein or Hilbertian metric is modified (e.g., the Figalli-Gigli metric) to account for nonconservativity or coupled conservative–dissipative evolution (Erbar et al., 2024, Kim et al., 2022, Lombardi et al., 2024).
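The kernelized Stein discrepancy mentioned above admits a short closed-form estimator for an RBF kernel in one dimension (the bandwidth, sample sizes, and Gaussian target are illustrative assumptions):

```python
import numpy as np

def ksd_rbf(x, grad_log_p, h=1.0):
    """V-statistic estimate of the squared kernelized Stein discrepancy
    between 1-D samples x and a density p, using an RBF kernel."""
    d = x[:, None] - x[None, :]
    K = np.exp(-d**2 / (2 * h**2))
    s = grad_log_p(x)
    # Stein kernel u_p(x, y) = s(x)s(y)k + s(x) dk/dy + s(y) dk/dx + d2k/dxdy
    u = (s[:, None] * s[None, :] * K
         + s[:, None] * ( d / h**2) * K
         + s[None, :] * (-d / h**2) * K
         + (1.0 / h**2 - d**2 / h**4) * K)
    return u.mean()

rng = np.random.default_rng(1)
grad_log_p = lambda x: -x                        # standard normal target
good = ksd_rbf(rng.normal(size=500), grad_log_p)         # matched samples
bad  = ksd_rbf(rng.normal(size=500) + 2.0, grad_log_p)   # shifted samples
```

Samples drawn from the target give a much smaller discrepancy than samples from a shifted law, which is what makes the KSD usable as a weak-convergence diagnostic.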
The dissipation mechanism is described by a (possibly nonlinear, nonquadratic) potential function $\psi$ or its Legendre dual $\psi^*$, as in the generalized Rayleigh (energy–dissipation) principle (Duong et al., 2015). This duality, often originating from large-deviations principles, is key in passing to various hydrodynamic or macroscopic limits.
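Schematically, with $\psi$ a convex dissipation potential and $\psi^*$ its Legendre dual, the flow can be encoded by an energy–dissipation inequality. The following is a standard textbook formulation in generic notation, not the exact statement of any one cited paper:

```latex
% Fenchel--Young inequality for a convex dissipation potential \psi:
\psi(v) + \psi^*(\xi) \ge \langle \xi, v \rangle,
\qquad \text{with equality iff } \xi \in \partial\psi(v).
% Energy--dissipation formulation of the generalized gradient flow:
E(u(T)) - E(u(0)) + \int_0^T \big[ \psi(\dot u(t)) + \psi^*(-\mathrm{D}E(u(t))) \big] \, dt \le 0 .
% The quadratic case \psi(v) = \tfrac12 \|v\|^2 recovers \dot u = -\mathrm{D}E(u).
```

By the chain rule and Fenchel–Young, the inequality forces pointwise equality, i.e., $-\mathrm{D}E(u) \in \partial\psi(\dot u)$, which is the generalized gradient-flow equation.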
3. Variational and Metric Characterizations
Gradient flows in metric spaces are uniquely characterized as curves of maximal slope for the energy functional $E$ with respect to the metric $d$ (Erbar et al., 2024, Kim et al., 2022). The descending slope
$$|\partial E|(u) = \limsup_{v \to u} \frac{\big(E(u) - E(v)\big)^+}{d(u, v)},$$
together with the notions of upper gradient and maximal slope, encapsulates the infinitesimal decay of energy and dissipation along the flow:
An absolutely continuous curve $u : [0, T] \to X$ is a curve of maximal slope if
$$E(u(t)) - E(u(s)) \le -\frac{1}{2} \int_s^t |u'(r)|^2 \, dr - \frac{1}{2} \int_s^t |\partial E|^2(u(r)) \, dr$$
for all $0 \le s \le t \le T$. In the Wasserstein setting, this leads to rigorous existence, uniqueness, and regularity theories for nonlinear diffusion equations with degenerate or nonstandard boundaries (Erbar et al., 2024).
For equations with time-fractional derivatives (e.g., Caputo), the gradient-flow structure persists: a fractional-time JKO-type scheme, using appropriate memory-weighted combinations of past iterates, recovers the subdiffusive evolution with nonlocal-in-time dissipation (Duong et al., 2019).
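For reference, the Caputo derivative of order $\alpha \in (0, 1)$ that drives such subdiffusive flows is the following; the comment on the discrete scheme is schematic, not the exact discretization of the cited paper:

```latex
% Caputo fractional derivative of order \alpha \in (0,1):
\partial_t^\alpha u(t) = \frac{1}{\Gamma(1-\alpha)} \int_0^t (t-s)^{-\alpha} \, \dot u(s) \, ds .
% A fractional JKO-type step replaces the single penalty W_2^2(\rho,\rho_k)/(2\tau)
% by a memory-weighted sum of transport costs to the whole history \rho_0, \dots, \rho_k,
% with weights inherited from a discretization of the kernel (t-s)^{-\alpha}.
```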
4. Numerical Schemes and Discretizations
The variational structure provides a unifying principle for the derivation of stable and accurate numerical schemes:
- JKO schemes: Implicit-in-time "minimizing-movement" schemes are unconditionally energy-stable, preserve the gradient-flow structure, and provide existence and approximation results for a broad class of PDEs (Kinderlehrer et al., 2015, Pietschmann et al., 2022).
- Variational BDF2 and high-order schemes: Variational analogues of multi-step schemes, such as Backward Differentiation Formula of order two (BDF2), introduce variational extrapolations in Wasserstein space and achieve high-order time accuracy while retaining unconditional structure preservation (Gallouët et al., 2022, 1908.10246).
- Finite-volume and structure-preserving spatial discretizations: Finite-volume schemes, often with TPFA flux approximations, preserve positivity, mass conservation, convexity, and energy dissipation at the discrete level (Cancès et al., 2019).
- Function-approximation-based JKO solvers: Recent advances exploit neural networks to parameterize optimal transport maps in high-dimensional settings, enabling scalable and stable algorithms for variational inference and PDE solution (Fan et al., 2021, Yao et al., 2022).
A comparison of key discretization approaches is given below:
| Scheme Type | Time Accuracy | Spatial Discretization | Structure Preserved |
|---|---|---|---|
| Classic JKO | 1st order | Any | Monotonicity, dissipation |
| Variational BDF2 | 2nd order | FV / mesh-based | Energy, gradient-flow |
| Primal–dual (NN) | N/A | Sample-based | Variational duality |
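A minimal instance of the structure-preserving finite-volume idea is an explicit TPFA scheme for the heat equation with no-flux boundaries, where mass conservation and positivity hold exactly at the discrete level (the grid, time step, and initial datum are illustrative assumptions):

```python
import numpy as np

n = 100
dx = 1.0 / n
dt = 0.4 * dx**2                    # dt <= dx^2/2 preserves positivity
x = (np.arange(n) + 0.5) * dx       # cell centers on [0, 1]
rho = np.exp(-100 * (x - 0.3)**2) + 0.05   # positive initial density

mass0 = rho.sum() * dx
ent0 = (rho * np.log(rho)).sum() * dx      # discrete entropy sum rho log rho

for _ in range(2000):
    flux = -np.diff(rho) / dx              # TPFA flux at interior faces
    F_right = np.append(flux, 0.0)         # no-flux boundary faces
    F_left = np.insert(flux, 0, 0.0)
    rho = rho - dt / dx * (F_right - F_left)

mass_final = rho.sum() * dx                # conserved up to rounding
ent_final = (rho * np.log(rho)).sum() * dx # nonincreasing along the flow
```

Because the fluxes telescope, discrete mass is conserved exactly, and the CFL condition makes each update a convex combination of neighboring cells, so positivity and entropy dissipation follow.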
5. Applications: PDEs, Statistical Inference, and Beyond
Variational gradient-flow structures underpin diverse evolutions:
- Nonlinear diffusion and drift-diffusion: The porous-medium (PME), Fokker–Planck, aggregation–diffusion, and chemotaxis equations are all realized as $W_2$-gradient flows of suitable energies. Extensions to Dirichlet, Neumann, or fractional dynamics leverage modified metrics or JKO schemes (Kim et al., 2022, Erbar et al., 2024, Duong et al., 2019).
- Mean-field variational inference: The evolution of variational approximations for Bayesian posteriors, including mean-field and latent-variable models, can be characterized as Wasserstein gradient flows of the KL divergence, yielding contraction properties and scalable algorithms for posterior contraction (Yao et al., 2022).
- Stein variational inference: SVGD, its accelerated variants, and the continuous-time Stein gradient-flow PDE regime provide deterministic sampling and optimization incorporating function-space metrics tied to the target distribution, essential for non-reversible and high-dimensional inference (Liu, 2017, Stein et al., 30 Mar 2025).
- Geometry and geometric flows: Variational gradient flows describe the evolution of geometric objects, such as the $L^2$-gradient flow of two-phase biomembrane energies, with weak and sharp-interface formulations retained in discretizations (Barrett et al., 2017).
- Irreversible and Hamiltonian-dissipative systems: Infinite-dimensional systems coupling symplectic and dissipative dynamics (GENERIC, metriplectic) admit mixed variational discretizations preserving both invariants and monotonicity in the discrete evolution (Lombardi et al., 2024).
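The Wasserstein-gradient-flow view of the KL divergence also has a direct particle counterpart: the unadjusted Langevin algorithm discretizes the Fokker–Planck equation, which is the Wasserstein gradient flow of KL. A minimal sketch (step size, particle count, and Gaussian target are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=2000)                  # particles with the wrong initial law
grad_log_pi = lambda x: -(x - 2.0)         # score of the target N(2, 1)
dt = 0.05

# Unadjusted Langevin algorithm: drift along the score plus Gaussian noise.
# The empirical particle law approximates the Fokker-Planck evolution of rho_t.
for _ in range(400):
    x += dt * grad_log_pi(x) + np.sqrt(2 * dt) * rng.normal(size=x.size)
```

After enough steps the empirical law settles near the target, up to an $O(\mathrm{d}t)$ discretization bias.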
6. Extensions: Flux Formulations and Coarse-Graining
Beyond detailed-balance and purely dissipative systems, the unified variational–gradient-flux framework encodes both dissipative and non-dissipative (Hamiltonian) behavior via convex flux–density Lagrangians arising from large-deviations or macroscopic fluctuation theory (Patterson et al., 2021, Duong et al., 2015). The decomposition into symmetric (gradient-flow) and antisymmetric (Hamiltonian/cyclic) components, and the resulting generalized orthogonality, allows variational handling of zero-range processes, reaction networks, and driven diffusions.
For systems with boundaries, non-homogeneous or time-varying constraints, Wasserstein-type metrics can be adapted to accommodate mass exchange with the environment, ensuring the variational structure remains intact (Kim et al., 2022, Erbar et al., 2024).
Coarse-graining and hydrodynamic limit processes—such as overdamped Vlasov-Fokker–Planck or small-noise Hamiltonian systems—are rigorously captured via dual variational representations and energy-dissipation identities inherited from the particle-system large-deviations structure (Duong et al., 2015).
7. Current Developments and Perspectives
Recent progress includes the design of accelerated gradient flows on metric spaces (analogous to Nesterov’s acceleration), hybrid variational–Hamiltonian discretizations, and neural algorithmic techniques for high-dimensional flows (Stein et al., 30 Mar 2025, Lombardi et al., 2024, Fan et al., 2021). Ongoing challenges concern rigorous error analysis of non-Euclidean schemes, adaptive and learning-based variational discretizations, and further generalization to nonsmooth, non-convex, or infinite-dimensional settings.
The variational gradient-flow paradigm thus continues to serve as a robust, unifying principle for both the analysis and algorithmic implementation of nonlinear dynamical systems across analysis, geometry, probability, and computational mathematics.