
Probability Flow ODE in Generative Modeling

Updated 31 January 2026
  • Probability Flow ODE is a deterministic evolution equation that reproduces the marginal laws of stochastic differential equations by evolving sample trajectories via a method-of-characteristics approach.
  • It employs score learning and instantaneous matching losses to approximate the log-density gradient, facilitating accurate, high-dimensional density estimation.
  • PF-ODE methods underpin state-of-the-art generative models and image restoration applications by offering improved computational efficiency, convergence guarantees, and invertible sample paths.

A probability flow ordinary differential equation (PF-ODE) is a deterministic evolution equation for sample trajectories whose induced pushforward of the initial distribution exactly matches the evolving solution of a corresponding Fokker–Planck equation or, equivalently, the marginal law of certain stochastic differential equations (SDEs). The PF-ODE framework provides an alternative to stochastic simulation for high-dimensional dynamical systems, score-based generative models, and diffusion-based Bayesian inference, yielding superior access to the evolving probability density, current, and entropy, as well as improved computational properties in some regimes (Boffi et al., 2022, Chen et al., 2023, Wang et al., 2024, Na et al., 13 Mar 2025, Delft et al., 2024).

1. Mathematical Definition and Derivation

Consider an Itô diffusion

dx_t = f(x_t, t)\,dt + \sigma(x_t, t)\,dW_t,

with drift f and diffusion tensor D = \frac{1}{2}\sigma\sigma^\top (so that the drift of the PF-ODE below agrees with the \frac{1}{2}g(t)^2 factor in the generative-model formulation). The marginal density \rho_t^*(x) evolves according to the Fokker–Planck equation

\partial_t \rho_t^*(x) = -\nabla\cdot J_t(x), \qquad J_t(x) = f(x,t)\,\rho_t^*(x) - D(x,t)\,\nabla \rho_t^*(x).

Dividing J_t(x) by \rho_t^*(x) yields the "score" (log-density gradient) representation of the velocity:

v^*_t(x) = f(x,t) - D(x,t)\,\nabla_x \log \rho_t^*(x).

The PF-ODE is the method-of-characteristics solution to the associated continuity equation:

\frac{dX_t}{dt} = v^*_t(X_t), \qquad X_0 \sim \rho_0.

The pushforward distribution \rho_t of X_t remains consistent with the marginal law of the original SDE (Boffi et al., 2022).
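As a concrete check of this construction, the following minimal NumPy sketch (a toy example, not taken from the cited papers) integrates the PF-ODE for a 1-D Ornstein–Uhlenbeck process, where the Gaussian marginals make the score available in closed form, and verifies that the deterministic pushforward reproduces the Fokker–Planck variance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Forward SDE: dx = -x dt + sqrt(2) dW  =>  f(x) = -x, D = (1/2) sigma^2 = 1.
# Gaussian marginals N(0, v_t) satisfy v_t' = -2 v_t + 2, so v_t = 1 + (v0 - 1) e^{-2t}.
v0 = 4.0
def v_analytic(t):
    return 1.0 + (v0 - 1.0) * np.exp(-2.0 * t)

def pf_ode_velocity(x, t):
    # v*(x) = f(x) - D * grad log rho_t(x) = -x - (-x / v_t) = x * (1/v_t - 1)
    return x * (1.0 / v_analytic(t) - 1.0)

# Euler integration of the deterministic PF-ODE for a batch of samples.
x = rng.normal(0.0, np.sqrt(v0), size=100_000)
t, dt, T = 0.0, 1e-3, 1.0
while t < T - 1e-12:
    x = x + dt * pf_ode_velocity(x, t)
    t += dt

# The deterministic pushforward reproduces the SDE marginal variance.
print(np.var(x), v_analytic(T))  # both ≈ 1 + 3 e^{-2} ≈ 1.406
```

Because the velocity is exact here, the only errors are Euler discretization and Monte Carlo sampling; in realistic settings the score must first be learned, as discussed in Section 2.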

Analogous PF-ODEs arise in reverse-time SDE formulations, notably in generative models: given a forward SDE of the form dx_t = f(x_t,t)\,dt + g(t)\,dW_t with target marginal q_t, the probability flow ODE is

dx_t = \left[f(x_t, t) - \frac{1}{2}g(t)^2\,\nabla_x \log q_t(x_t)\right]dt,

where the stochastic noise is removed and the drift adjusted accordingly (Chen et al., 2023, Delft et al., 2024, Wang et al., 2024).
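For intuition, here is a hedged toy example of generation with this reverse-time PF-ODE: a variance-exploding forward SDE dx_t = g\,dW_t acting on hypothetical 1-D Gaussian data, for which q_t and its score are known exactly, so samples can be drawn by integrating the ODE backward from the terminal marginal with no injected noise:

```python
import numpy as np

rng = np.random.default_rng(1)

# Forward VE-type SDE: dx = g dW with g^2 = 2, so q_t = N(0, v0 + 2t) for
# Gaussian data q_0 = N(0, v0); the score is grad log q_t(x) = -x / (v0 + 2t).
v0, T, g2 = 0.25, 2.0, 2.0

def pf_ode_drift(x, t):
    # f = 0 here, so dx/dt = -(1/2) g^2 grad log q_t(x) = x / (v0 + g2 * t)
    return 0.5 * g2 * x / (v0 + g2 * t)

# Generation: start from the wide terminal marginal and integrate the
# PF-ODE backward in time; no noise is injected at any step.
x = rng.normal(0.0, np.sqrt(v0 + g2 * T), size=100_000)
t, dt = T, 1e-3
while t > 1e-12:
    x = x - dt * pf_ode_drift(x, t)
    t -= dt

print(np.var(x))  # ≈ v0 = 0.25: the data variance is recovered
```

The same backward integration with a learned score network in place of the closed-form expression is the deterministic sampler used by PF-ODE generative models.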

2. Score Learning and Practical Approximation

In non-trivial high-dimensional settings, the score term \nabla_x \log \rho_t^*(x) (or the analogous conditional score in bridge models) is not available in closed form. Instead, one trains a neural network s_t(x;\theta_t) to approximate the score. This is typically accomplished via instantaneous (Hyvärinen or denoising) score-matching losses, matched to the current pushforward of the propagating sample set:

  • Sequential Score Matching (SSBTM):

L_t[\theta] = \mathbb{E}_{x\sim\rho_0}\left[ \|s_t(X_t(x))\|^2_{D(X_t(x))} + 2\,\nabla\cdot\big(D(X_t(x))\, s_t(X_t(x))\big) \right].

  • Denoising Proxy:

L_t[\theta] \approx \mathbb{E}_{x\sim\rho_0,\,\xi\sim\mathcal{N}(0,I)}\left[ \|s_t(X_t(x)+\alpha\sigma\xi) + \xi/\alpha\|^2 \right].

Score updates and ODE integration are interleaved, resulting in feedback-driven learning aligned to the current sample distribution (Boffi et al., 2022, Wang et al., 2024, Na et al., 13 Mar 2025).
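The denoising proxy can be illustrated with a small self-contained sketch (1-D Gaussian data and a single hypothetical noise scale tau in place of the ασ pairing above, so the smoothed score is known in closed form): the loss is minimized by the score of the noise-smoothed density, and any other candidate incurs a strictly larger value:

```python
import numpy as np

rng = np.random.default_rng(2)

# Denoising score-matching sketch:
#   L[s] = E_{x ~ rho0, xi ~ N(0,I)} |s(x + tau*xi) + xi/tau|^2
# is minimized (over all functions s) by the score of rho0 * N(0, tau^2).
tau = 0.5
x = rng.normal(0.0, 1.0, size=200_000)      # rho0 = N(0, 1)
xi = rng.normal(0.0, 1.0, size=x.shape)
y = x + tau * xi                            # noisy samples

def denoising_loss(score):
    return np.mean((score(y) + xi / tau) ** 2)

true_score = lambda z: -z / (1.0 + tau**2)  # score of N(0, 1 + tau^2)
wrong_score = lambda z: -z                  # score of N(0, 1): ignores smoothing

# The exact smoothed score attains a strictly smaller proxy loss.
print(denoising_loss(true_score) < denoising_loss(wrong_score))  # True
```

In the sequential schemes above, a parametric score model plays the role of `true_score` and is refit at each time step against the current batch.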

For conditional generation (e.g., diffusion bridges), score estimation typically leverages a predictor function D_\theta trained to minimize the denoising loss, from which the score is analytically recovered for the PF-ODE update (Wang et al., 2024).

3. Theoretical Guarantees and Convergence Analysis

The PF-ODE, when used with online-trained accurate scores, controls the Kullback–Leibler (KL) divergence between the learned pushforward and the true target law. Specifically,

\frac{d}{dt}\mathrm{KL}(\rho_t\,\|\,\rho^*_t) \leq \frac{1}{2} \int \|s_t(x)-\nabla\log\rho_t(x)\|^2_{D(x,t)}\,\rho_t(x)\,dx,

so uniform score accuracy ensures the pushforward remains close to the true solution (Boffi et al., 2022).

In the context of score-based generative models, polynomial-time convergence guarantees have been established under global Lipschitz assumptions:

  • With underdamped Langevin corrector steps, the PF-ODE attains a total variation distance \leq \epsilon to the data distribution in \widetilde{O}(L^2\sqrt{d}/\epsilon) steps, compared to \widetilde{O}(d/\epsilon^2) for DDPM/VE-SDE approaches (Chen et al., 2023).
  • This improvement in dimension dependence (O(\sqrt{d}) vs. O(d)) arises from the higher regularity of ODE trajectories and the reset/regularization effect of the injected corrector bursts.

In infinite-dimensional Hilbert spaces, Fomin-differentiability and integration-by-parts formulations of the score permit analogous guarantee arguments for PF-ODE pushforwards (Na et al., 13 Mar 2025).

4. Algorithmic Implementations and Solvers

The practical implementation of PF-ODE-based sampling involves the following routine:

  1. Initialize a batch of N samples from the data or noise prior.
  2. For each time step t_k, perform:
    • Score update: Minimize the instantaneous score-matching loss using the current batch of samples.
    • ODE integration: Advance each sample by an explicit step of the PF-ODE using the estimated score.
    • Repeat until the final time T.
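A minimal sketch of this interleaved routine (with the toy assumptions of OU dynamics and a one-parameter linear score model s_t(x) = θx, fitted in closed form rather than by gradient descent) looks like:

```python
import numpy as np

rng = np.random.default_rng(3)

# OU process dx = -x dt + sqrt(2) dW, i.e. f(x) = -x and D = 1.
# Linear score model s_t(x) = theta * x, refit on the current batch each step.
x = rng.normal(0.0, 2.0, size=50_000)   # rho_0 = N(0, 4)
t, dt, T = 0.0, 1e-3, 1.0
while t < T - 1e-12:
    # Score update: for linear s and D = 1 the score-matching loss is
    # E[theta^2 x^2 + 2*theta], minimized at theta = -1 / E[x^2].
    theta = -1.0 / np.mean(x**2)
    # ODE integration: Euler step of dx/dt = f(x) - D * s_t(x).
    x = x + dt * (-x - theta * x)
    t += dt

# Analytic marginal variance of the OU process: v_t = 1 + (v_0 - 1) e^{-2t}.
print(np.var(x), 1.0 + 3.0 * np.exp(-2.0))  # both ≈ 1.406
```

The key feedback mechanism is visible here: the score model is always fitted to the batch produced by the previous integration step, not to a fixed dataset.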

For accelerated inference, higher-order solvers (e.g., Heun/Runge–Kutta) are used. In the bridge setting, a stochastic start is required to avoid the singularity at t=T: the initial point is sampled from the bridge posterior, after which the PF-ODE is integrated backward using a second-order solver (Wang et al., 2024).

A schematic of the ODE sampler with stochastic start:

Step | Operation
Stochastic start | Sample the initial X_\tau from the bridge posterior
Predictor (ODE) | Advance via a Heun/Euler PF-ODE step
Corrector | Apply Heun's corrected drift using fresh score evaluations
Final output | Return X_0 as a sample from the target distribution
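The predictor-corrector (Heun) step in the schematic above can be sketched generically as follows; the test problem is a hypothetical linear drift with a known solution, not a trained score:

```python
import numpy as np

# Heun (second-order) step for the PF-ODE dx/dt = v(x, t).
def heun_step(x, t, dt, v):
    # Predictor: explicit Euler proposal using the drift at the current point.
    x_pred = x + dt * v(x, t)
    # Corrector: average the drift at both endpoints (trapezoidal rule).
    return x + 0.5 * dt * (v(x, t) + v(x_pred, t + dt))

# Linear test problem with known solution: dx/dt = -x  =>  x(T) = x0 e^{-T}.
v = lambda x, t: -x
x, t, dt = np.array([1.0, -2.0]), 0.0, 0.01
for _ in range(100):  # integrate to T = 1
    x = heun_step(x, t, dt, v)
    t += dt
print(x)  # ≈ [e^{-1}, -2 e^{-1}] ≈ [0.3679, -0.7358]
```

Each Heun step costs two drift (score) evaluations but has O(dt^2) global error, which is why second-order solvers reach a given accuracy in far fewer NFEs than Euler.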

ODE inversion enables deterministic bidirectional mapping (up to solver error), supporting applications in editing, restoration, and data likelihood maximization (Delft et al., 2024).
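The invertibility property can be demonstrated directly: integrating the same PF-ODE forward ("encode") and then backward ("decode") returns to the starting point up to solver error. A toy sketch with a hypothetical exact Gaussian score:

```python
import numpy as np

# PF-ODE for the VE SDE dx = g dW with Gaussian data N(0, v0):
# dx/dt = (g^2 / 2) * x / (v0 + g^2 * t).
v0, g2 = 0.25, 2.0
drift = lambda x, t: 0.5 * g2 * x / (v0 + g2 * t)

def integrate(x, t0, t1, n_steps):
    # Euler integration; dt is negative when t1 < t0 (backward pass).
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        x = x + dt * drift(x, t)
        t += dt
    return x

x0 = np.array([0.3, -1.2, 2.0])
latent = integrate(x0, 0.0, 2.0, 4000)       # encode: data -> terminal noise level
x0_rec = integrate(latent, 2.0, 0.0, 4000)   # decode: back along the same ODE
print(np.max(np.abs(x0_rec - x0)))  # round-trip error from discretization only
```

This deterministic round trip is what editing and restoration pipelines exploit: a sample can be mapped to its latent, modified there, and mapped back.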

5. Applications and Empirical Results

Score-based generative modeling: PF-ODE frameworks have demonstrated polynomial-time convergence rates, dimension-efficient complexity scaling, and empirical benefits for sample quality at reduced neural function evaluations (NFEs) (Chen et al., 2023, Wang et al., 2024, Na et al., 13 Mar 2025).

Diffusion bridges and image restoration/translation: By introducing a stochastic start and using second-order solvers for the PF-ODE, state-of-the-art FID is achieved on tasks such as image super-resolution, JPEG restoration, and conditional translation (e.g., Edges→Handbags), with a 2.7×–4× reduction in NFEs relative to SDE-based methods (Wang et al., 2024).

Blind image restoration and editing: The CODE framework exploits PF-ODE invertibility and incorporates confidence-based clipping (CBC) to robustly handle out-of-distribution (OoD) noisy conditionings. Quantitatively, CODE attains lower FID, higher PSNR, and improved source similarity compared to SDE-based solutions. CBC bounds the trajectory within in-distribution confidence intervals, stabilizing severe restorations (Delft et al., 2024).

Function-space generative modeling: The PF-ODE in infinite dimensions enables deterministic sampling of random fields (e.g., PDE solutions) on Hilbert spaces, significantly reducing computational costs relative to SDE-based approaches while preserving quality metrics such as sliced-Wasserstein distance and L^2 error (Na et al., 13 Mar 2025).

6. Extensions, Limitations, and Open Directions

PF-ODEs have been extended to:

  • Infinite-dimensional function spaces: Adapting the score operator to Hilbert–Schmidt structure, permitting application to PDEs and spatial field inference (Na et al., 13 Mar 2025).
  • Conditional and bridge models: Incorporating analytically computed posterior sampling at singular endpoints and implementing plug-and-play compatibility with pretrained diffusion bridge architectures (Wang et al., 2024).

Limitations and research directions include:

  • Need for accurate, globally Lipschitz-constrained score approximators for theoretical complexity guarantees (Chen et al., 2023).
  • Cost and necessity of underdamped Langevin correctors in practice.
  • Analysis of discretization error in both finite- and infinite-dimensional regimes as the time discretization and function basis are refined.
  • Extension to manifold-valued and vector-valued data, as well as incorporation of higher-order solvers and knowledge distillation schemes for accelerated sampling (Na et al., 13 Mar 2025).

7. Comparative Summary

Approach | PF-ODE | SDE-Based (DDPM, etc.)
Dynamics | Deterministic | Stochastic
Marginal law matching | Exact (up to solver error) | Exact (by construction)
Sample path | Invertible | Non-invertible
Complexity (dim d) | \tilde{O}(\sqrt{d}) | \tilde{O}(d)
Restoration applications | Supports editing | Entangles noise/fidelity
Empirical NFE | 15–38 (e.g. ODES³) | 100–118 (state-of-the-art)

The PF-ODE family of methods has become a central technique in modern generative modeling, high-dimensional probabilistic inference, and conditional inverse problems, offering deterministic, scalable, and theoretically principled alternatives to stochastic simulation frameworks (Boffi et al., 2022, Chen et al., 2023, Wang et al., 2024, Na et al., 13 Mar 2025, Delft et al., 2024).
