
Linear Stochastic Interpolants: Theory & Applications

Updated 17 January 2026
  • Linear stochastic interpolants are time-parametrized probability densities that bridge two distributions via linear mixing and optional additive noise.
  • They offer explicit drift and score field characterizations with closed-form marginal laws, supporting robust and efficient algorithmic implementations.
  • Their framework underpins advances in generative modeling, high-dimensional sampling, physical emulation, multitask generation, and derivative‐free optimization with strong theoretical guarantees.

Linear stochastic interpolants define time-parametrized families of probability densities that bridge two prescribed distributions, typically by linearly mixing independent samples and, optionally, additive noise. This paradigm provides a unified theoretical foundation for flow-based, diffusion-based, and score-driven generative modeling, high-dimensional sampling, physical system emulation, multitask generative architectures, and derivative-free optimization. Linear stochastic interpolants admit closed-form marginal laws, explicit characterizations of associated drift and score fields, and support efficient algorithmic implementations with provable statistical guarantees and robust empirical performance.

1. Mathematical Foundations and Definitions

Let $x_0 \sim \mu_0$, $x_1 \sim \mu_1$ be independent (or suitably coupled) random variables on $\mathbb{R}^d$. A linear stochastic interpolant is defined via

$$x_t = \alpha(t)\,x_0 + \beta(t)\,x_1 + \gamma(t)\,z, \qquad z \sim \mathcal{N}(0,I)$$

where $\alpha, \beta, \gamma$ are schedule functions, typically smooth on $[0,1]$, satisfying the boundary conditions $\alpha(0)=1$, $\beta(0)=0$, $\gamma(0)=0$ and $\alpha(1)=0$, $\beta(1)=1$, $\gamma(1)=0$ (Albergo et al., 2023, Zhou et al., 30 Sep 2025). Special cases include:

  • Full interpolant: $\alpha(0)=1$, $\beta(1)=1$, no explicit noise term.
  • Noise-augmented interpolant: $\gamma(t)>0$ in $0 < t < 1$ allows flexible bridging.
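As a concrete illustration, the interpolant can be sampled directly from its definition. The sketch below assumes the common linear schedule $\alpha(t)=1-t$, $\beta(t)=t$ (one valid choice satisfying the boundary conditions, not the only one), with an optional noise schedule supplied by the caller:

```python
import numpy as np

def sample_interpolant(x0, x1, t, gamma=None, rng=None):
    """Draw x_t = alpha(t) x0 + beta(t) x1 + gamma(t) z with the example
    schedule alpha(t) = 1 - t, beta(t) = t; gamma is an optional callable."""
    rng = rng or np.random.default_rng(0)
    alpha, beta = 1.0 - t, t
    xt = alpha * x0 + beta * x1
    if gamma is not None:
        xt = xt + gamma(t) * rng.standard_normal(np.shape(x0))
    return xt

x0 = np.zeros(3)
x1 = np.ones(3)
# Endpoints are recovered exactly at t = 0 and t = 1 (no noise passed here).
assert np.allclose(sample_interpolant(x0, x1, 0.0), x0)
assert np.allclose(sample_interpolant(x0, x1, 1.0), x1)
# Noise-augmented bridge with gamma(t) = sqrt(t(1 - t)), positive on (0, 1):
xt = sample_interpolant(x0, x1, 0.5, gamma=lambda t: np.sqrt(t * (1 - t)))
```

The noise schedule vanishing at both endpoints is what guarantees the bridge hits $\mu_0$ and $\mu_1$ exactly.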

The law of $x_t$ is given by convolving the endpoint distributions with the noise:

$$p_t(x) = \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} \mu_0(dx_0)\,\mu_1(dx_1)\;\mathcal{N}\bigl(x;\, \alpha(t)x_0 + \beta(t)x_1,\, \gamma^2(t)I\bigr)$$

For $\mu_0=\mathcal{N}(m_0,C_0)$, $\mu_1=\mathcal{N}(m_1,C_1)$, this admits a closed-form Gaussian marginal with mean and covariance:

$$\mathbb{E}[x_t] = \alpha(t)m_0 + \beta(t)m_1, \qquad \mathrm{Cov}(x_t) = \alpha^2(t)C_0 + \beta^2(t)C_1 + \gamma^2(t)I$$

(Albergo et al., 2023, George et al., 1 Feb 2025).
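The closed form can be checked numerically against a Monte Carlo estimate. A minimal sketch, in which the endpoint parameters and the schedules (including $\gamma(t)=\sqrt{2t(1-t)}$) are arbitrary example choices satisfying the boundary conditions:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 2
m0, m1 = np.zeros(d), np.full(d, 3.0)
C0, C1 = np.eye(d), 0.5 * np.eye(d)

def schedules(t):
    # Example schedules obeying alpha(0)=1, beta(1)=1, gamma(0)=gamma(1)=0.
    return 1.0 - t, t, np.sqrt(2 * t * (1 - t))

t = 0.3
a, b, g = schedules(t)
mean_t = a * m0 + b * m1                                  # closed-form mean
cov_t = a**2 * C0 + b**2 * C1 + g**2 * np.eye(d)          # closed-form covariance

# Monte Carlo: draw x_t directly from its definition and compare moments.
n = 200_000
x0 = rng.multivariate_normal(m0, C0, n)
x1 = rng.multivariate_normal(m1, C1, n)
z = rng.standard_normal((n, d))
xt = a * x0 + b * x1 + g * z
```

Empirical mean and covariance of `xt` agree with `mean_t` and `cov_t` up to Monte Carlo error.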

2. Probability-Flow, SDE Representation, and Score Field

Linear stochastic interpolants generate deterministic (ODE) or stochastic (SDE) flows whose marginals coincide with the interpolant law for each $t$.

Probability-flow ODE and Fokker–Planck PDE

For density $p_t(x)$, the probability-flow ODE is governed by a drift $b(t,x)$ such that $dX_t = b(t,X_t)\,dt$; for the SDE, additional noise $\sqrt{2\epsilon(t)}\,dW_t$ is injected:

$$dX_t = b_F(t,X_t)\,dt + \sqrt{2\epsilon(t)}\,dW_t$$

with

$$b_F(t,x) = \dot{m}(t) + \left(\tfrac{1}{2}\dot{C}(t) - \epsilon(t)I\right)C(t)^{-1}\,(x - m(t)),$$

where $m(t)$ and $C(t)$ are the mean and covariance of the Gaussian marginal above,

and score

$$s(t,x) = \nabla_x \log p_t(x) = -C(t)^{-1}(x - m(t))$$

(Albergo et al., 2023).
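For scalar Gaussian endpoints these formulas are fully explicit, and integrating the probability-flow ODE ($\epsilon = 0$) with forward Euler transports $\mu_0$ samples onto $\mu_1$. A minimal sketch; the endpoint parameters and the deterministic schedule $\alpha = 1-t$, $\beta = t$, $\gamma \equiv 0$ are assumed example choices:

```python
import numpy as np

m0, m1, C0, C1 = 0.0, 4.0, 1.0, 0.25   # scalar Gaussian endpoints

def m(t):  return (1 - t) * m0 + t * m1
def dm(t): return m1 - m0
def C(t):  return (1 - t)**2 * C0 + t**2 * C1   # gamma is identically 0 here
def dC(t): return -2 * (1 - t) * C0 + 2 * t * C1

def b_F(t, x, eps=0.0):
    # Drift of the marginal-preserving SDE; eps = 0 gives the probability-flow ODE.
    return dm(t) + (0.5 * dC(t) - eps) / C(t) * (x - m(t))

def score(t, x):
    # Exact Gaussian score: -C(t)^{-1} (x - m(t)).
    return -(x - m(t)) / C(t)

# Push mu_0 samples through the probability-flow ODE with forward Euler.
rng = np.random.default_rng(2)
x = m0 + np.sqrt(C0) * rng.standard_normal(50_000)
N = 1000
for i in range(N):
    t = i / N
    x = x + b_F(t, x) / N
```

After integration, the sample mean and variance match $m_1$ and $C_1$ up to Euler and Monte Carlo error.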

For more general (non-Gaussian) endpoints, the drift and score satisfy:

$$b(t,x) = \dot{\alpha}(t)\,\mathbb{E}[x_0 \mid x_t = x] + \dot{\beta}(t)\,\mathbb{E}[x_1 \mid x_t = x] - \gamma(t)\dot{\gamma}(t)\,s(t,x)$$

and

$$s(t,x) = \frac{\alpha(t)\,\mathbb{E}[x_0 \mid x_t = x] + \beta(t)\,\mathbb{E}[x_1 \mid x_t = x] - x}{\gamma^2(t)}$$

(George et al., 1 Feb 2025).

Objective Minimization

Drift and score fields are solutions to mean-square regression problems:

  • Velocity: $\int_0^1 \mathbb{E}\bigl[|\hat{v}(x_t,t) - \dot{x}_t|^2\bigr]\,dt$ is minimized by $v(x,t) = \mathbb{E}[\dot{x}_t \mid x_t=x]$.
  • Score: $\int_0^1 \mathbb{E}\bigl[|\hat{s}(x_t,t) - \nabla_x \log p_t(x_t)|^2\bigr]\,dt$ is uniquely minimized by the exact score field (Albergo et al., 2023).
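The velocity objective can be checked empirically: at fixed $t$, least-squares regression of $\dot{x}_t$ on $x_t$ recovers the conditional expectation, which is available in closed form for Gaussian endpoints. A sketch, assuming $\mu_0 = \mu_1 = \mathcal{N}(0,1)$ with $\alpha = 1-t$, $\beta = t$, $\gamma \equiv 0$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, t = 200_000, 0.3
x0 = rng.standard_normal(n)          # mu_0 = N(0, 1)
x1 = rng.standard_normal(n)          # mu_1 = N(0, 1)
xt = (1 - t) * x0 + t * x1           # alpha = 1 - t, beta = t, gamma = 0
dxt = x1 - x0                        # time derivative of x_t

# Least-squares fit of dxt on xt at fixed t: the minimizer of the quadratic
# velocity objective is the conditional expectation E[dx_t | x_t = x],
# which is linear in x for jointly Gaussian (x_t, dx_t).
slope = np.cov(xt, dxt)[0, 1] / np.var(xt)
closed_form = (2 * t - 1) / ((1 - t)**2 + t**2)   # exact for these Gaussians
```

The fitted regression slope matches the closed-form conditional-expectation coefficient up to Monte Carlo error.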

3. Sampling Algorithms and Machine Learning Approximation

Sampling from high-dimensional or unnormalized target densities via linear stochastic interpolants is operationalized as follows:

  • FBSDE-based learning: Parameterize the value function $u(t,x)$ (log-potential) with a neural network, train by enforcing terminal and local consistency constraints from the forward–backward SDE and backward Kolmogorov/Hamilton–Jacobi–Bellman PDE, and use autodiff for gradients. Learning objectives combine terminal-matching and local consistency terms (George et al., 1 Feb 2025).
  • Monte Carlo velocity estimation: For non-Gaussian targets, conditional expectations like $\mathbb{E}[x_1 \mid x_t = x]$ are estimated online via Langevin diffusion with strong log-concavity or log-Sobolev guarantees; RMSprop-style preconditioning further stabilizes mixing on anisotropic landscapes (Duan et al., 13 Jan 2026).
  • Simulation-free and multitask pipelines: Operator-valued interpolant models are trained by minimizing losses for multipurpose drift functions over a probability measure on operator pairs $(\alpha,\beta)$; once trained, specialized generation tasks (conditional sampling, inpainting, multichannel denoising, posterior sampling, planning) are implemented solely by choosing appropriate $(\alpha_t, \beta_t)$ trajectories and sampling via Euler schemes (Negrel et al., 6 Aug 2025).
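All three pipelines ultimately integrate an SDE or ODE with a learned drift. A generic Euler(–Maruyama) sampler in this spirit is sketched below; the exact Gaussian drift (for $\mathcal{N}(0,1) \to \mathcal{N}(4,1)$ with $\alpha = 1-t$, $\beta = t$, $\gamma \equiv 0$) stands in for a learned network:

```python
import numpy as np

def euler_maruyama(drift, x0_samples, eps=lambda t: 0.0, n_steps=100, rng=None):
    """Integrate dX = b dt + sqrt(2 eps) dW from t = 0 to t = 1 given any
    (possibly learned) drift b(t, x); eps = 0 reduces to a plain Euler ODE step."""
    rng = rng or np.random.default_rng(0)
    x, dt = np.asarray(x0_samples, dtype=float).copy(), 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        noise = np.sqrt(2 * eps(t) * dt) * rng.standard_normal(x.shape)
        x = x + drift(t, x) * dt + noise
    return x

def drift(t, x):
    # Exact probability-flow drift for N(0,1) -> N(4,1): m(t) = 4t,
    # C(t) = (1-t)^2 + t^2, so b = m' + (C'/2C)(x - m).
    C = (1 - t)**2 + t**2
    return 4.0 + (2 * t - 1) / C * (x - 4.0 * t)

samples = euler_maruyama(drift,
                         np.random.default_rng(1).standard_normal(50_000),
                         n_steps=400)
```

With $\epsilon(t) > 0$, the drift must be the $\epsilon$-corrected $b_F$ from Section 2 for the marginals to be preserved.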

4. Theoretical Guarantees and Explicit Error Bounds

Linear stochastic interpolants admit strong regularity, uniqueness, and convergence properties:

  • Existence and regularity: For smooth schedule functions and full-support marginals, the interpolant densities are strictly positive and smooth over $t$. The SDE admits a unique strong solution; empirical regression converges uniformly for bounded Lipschitz settings (Albergo et al., 2023).
  • Statistical control: For SDEs with nonzero diffusion, approximate learned drift and score functions yield explicit upper bounds on Kullback–Leibler divergence to the exact target, proportional to the $L^2$ error in the learned functions (Albergo et al., 2023).
  • Langevin convergence: Under log-Sobolev or strong log-concavity (parameter $\beta_t$), the Langevin diffusion for velocity estimation converges exponentially in KL-divergence, with rates derived from functional inequalities for the conditional law $p_{X_1 \mid X_t = x}$ and the initialization marginals (Duan et al., 13 Jan 2026).
  • Optimization accuracy: In derivative-free optimization, gradient estimates via linear stochastic interpolation (on orthogonal directions) guarantee norm-type accuracy $\|g(x) - \nabla\phi(x)\| \leq \theta\,\|\nabla\phi(x)\|$, with deterministic error scaling (Berahas et al., 2019). Gaussian smoothing requires $\Omega(n/\theta^2)$ samples vs. $n$ for orthogonal directions.
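The orthogonal-directions estimator from the last bullet can be sketched as a forward-difference scheme along a random orthonormal basis; the test function and step size below are illustrative assumptions:

```python
import numpy as np

def orthogonal_fd_gradient(phi, x, h=1e-5):
    """Forward-difference gradient estimate of phi at x along an orthonormal
    basis (QR of a random matrix): n function evaluations plus one at x,
    versus Omega(n / theta^2) samples for Gaussian smoothing."""
    n = x.size
    Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((n, n)))
    f0 = phi(x)
    # Directional derivative of phi along each basis column q_i.
    coeffs = np.array([(phi(x + h * Q[:, i]) - f0) / h for i in range(n)])
    return Q @ coeffs   # rotate directional derivatives back to the standard basis

phi = lambda v: 0.5 * np.sum(v**2) + np.sum(v)   # gradient is v + 1
x = np.array([1.0, -2.0, 0.5])
g = orthogonal_fd_gradient(phi, x)
```

For this quadratic test function the estimate matches $\nabla\phi(x) = x + 1$ to within the $O(h)$ finite-difference bias.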

5. Applications: High-Dimensional Sampling, Physical Systems, Multitask Generation, Optimization

Linear stochastic interpolants have demonstrated versatility and empirical efficacy in diverse domains:

  • Generative modeling and sampling: Algorithms based on stochastic interpolants surpass classical MCMC and neural samplers (e.g., in negative log-likelihood, maximum mean discrepancy, Wasserstein distance) on challenging multi-modal, high-dimensional distributions (Duan et al., 13 Jan 2026, George et al., 1 Feb 2025).
  • Physical emulation and forecasting: Stochastic interpolant samplers exploit proximity between successive states in physics and climate systems, achieving high pointwise and spectral accuracy, uncertainty calibration (lower CRPS, nearly unity SSR), and requiring minimal sampling steps ($N=2$ to $N=5$ versus $50$–$100$ for DDPMs) (Zhou et al., 30 Sep 2025).
  • Multitask generative architectures: Operator-valued interpolants span joint families of marginals, enabling zero-shot inpainting, denoising, posterior inference, and multiscale trajectory modeling without task-specific retraining (Negrel et al., 6 Aug 2025).
  • Optimization and RL policy search: In high-dimensional, noisy, derivative-free optimization, linear interpolation along orthonormal directions achieves superior gradient estimation accuracy and convergence speed compared to Gaussian smoothing, with robust empirical validation on synthetic benchmarks and RL tasks (Berahas et al., 2019).

6. Extensions, Limitations, and Recent Innovations

Recent research contextualizes and extends the linear stochastic interpolant paradigm:

  • Curved interpolant learning: Fixed linear interpolants yield curved vector fields and slow inference; optimizing the interpolant (e.g., via invertible neural networks parameterizing $\varphi_{t,x_1}(x_0)$) to minimize flow curvature recovers straight flows, with ultrafast (few-step) ODE inference of competitive sample fidelity (Shankar et al., 26 Mar 2025).
  • Multimarginal and simplex parameterizations: The simplex generalization enables bridging and correspondence extraction across $K > 2$ marginals, with velocity and score fields derived as minimizers of quadratic objectives defined on the simplex domain (Albergo et al., 2023).
  • Path optimization: Wasserstein path-length minimization via Fourier-expansion parameterization of schedule functions yields more efficient interpolant trajectories (Albergo et al., 2023).
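One convenient property of Fourier parameterization is that sine modes vanish at both endpoints, so the boundary conditions hold for any coefficient vector. A minimal sketch; the specific form $\beta(t) = t + \sum_k c_k \sin(k\pi t)$ is an assumed illustration, not necessarily the paper's exact parameterization:

```python
import numpy as np

def beta(t, coeffs):
    """Fourier-perturbed schedule beta(t) = t + sum_k c_k sin(k pi t).
    Every sine term vanishes at t = 0 and t = 1, so beta(0) = 0 and
    beta(1) = 1 hold regardless of the coefficients being optimized."""
    t = np.asarray(t, dtype=float)
    out = t.copy()
    for k, c in enumerate(coeffs, start=1):
        out += c * np.sin(k * np.pi * t)
    return out

c = [0.1, -0.05]   # free coefficients a path optimizer could tune
```

Path-length minimization then reduces to an unconstrained search over the coefficient vector.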

A plausible implication is that further generalization of operator-valued and nonlinear stochastic interpolants may enable even broader multitask and ensemble modeling capabilities, subject to appropriate theoretical and computational constraints.

7. Comparative Analysis and Representative Empirical Results

Empirical studies consistently support the statistical and computational advantages of linear stochastic interpolants:

| Application Domain | Core Metric(s) | Linear SI Performance |
| --- | --- | --- |
| Multimodal sampling | MMD, $W_2$, NLL | Lowest MMD, optimal $W_2$ (SSI) |
| Physical emulation | VRMSE, SRMSE, CRPS, SSR | Fewer steps, lower calibration error |
| Multitask generation | PSNR, SSIM | $\Delta$PSNR/SSIM $\gg$ degraded baseline |
| Derivative-free opt. (RL) | Gradient error, reward, convergence speed | Fastest convergence, smallest norm error |
| Image generation (curved flows) | FID (NFE $=2$–$6$) | Superior FID at few steps |

This suggests that linear stochastic interpolants, especially when paired with efficient learning or Monte Carlo estimation of velocity and score fields, represent both a theoretically sound and empirically validated foundation for modern generative modeling, physical system emulation, multitask architectures, and black-box optimization (George et al., 1 Feb 2025, Duan et al., 13 Jan 2026, Negrel et al., 6 Aug 2025, Zhou et al., 30 Sep 2025, Berahas et al., 2019, Shankar et al., 26 Mar 2025, Albergo et al., 2023, Albergo et al., 2023).
