
Linear Stochastic Interpolants: Theory & Applications

Updated 17 January 2026
  • Linear stochastic interpolants are time-parametrized probability densities that bridge two distributions via linear mixing and optional additive noise.
  • They offer explicit drift and score field characterizations with closed-form marginal laws, supporting robust and efficient algorithmic implementations.
  • Their framework underpins advances in generative modeling, high-dimensional sampling, physical emulation, multitask generation, and derivative‐free optimization with strong theoretical guarantees.

Linear stochastic interpolants define time-parametrized families of probability densities that bridge two prescribed distributions, typically by linearly mixing independent samples and, optionally, additive noise. This paradigm provides a unified theoretical foundation for flow-based, diffusion-based, and score-driven generative modeling, high-dimensional sampling, physical system emulation, multitask generative architectures, and derivative-free optimization. Linear stochastic interpolants admit closed-form marginal laws, explicit characterizations of associated drift and score fields, and support efficient algorithmic implementations with provable statistical guarantees and robust empirical performance.

1. Mathematical Foundations and Definitions

Let $x_0 \sim \mu_0$, $x_1 \sim \mu_1$ be independent (or suitably coupled) random variables on $\mathbb{R}^d$. A linear stochastic interpolant is defined via

$$x_t = \alpha(t)\,x_0 + \beta(t)\,x_1 + \gamma(t)\,z, \qquad z \sim \mathcal{N}(0,I)$$

where $\alpha, \beta, \gamma$ are schedule functions, typically smooth on $[0,1]$, satisfying the boundary conditions $\alpha(0)=1$, $\beta(0)=0$, $\gamma(0)=0$ and $\alpha(1)=0$, $\beta(1)=1$, $\gamma(1)=0$ (Albergo et al., 2023, Zhou et al., 30 Sep 2025). Special cases include:

  • Full interpolant: $\alpha(0)=1$, $\beta(1)=1$, no explicit noise term.
  • Noise-augmented interpolant: $\gamma(t)>0$ in $0 < t < 1$ allows flexible bridging.
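As a concrete illustration, the interpolant can be sampled directly from its definition. The sketch below assumes the common linear schedule $\alpha(t)=1-t$, $\beta(t)=t$ (one valid choice satisfying the boundary conditions, not the only one), with an optional noise schedule supplied by the caller:

```python
import numpy as np

def sample_interpolant(x0, x1, t, gamma=None, rng=None):
    """Draw x_t = alpha(t) x0 + beta(t) x1 + gamma(t) z with the example
    schedule alpha(t) = 1 - t, beta(t) = t; gamma is an optional callable."""
    rng = rng or np.random.default_rng(0)
    alpha, beta = 1.0 - t, t
    xt = alpha * x0 + beta * x1
    if gamma is not None:
        xt = xt + gamma(t) * rng.standard_normal(np.shape(x0))
    return xt

x0 = np.zeros(3)
x1 = np.ones(3)
# Endpoints are recovered exactly at t = 0 and t = 1 (no noise passed here).
assert np.allclose(sample_interpolant(x0, x1, 0.0), x0)
assert np.allclose(sample_interpolant(x0, x1, 1.0), x1)
# Noise-augmented bridge with gamma(t) = sqrt(t(1 - t)), positive on (0, 1):
xt = sample_interpolant(x0, x1, 0.5, gamma=lambda t: np.sqrt(t * (1 - t)))
```

The noise schedule vanishing at both endpoints is what guarantees the bridge hits $\mu_0$ and $\mu_1$ exactly.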

The law of $x_t$ is given by convolving the endpoint distributions with the noise:

$$p_t(x) = \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} \mu_0(dx_0)\,\mu_1(dx_1)\;\mathcal{N}\bigl(x;\, \alpha(t)x_0 + \beta(t)x_1,\, \gamma^2(t)I\bigr)$$

For $\mu_0=\mathcal{N}(m_0,C_0)$, $\mu_1=\mathcal{N}(m_1,C_1)$, this admits a closed-form Gaussian marginal with mean and covariance:

$$\mathbb{E}[x_t] = \alpha(t)m_0 + \beta(t)m_1, \qquad \mathrm{Cov}(x_t) = \alpha^2(t)C_0 + \beta^2(t)C_1 + \gamma^2(t)I$$

(Albergo et al., 2023, George et al., 1 Feb 2025).
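The closed form can be checked numerically against a Monte Carlo estimate. A minimal sketch, in which the endpoint parameters and the schedules (including $\gamma(t)=\sqrt{2t(1-t)}$) are arbitrary example choices satisfying the boundary conditions:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 2
m0, m1 = np.zeros(d), np.full(d, 3.0)
C0, C1 = np.eye(d), 0.5 * np.eye(d)

def schedules(t):
    # Example schedules obeying alpha(0)=1, beta(1)=1, gamma(0)=gamma(1)=0.
    return 1.0 - t, t, np.sqrt(2 * t * (1 - t))

t = 0.3
a, b, g = schedules(t)
mean_t = a * m0 + b * m1                                  # closed-form mean
cov_t = a**2 * C0 + b**2 * C1 + g**2 * np.eye(d)          # closed-form covariance

# Monte Carlo: draw x_t directly from its definition and compare moments.
n = 200_000
x0 = rng.multivariate_normal(m0, C0, n)
x1 = rng.multivariate_normal(m1, C1, n)
z = rng.standard_normal((n, d))
xt = a * x0 + b * x1 + g * z
```

Empirical mean and covariance of `xt` agree with `mean_t` and `cov_t` up to Monte Carlo error.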

2. Probability-Flow, SDE Representation, and Score Field

Linear stochastic interpolants generate deterministic (ODE) or stochastic (SDE) flows whose marginals coincide with the interpolant law for each $t$.

Probability-flow ODE and Fokker–Planck PDE

For density $p_t(x)$, the probability-flow ODE is governed by a drift $b(t,x)$ such that $dX_t = b(t,X_t)\,dt$; for the SDE, additional noise $\sqrt{2\epsilon(t)}\,dW_t$ is injected:

$$dX_t = b_F(t,X_t)\,dt + \sqrt{2\epsilon(t)}\,dW_t$$

with

$$b_F(t,x) = \dot{m}(t) + \left(\tfrac{1}{2}\dot{C}(t) - \epsilon(t)I\right)C(t)^{-1}\,(x - m(t)),$$

where $m(t)$ and $C(t)$ are the mean and covariance of the Gaussian marginal above,

and score

$$s(t,x) = \nabla_x \log p_t(x) = -C(t)^{-1}(x - m(t))$$

(Albergo et al., 2023).
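For scalar Gaussian endpoints these formulas are fully explicit, and integrating the probability-flow ODE ($\epsilon = 0$) with forward Euler transports $\mu_0$ samples onto $\mu_1$. A minimal sketch; the endpoint parameters and the deterministic schedule $\alpha = 1-t$, $\beta = t$, $\gamma \equiv 0$ are assumed example choices:

```python
import numpy as np

m0, m1, C0, C1 = 0.0, 4.0, 1.0, 0.25   # scalar Gaussian endpoints

def m(t):  return (1 - t) * m0 + t * m1
def dm(t): return m1 - m0
def C(t):  return (1 - t)**2 * C0 + t**2 * C1   # gamma is identically 0 here
def dC(t): return -2 * (1 - t) * C0 + 2 * t * C1

def b_F(t, x, eps=0.0):
    # Drift of the marginal-preserving SDE; eps = 0 gives the probability-flow ODE.
    return dm(t) + (0.5 * dC(t) - eps) / C(t) * (x - m(t))

def score(t, x):
    # Exact Gaussian score: -C(t)^{-1} (x - m(t)).
    return -(x - m(t)) / C(t)

# Push mu_0 samples through the probability-flow ODE with forward Euler.
rng = np.random.default_rng(2)
x = m0 + np.sqrt(C0) * rng.standard_normal(50_000)
N = 1000
for i in range(N):
    t = i / N
    x = x + b_F(t, x) / N
```

After integration, the sample mean and variance match $m_1$ and $C_1$ up to Euler and Monte Carlo error.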

For more general (non-Gaussian) endpoints, the drift and score satisfy:

$$b(t,x) = \dot{\alpha}(t)\,\mathbb{E}[x_0 \mid x_t = x] + \dot{\beta}(t)\,\mathbb{E}[x_1 \mid x_t = x] - \gamma(t)\dot{\gamma}(t)\,s(t,x)$$

and

$$s(t,x) = \frac{\alpha(t)\,\mathbb{E}[x_0 \mid x_t = x] + \beta(t)\,\mathbb{E}[x_1 \mid x_t = x] - x}{\gamma^2(t)}$$

(George et al., 1 Feb 2025).

Objective Minimization

Drift and score fields are solutions to mean-square regression problems:

  • Velocity: $\int_0^1 \mathbb{E}\bigl[|\hat{v}(x_t,t) - \dot{x}_t|^2\bigr]\,dt$ is minimized by $v(x,t) = \mathbb{E}[\dot{x}_t \mid x_t=x]$.
  • Score: $\int_0^1 \mathbb{E}\bigl[|\hat{s}(x_t,t) - \nabla_x \log p_t(x_t)|^2\bigr]\,dt$ is uniquely minimized by the exact score field (Albergo et al., 2023).
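The velocity objective can be checked empirically: at fixed $t$, least-squares regression of $\dot{x}_t$ on $x_t$ recovers the conditional expectation, which is available in closed form for Gaussian endpoints. A sketch, assuming $\mu_0 = \mu_1 = \mathcal{N}(0,1)$ with $\alpha = 1-t$, $\beta = t$, $\gamma \equiv 0$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, t = 200_000, 0.3
x0 = rng.standard_normal(n)          # mu_0 = N(0, 1)
x1 = rng.standard_normal(n)          # mu_1 = N(0, 1)
xt = (1 - t) * x0 + t * x1           # alpha = 1 - t, beta = t, gamma = 0
dxt = x1 - x0                        # time derivative of x_t

# Least-squares fit of dxt on xt at fixed t: the minimizer of the quadratic
# velocity objective is the conditional expectation E[dx_t | x_t = x],
# which is linear in x for jointly Gaussian (x_t, dx_t).
slope = np.cov(xt, dxt)[0, 1] / np.var(xt)
closed_form = (2 * t - 1) / ((1 - t)**2 + t**2)   # exact for these Gaussians
```

The fitted regression slope matches the closed-form conditional-expectation coefficient up to Monte Carlo error.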

3. Sampling Algorithms and Machine Learning Approximation

Sampling from high-dimensional or unnormalized target densities via linear stochastic interpolants is operationalized as follows:

  • FBSDE-based learning: Parameterize the value function $u(t,x)$ (log-potential) with a neural network, train by enforcing terminal and local consistency constraints from the forward–backward SDE and backward Kolmogorov/Hamilton–Jacobi–Bellman PDE, and use autodiff for gradients. Learning objectives combine terminal-matching and local consistency terms (George et al., 1 Feb 2025).
  • Monte Carlo velocity estimation: For non-Gaussian targets, conditional expectations like $\mathbb{E}[x_1 \mid x_t = x]$ are estimated online via Langevin diffusion with strong log-concavity or log-Sobolev guarantees; RMSprop-style preconditioning further stabilizes mixing on anisotropic landscapes (Duan et al., 13 Jan 2026).
  • Simulation-free and multitask pipelines: Operator-valued interpolant models are trained by minimizing losses for multipurpose drift functions over a probability measure on operator pairs $(\alpha,\beta)$; once trained, specialized generation tasks (conditional sampling, inpainting, multichannel denoising, posterior sampling, planning) are implemented solely by choosing appropriate $(\alpha_t, \beta_t)$ trajectories and sampling via Euler schemes (Negrel et al., 6 Aug 2025).
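All three pipelines ultimately integrate an SDE or ODE with a learned drift. A generic Euler(–Maruyama) sampler in this spirit is sketched below; the exact Gaussian drift (for $\mathcal{N}(0,1) \to \mathcal{N}(4,1)$ with $\alpha = 1-t$, $\beta = t$, $\gamma \equiv 0$) stands in for a learned network:

```python
import numpy as np

def euler_maruyama(drift, x0_samples, eps=lambda t: 0.0, n_steps=100, rng=None):
    """Integrate dX = b dt + sqrt(2 eps) dW from t = 0 to t = 1 given any
    (possibly learned) drift b(t, x); eps = 0 reduces to a plain Euler ODE step."""
    rng = rng or np.random.default_rng(0)
    x, dt = np.asarray(x0_samples, dtype=float).copy(), 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        noise = np.sqrt(2 * eps(t) * dt) * rng.standard_normal(x.shape)
        x = x + drift(t, x) * dt + noise
    return x

def drift(t, x):
    # Exact probability-flow drift for N(0,1) -> N(4,1): m(t) = 4t,
    # C(t) = (1-t)^2 + t^2, so b = m' + (C'/2C)(x - m).
    C = (1 - t)**2 + t**2
    return 4.0 + (2 * t - 1) / C * (x - 4.0 * t)

samples = euler_maruyama(drift,
                         np.random.default_rng(1).standard_normal(50_000),
                         n_steps=400)
```

With $\epsilon(t) > 0$, the drift must be the $\epsilon$-corrected $b_F$ from Section 2 for the marginals to be preserved.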

4. Theoretical Guarantees and Explicit Error Bounds

Linear stochastic interpolants admit strong regularity, uniqueness, and convergence properties:

  • Existence and regularity: For smooth schedule functions and full-support marginals, the interpolant densities are strictly positive and smooth over $t$. The SDE admits a unique strong solution; empirical regression converges uniformly for bounded Lipschitz settings (Albergo et al., 2023).
  • Statistical control: For SDEs with nonzero diffusion, approximate learned drift and score functions yield explicit upper bounds on Kullback–Leibler divergence to the exact target, proportional to the $L^2$ error in the learned functions (Albergo et al., 2023).
  • Langevin convergence: Under log-Sobolev or strong log-concavity (parameter $\beta_t$), the Langevin diffusion for velocity estimation converges exponentially in KL-divergence, with rates derived from functional inequalities for the conditional law $p_{X_1 \mid X_t = x}$ and the initialization marginals (Duan et al., 13 Jan 2026).
  • Optimization accuracy: In derivative-free optimization, gradient estimates via linear stochastic interpolation (on orthogonal directions) guarantee norm-type accuracy $\|g(x) - \nabla\phi(x)\| \leq \theta\,\|\nabla\phi(x)\|$, with deterministic error scaling (Berahas et al., 2019). Gaussian smoothing requires $\Omega(n/\theta^2)$ samples vs. $n$ for orthogonal directions.
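The orthogonal-directions estimator from the last bullet can be sketched as a forward-difference scheme along a random orthonormal basis; the test function and step size below are illustrative assumptions:

```python
import numpy as np

def orthogonal_fd_gradient(phi, x, h=1e-5):
    """Forward-difference gradient estimate of phi at x along an orthonormal
    basis (QR of a random matrix): n function evaluations plus one at x,
    versus Omega(n / theta^2) samples for Gaussian smoothing."""
    n = x.size
    Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((n, n)))
    f0 = phi(x)
    # Directional derivative of phi along each basis column q_i.
    coeffs = np.array([(phi(x + h * Q[:, i]) - f0) / h for i in range(n)])
    return Q @ coeffs   # rotate directional derivatives back to the standard basis

phi = lambda v: 0.5 * np.sum(v**2) + np.sum(v)   # gradient is v + 1
x = np.array([1.0, -2.0, 0.5])
g = orthogonal_fd_gradient(phi, x)
```

For this quadratic test function the estimate matches $\nabla\phi(x) = x + 1$ to within the $O(h)$ finite-difference bias.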

5. Applications: High-Dimensional Sampling, Physical Systems, Multitask Generation, Optimization

Linear stochastic interpolants have demonstrated versatility and empirical efficacy in diverse domains:

  • Generative modeling and sampling: Algorithms based on stochastic interpolants surpass classical MCMC and neural samplers (e.g., in negative log-likelihood, maximum mean discrepancy, Wasserstein distance) on challenging multi-modal, high-dimensional distributions (Duan et al., 13 Jan 2026, George et al., 1 Feb 2025).
  • Physical emulation and forecasting: Stochastic interpolant samplers exploit proximity between successive states in physics and climate systems, achieving high pointwise and spectral accuracy, uncertainty calibration (lower CRPS, nearly unity SSR), and requiring minimal sampling steps ($N=2$ to $N=5$ versus $50$–$100$ for DDPMs) (Zhou et al., 30 Sep 2025).
  • Multitask generative architectures: Operator-valued interpolants span joint families of marginals, enabling zero-shot inpainting, denoising, posterior inference, and multiscale trajectory modeling without task-specific retraining (Negrel et al., 6 Aug 2025).
  • Optimization and RL policy search: In high-dimensional, noisy, derivative-free optimization, linear interpolation along orthonormal directions achieves superior gradient estimation accuracy and convergence speed compared to Gaussian smoothing, with robust empirical validation on synthetic benchmarks and RL tasks (Berahas et al., 2019).

6. Extensions, Limitations, and Recent Innovations

Recent research contextualizes and extends the linear stochastic interpolant paradigm:

  • Curved interpolant learning: Fixed linear interpolants yield curved vector fields and slow inference; optimizing the interpolant (e.g., via invertible neural networks parameterizing $\varphi_{t,x_1}(x_0)$) to minimize flow curvature recovers straight flows, with ultrafast (few-step) ODE inference of competitive sample fidelity (Shankar et al., 26 Mar 2025).
  • Multimarginal and simplex parameterizations: The simplex generalization enables bridging and correspondence extraction across $K > 2$ marginals, with velocity and score fields derived as minimizers of quadratic objectives defined on the simplex domain (Albergo et al., 2023).
  • Path optimization: Wasserstein path-length minimization via Fourier-expansion parameterization of schedule functions yields more efficient interpolant trajectories (Albergo et al., 2023).
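One convenient property of Fourier parameterization is that sine modes vanish at both endpoints, so the boundary conditions hold for any coefficient vector. A minimal sketch; the specific form $\beta(t) = t + \sum_k c_k \sin(k\pi t)$ is an assumed illustration, not necessarily the paper's exact parameterization:

```python
import numpy as np

def beta(t, coeffs):
    """Fourier-perturbed schedule beta(t) = t + sum_k c_k sin(k pi t).
    Every sine term vanishes at t = 0 and t = 1, so beta(0) = 0 and
    beta(1) = 1 hold regardless of the coefficients being optimized."""
    t = np.asarray(t, dtype=float)
    out = t.copy()
    for k, c in enumerate(coeffs, start=1):
        out += c * np.sin(k * np.pi * t)
    return out

c = [0.1, -0.05]   # free coefficients a path optimizer could tune
```

Path-length minimization then reduces to an unconstrained search over the coefficient vector.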

A plausible implication is that further generalization of operator-valued and nonlinear stochastic interpolants may enable even broader multitask and ensemble modeling capabilities, subject to appropriate theoretical and computational constraints.

7. Comparative Analysis and Representative Empirical Results

Empirical studies consistently support the statistical and computational advantages of linear stochastic interpolants:

| Application Domain | Core Metric(s) | Linear SI Performance |
| --- | --- | --- |
| Multimodal sampling | MMD, $W_2$, NLL | Lowest MMD, optimal $W_2$ (SSI) |
| Physical emulation | VRMSE, SRMSE, CRPS, SSR | Fewer steps, lower calibration error |
| Multitask generation | PSNR, SSIM | $\Delta$PSNR/SSIM $\gg$ degraded baseline |
| Derivative-free opt. (RL) | Gradient error, reward, convergence speed | Fastest convergence, smallest norm error |
| Image generation (curved flows) | FID (NFE $=2$–$6$) | Superior FID at few steps |

This suggests that linear stochastic interpolants, especially when paired with efficient learning or Monte Carlo estimation of velocity and score fields, represent both a theoretically sound and empirically validated foundation for modern generative modeling, physical system emulation, multitask architectures, and black-box optimization (George et al., 1 Feb 2025, Duan et al., 13 Jan 2026, Negrel et al., 6 Aug 2025, Zhou et al., 30 Sep 2025, Berahas et al., 2019, Shankar et al., 26 Mar 2025, Albergo et al., 2023, Albergo et al., 2023).
