Langevin Samplers Overview
- Langevin samplers are a class of MCMC methods that use stochastic differential equations to sample complex, high-dimensional probability distributions.
- They incorporate nonreversible, geometric, and preconditioned approaches that improve mixing speed, reduce variance, and enhance convergence.
- Practical implementations include ULA, MALA, and underdamped variants, enabling robust applications in Bayesian inference, machine learning, and physics.
Langevin samplers are a broad class of Markov Chain Monte Carlo (MCMC) methods that utilize stochastic differential equations (SDEs), specifically Langevin-type diffusions, to facilitate efficient sampling from complex, high-dimensional probability distributions. These methods have become central to computational Bayesian inference, statistical mechanics, machine learning, and computational physics—especially when the target measure is only known up to normalization and exhibits nontrivial geometry or multimodality. Recent developments have produced numerous samplers that generalize or enhance the basic Langevin approach, notably via geometric and nonreversible perturbations, coupling with auxiliary variables, and algorithmic adaptations for discrete, non-Euclidean, or high-dimensional spaces.
1. Theoretical Foundations and Classical Langevin Samplers
The canonical setting is sampling from a probability density $\pi(x) \propto e^{-U(x)}$ on $\mathbb{R}^d$, where $U$ is a sufficiently regular potential. The foundational algorithm is the overdamped Langevin SDE
$$dX_t = -\nabla U(X_t)\,dt + \sqrt{2}\,dW_t,$$
which is ergodic with respect to $\pi$ under mild conditions (smoothness and a confining potential ensuring a Poincaré inequality) (Duncan et al., 2015). Its time discretization with step size $h > 0$ is the Unadjusted Langevin Algorithm (ULA),
$$X_{k+1} = X_k - h\,\nabla U(X_k) + \sqrt{2h}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I),$$
which has a biased invariant measure for any fixed $h$, but can be corrected with a Metropolis–Hastings adjustment (the Metropolis-adjusted Langevin Algorithm, MALA) to achieve exactness.
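As a concrete illustration, here is a minimal NumPy sketch of ULA and MALA; the function names and the standard-Gaussian test target are our own choices, not taken from the cited papers.

```python
import numpy as np

def ula_step(x, grad_U, h, rng):
    """One ULA step: x' = x - h * grad_U(x) + sqrt(2h) * xi, xi ~ N(0, I)."""
    return x - h * grad_U(x) + np.sqrt(2 * h) * rng.standard_normal(x.shape)

def mala_step(x, U, grad_U, h, rng):
    """One MALA step: a ULA proposal followed by a Metropolis-Hastings test."""
    def log_q(xp, xc):  # log density (up to a constant) of the proposal xc -> xp
        return -np.sum((xp - xc + h * grad_U(xc)) ** 2) / (4 * h)
    y = ula_step(x, grad_U, h, rng)
    log_alpha = U(x) - U(y) + log_q(x, y) - log_q(y, x)
    return y if np.log(rng.uniform()) < log_alpha else x

# Hypothetical test target: standard Gaussian, U(x) = |x|^2 / 2.
rng = np.random.default_rng(0)
U = lambda x: 0.5 * np.sum(x ** 2)
grad_U = lambda x: x
x = np.zeros(2)
samples = []
for _ in range(20000):
    x = mala_step(x, U, grad_U, h=0.5, rng=rng)
    samples.append(x.copy())
samples = np.array(samples)[5000:]  # discard burn-in; covariance should be near I
```

Because MALA corrects the ULA proposal, the empirical moments converge to those of $\pi$ without step-size bias, at the cost of occasional rejections.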
Underdamped (kinetic) Langevin samplers extend this to phase space with auxiliary momenta:
$$dX_t = V_t\,dt, \qquad dV_t = -\nabla U(X_t)\,dt - \gamma V_t\,dt + \sqrt{2\gamma}\,dW_t,$$
preserving the Gibbs distribution $\pi(x, v) \propto e^{-U(x) - |v|^2/2}$ (Duncan et al., 2017, Schuh et al., 2024). Discretizations—Euler and splitting methods (BAOAB/OBABO)—permit exact inference when equipped with rejection sampling, and yield better scaling in high dimensions.
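The BAOAB splitting can be sketched in a few lines; this is a minimal implementation under our own conventions (unit mass, exact Ornstein–Uhlenbeck substep), with a harmonic test target chosen for illustration.

```python
import numpy as np

def baoab_step(x, v, grad_U, h, gamma, rng):
    """One BAOAB step for dX = V dt, dV = -grad_U(X) dt - gamma*V dt + sqrt(2*gamma) dW."""
    v = v - 0.5 * h * grad_U(x)                      # B: half momentum kick
    x = x + 0.5 * h * v                              # A: half position drift
    c = np.exp(-gamma * h)                           # O: exact Ornstein-Uhlenbeck step
    v = c * v + np.sqrt(1.0 - c * c) * rng.standard_normal(v.shape)
    x = x + 0.5 * h * v                              # A: half position drift
    v = v - 0.5 * h * grad_U(x)                      # B: half momentum kick
    return x, v

# Hypothetical harmonic test target U(q) = q^2/2; positions should
# equilibrate to a unit-variance Gaussian up to small step-size bias.
rng = np.random.default_rng(1)
grad_U = lambda q: q
x, v = np.zeros(1), np.zeros(1)
xs = []
for _ in range(20000):
    x, v = baoab_step(x, v, grad_U, h=0.2, gamma=1.0, rng=rng)
    xs.append(x[0])
xs = np.array(xs)[2000:]
```

The symmetric B-A-O-A-B ordering is what gives the scheme its favorable configurational accuracy relative to plain Euler discretization.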
2. Nonreversible, Irreversible, and Variance-Reducing Dynamics
Standard Langevin dynamics are reversible with respect to $\pi$, enforcing detailed balance. Introducing nonreversible drifts—vector fields $\gamma$ such that $\nabla \cdot (\gamma \pi) = 0$—breaks time-reversal symmetry without altering the invariant measure. A typical choice is $\gamma = J \nabla U$, where $J = -J^{\top}$ is skew-symmetric (Duncan et al., 2015, Duncan et al., 2017).
Nonreversible dynamics provide the following advantages:
- Variance Reduction: For any observable $f$, the asymptotic variance of the ergodic average estimator strictly decreases as the nonreversibility strength increases, except for observable-specific exceptional cases (Duncan et al., 2015, Rey-Bellet et al., 2014, Rey-Bellet et al., 2014).
- Faster Mixing: The spectral gap of the generator improves monotonically with nonreversibility, providing hypocoercivity and exponential convergence to equilibrium (Duncan et al., 2015, Rey-Bellet et al., 2014).
- Enhanced Exploration: In multimodal or "narrow-channel" problems, nonreversible perturbations dramatically accelerate barrier crossing and yield lower estimator mean square error by orders of magnitude.
The limiting behavior under strong nonreversibility (perturbation strength $\alpha \to \infty$ in the drift $-\nabla U + \alpha J \nabla U$) can be interpreted via averaging theory: fast motion along the level sets of $U$ induces an effective slow diffusion on a "potential graph" defined by the topology of $U$, reducing the effective dimensionality and further collapsing the variance (Rey-Bellet et al., 2014, Lu et al., 2016).
Numerically, these benefits are best realized using integration schemes—such as Lie–Trotter or Strang splitting—that alternate between the reversible and irreversible parts, and include appropriate stepsize control to balance bias and stability (Duncan et al., 2017, Duncan et al., 2015). Explicit pseudocode for such integrators is given in (Duncan et al., 2017).
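A Lie–Trotter step of this kind can be sketched as follows; this is our own simplified composition (one ULA substep for the reversible part, an explicit midpoint substep for the divergence-free drift $\alpha J \nabla U$), not the exact integrators of the cited papers.

```python
import numpy as np

# Lie-Trotter splitting for the nonreversible overdamped Langevin SDE
#   dX = (-grad U + alpha * J grad U)(X) dt + sqrt(2) dW,   J = -J^T,
# alternating a reversible ULA substep with an explicit midpoint substep
# for the divergence-free irreversible drift.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # skew-symmetric perturbation matrix

def nonrev_step(x, grad_U, h, alpha, rng):
    x = x - h * grad_U(x) + np.sqrt(2 * h) * rng.standard_normal(2)  # reversible part
    xm = x + 0.5 * h * alpha * (J @ grad_U(x))                       # midpoint predictor
    return x + h * alpha * (J @ grad_U(xm))                          # irreversible part

# Sanity check on a standard Gaussian: the skew drift alpha*J*x rotates
# level sets and leaves N(0, I) invariant.
rng = np.random.default_rng(3)
grad_U = lambda x: x
x = np.zeros(2)
xs = []
for _ in range(40000):
    x = nonrev_step(x, grad_U, h=0.05, alpha=2.0, rng=rng)
    xs.append(x.copy())
xs = np.array(xs)[4000:]
```

Splitting keeps the irreversible flow nearly measure-preserving at moderate $\alpha h$, whereas naive Euler on the combined drift degrades faster as $\alpha$ grows.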
3. Geometric, Riemannian, and Preconditioned Approaches
Many sampling problems suffer from ill-conditioning and anisotropic geometry. Geometric variants of Langevin samplers precondition the diffusion by a position-dependent metric tensor $G(x)$, e.g., based on the Hessian, the Fisher information, or transport maps (Srinivasan et al., 2024, Kleppe, 2015, Zhang et al., 2023):
- Riemann Manifold Langevin Dynamics (RMLD/MALA): the preconditioned SDE
  $$dX_t = -G(X_t)^{-1} \nabla U(X_t)\,dt + \Gamma(X_t)\,dt + \sqrt{2}\,G(X_t)^{-1/2}\,dW_t,$$
  where $\Gamma$ collects the divergence-correction terms needed to guarantee invariance with respect to $\pi$ (Zhang et al., 2023, Kleppe, 2015).
- Preconditioned MALA and MAPLA: Preconditioning by self-concordant or barrier-based metrics can significantly improve mixing times and dimension dependence (Srinivasan et al., 2024).
Efficient adaptation of local step size and curvature estimation (e.g., adaptive Cholesky-factor Hessian metrics, energy error diagnostics) can address numerical instability in flat or poorly conditioned regions and automate tuning (Kleppe, 2015).
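In the simplest special case the metric is constant, the divergence correction vanishes, and preconditioned ULA reduces to a linear change of variables; the sketch below (our own example, with a hypothetical ill-conditioned Gaussian target) shows how a well-chosen constant preconditioner equalizes the coordinate timescales.

```python
import numpy as np

def pula_step(x, grad_U, h, M, L, rng):
    """Preconditioned ULA with a constant SPD metric inverse M = L @ L.T:
    x' = x - h * M * grad_U(x) + sqrt(2h) * L * xi."""
    return x - h * M @ grad_U(x) + np.sqrt(2 * h) * L @ rng.standard_normal(x.shape)

# Hypothetical badly conditioned Gaussian, U(x) = 0.5 x^T P x, P = diag(1, 100);
# choosing M = P^{-1} makes both coordinates relax at the same rate.
P = np.diag([1.0, 100.0])
M = np.linalg.inv(P)
L = np.linalg.cholesky(M)
grad_U = lambda x: P @ x
rng = np.random.default_rng(2)
x = np.zeros(2)
xs = []
for _ in range(30000):
    x = pula_step(x, grad_U, h=0.1, M=M, L=L, rng=rng)
    xs.append(x.copy())
xs = np.array(xs)[3000:]
# Marginal variances should approach 1 and 0.01 respectively (up to O(h) bias).
```

Without preconditioning, the stable step size is dictated by the stiffest direction ($h \lesssim 1/100$ here), so mixing along the slow direction is roughly a hundred times slower.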
Transport-map-based Langevin samplers employ learned invertible maps to "Gaussianize" the target, after which standard Langevin algorithms are applied, possibly with geometric correction terms. This approach can yield lower stationary bias and faster convergence in non-Gaussian and multimodal scenarios, subject to the accuracy and representation power of the map (Zhang et al., 2023).
4. Langevin Samplers for Discrete, Heavy-Tailed, and Non-Euclidean Spaces
Recent advances have broadened the applicability of Langevin samplers to nontraditional domains:
- Discrete Spaces: By constructing discretized analogues of Langevin proposals, such as discrete Langevin proposals (DLP) (Zhang et al., 2022) or gradient-flow-based Discrete LMC (DLMC) (Sun et al., 2022), gradient-informed parallel-updating samplers can be implemented for high-dimensional binary/categorical models. These approaches enable efficient global moves, improved mixing, and provable convergence in log-quadratic (or near-quadratic) settings, outperforming classical Gibbs or block-Gibbs samplers.
- Heavy-Tailed Distributions: For targets with only polynomially decaying tails, the Transformed ULA (TULA) (He et al., 2022) employs explicit diffeomorphisms to map the heavy-tailed target into a light-tailed surrogate, enabling application of standard Langevin methods and establishing polynomial oracle complexity bounds.
- Non-Euclidean Manifolds: Sampling on constrained or product spaces (hypercubes, tori, spheres) is addressed by metrics, reparameterizations, or kernel-smoothing techniques that respect geometric structure, as in the Langevin birth–death sampler's support for hypercube/hypertorus domains (Leviyev et al., 2 Sep 2025).
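To make the discrete case concrete, here is a sketch in the spirit of gradient-informed discrete proposals: a coordinate-factorized flip proposal driven by the gradient of the continuous extension of $\log \pi$, with a Metropolis–Hastings correction. The weight formula and the spin parameterization are our own simplification; the proposals in the cited papers differ in details.

```python
import numpy as np

def dlp_step(x, W, b, alpha, rng):
    """One gradient-informed flip-proposal + MH step for x in {-1,+1}^d,
    targeting log pi(x) = 0.5 * x^T W x + b^T x (a hedged sketch)."""
    def flip_logw(y):                       # log proposal weight of flipping each coord
        g = W @ y + b                       # gradient of the continuous extension
        return -g * y - 2.0 / alpha         # first-order gain minus locality penalty
    def log_q(y_from, flips):               # log prob of proposing this flip pattern
        lw = flip_logw(y_from)
        lp_flip = lw - np.logaddexp(0.0, lw)     # log sigmoid(lw)
        lp_stay = -np.logaddexp(0.0, lw)         # log(1 - sigmoid(lw))
        return float(np.sum(np.where(flips, lp_flip, lp_stay)))
    def log_pi(y):
        return 0.5 * y @ W @ y + b @ y
    lw = flip_logw(x)
    flips = np.log(rng.uniform(size=x.shape)) < lw - np.logaddexp(0.0, lw)
    y = np.where(flips, -x, x)               # flip all selected coordinates in parallel
    log_acc = log_pi(y) - log_pi(x) + log_q(y, flips) - log_q(x, flips)
    return y if np.log(rng.uniform()) < log_acc else x

# Hypothetical 2-spin example with W = 0: independent marginals with
# E[x_i] = tanh(b_i), which the chain should reproduce.
rng = np.random.default_rng(4)
W = np.zeros((2, 2))
b = np.array([0.5, -0.5])
x = np.ones(2)
xs = []
for _ in range(20000):
    x = dlp_step(x, W, b, alpha=1.0, rng=rng)
    xs.append(x.copy())
xs = np.array(xs)[2000:]
```

Because all coordinates are proposed in parallel, one step can change many variables at once, which is the source of the global-move advantage over single-site Gibbs updates.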
5. Auxiliary Variable, Ensemble, and Birth–Death Dynamics
Several modern Langevin approaches leverage augmentations or couplings for improved efficiency:
- Langevin Birth–Death Dynamics (LBD): Augments particle-based Langevin dynamics with a birth–death process, using kernel-smoothed mass-redistribution to accelerate convergence and ensure rapid recovery of multimodal measures. Preconditioning via Fisher information and adaptation to non-Euclidean spaces are integral (Leviyev et al., 2 Sep 2025).
- Ensemble Samplers: Interacting-particle schemes, including the Ensemble Kalman Sampler (EKS) and affine-invariant Langevin dynamics (ALDI), utilize empirical covariance scaling and adaptive ensemble enrichment for efficient Bayesian inference in high-dimensional inverse problems. Homotopy-based interpolation of the potential enables robust convergence even in multimodal or severely ill-posed cases (Eigel et al., 2022).
- Sequential Controlled Langevin Diffusions (SCLD): Unifies sequential Monte Carlo (SMC) with diffusion-based learned drifts, promoting robustness in multimodal, high-dimensional regimes by annealing and variance-minimizing control over an ensemble of continuous-time Langevin diffusions, with path-space resampling and MCMC refinement (Chen et al., 2024).
- Stochastic Interpolant–Flow Approaches: Sampling by first constructing a probability-flow ODE between prior and target via a stochastic interpolant, with Langevin samplers used for velocity and initialization estimation at all stages (Duan et al., 13 Jan 2026).
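The covariance-preconditioned ensemble update behind EKS/ALDI can be sketched as follows. This is a hedged Euler discretization after the ALDI construction: the $(d+1)/N$ finite-ensemble drift correction follows that line of work, but the discretization and test setup are our own.

```python
import numpy as np

def aldi_step(X, grad_U, h, rng):
    """One Euler step of an ALDI-style affine-invariant ensemble Langevin
    update (a sketch). X has shape (N, d); each particle is preconditioned
    by the empirical covariance C of the ensemble."""
    N, d = X.shape
    m = X.mean(axis=0)
    Xc = (X - m) / np.sqrt(N)                 # scaled anchors: C = Xc.T @ Xc
    C = Xc.T @ Xc                             # empirical covariance preconditioner
    G = np.array([grad_U(p) for p in X])      # gradient at every particle
    drift = -G @ C + (d + 1) / N * (X - m)    # preconditioned drift + finite-N correction
    noise = np.sqrt(2 * h) * rng.standard_normal((N, N)) @ Xc  # per-particle cov 2h*C
    return X + h * drift + noise

# Hypothetical standard-Gaussian target: the pooled ensemble should settle
# near zero mean and unit marginal variances.
rng = np.random.default_rng(5)
grad_U = lambda p: p
X = rng.standard_normal((100, 2))
pool = []
for k in range(3000):
    X = aldi_step(X, grad_U, h=0.02, rng=rng)
    if k >= 1000:
        pool.append(X.copy())
pool = np.concatenate(pool)
```

Realizing the noise through the centered anchor matrix `Xc` avoids forming a matrix square root of `C` and is what makes the scheme affine-invariant: rescaling the target rescales `C` and the noise identically.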
6. Rigorous Convergence Guarantees and Complexity
Langevin-type samplers have seen extensive development in nonasymptotic convergence theory:
- Log-Concave Case: For strongly convex and smooth targets, overdamped Langevin samplers achieve iteration complexity $\widetilde{O}(d/\varepsilon^2)$ for Wasserstein or KL accuracy $\varepsilon$, while underdamped samplers improve this to $\widetilde{O}(\sqrt{d}/\varepsilon)$, matching the non-private state of the art. Bounds on Rényi divergence enable strong differential privacy guarantees via composition, crucial for private Bayesian inference (Ganesh et al., 2020, Schuh et al., 2024, Lytras et al., 15 Sep 2025).
- Non-Convex and Superlinear Drift: Properly tamed underdamped Langevin samplers with Lipschitz or superlinear gradients achieve contractivity (exponential convergence) in Wasserstein distance, as well as explicit, dimension-dependent step size and complexity bounds (Lytras et al., 15 Sep 2025, Schuh et al., 2024).
- Bias–Variance Tradeoff and Step Sizing: Explicit expansions for discretization bias and higher-order variance shifts guide the selection of step size and tuning of irreversible perturbation strength, balancing improved mixing against discretization and stability constraints (Duncan et al., 2017, Duncan et al., 2015).
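The bias–step-size tradeoff can be made fully explicit in one dimension: for the Gaussian target $U(x) = x^2/2$, the ULA chain is an AR(1) process whose stationary variance is computable in closed form. The sketch below (our own worked example) compares the empirical and predicted variance.

```python
import numpy as np

# For U(x) = x^2/2, the ULA recursion X_{k+1} = (1 - h) X_k + sqrt(2h) * xi_k
# is an AR(1) process whose stationary variance s solves
#   s = (1 - h)^2 s + 2h   =>   s = 1 / (1 - h/2),
# so the variance bias is h/2 + O(h^2): explicit first-order discretization error.
def ula_stationary_var(h, n=200000, seed=0):
    rng = np.random.default_rng(seed)
    x, xs = 0.0, np.empty(n)
    for k in range(n):
        x = (1.0 - h) * x + np.sqrt(2.0 * h) * rng.standard_normal()
        xs[k] = x
    return xs[n // 10:].var()   # discard the first 10% as burn-in

for h in (0.4, 0.2, 0.1):
    print(f"h={h}: empirical {ula_stationary_var(h):.3f}  predicted {1 / (1 - h / 2):.3f}")
```

Halving $h$ halves the bias but doubles the autocorrelation time, which is exactly the tradeoff the expansions above are designed to optimize.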
7. Practical Considerations, Implementation, and Applications
Key practical guidelines and algorithmic features include:
- Integrator Selection: First- and second-order splitting integrators (e.g., BAOAB) allow larger steps and higher stability in underdamped and nonreversible settings, often with $O(h^2)$ bias in the step size $h$ (Duncan et al., 2017).
- Preconditioning: Choice of preconditioner or metric can be made adaptive based on local Hessian, empirical Fisher, or learned transport maps, enabling near-affine invariance and robust performance in highly anisotropic targets (Srinivasan et al., 2024, Zhang et al., 2023, Leviyev et al., 2 Sep 2025).
- Discrete/High-Dimensional/Latent Models: Parallelized and stochastic-gradient implementations of discrete Langevin proposals, ensemble and mean-field couplings, and hybrid methods scale to massive models (deep energy-based image models, large-scale Bayesian neural networks) (Zhang et al., 2022, Eigel et al., 2022).
- Non-Gaussian and Multimodal Problems: Nonreversible and annealed methods robustly recover all modes; birth–death, control-theoretic, and path-space approaches demonstrate state-of-the-art accuracy and efficiency in synthetic and real-world inverse problems (Leviyev et al., 2 Sep 2025, Chen et al., 2024, Eigel et al., 2022).
Empirical studies across benchmarks (e.g., multimodal densities, posterior distributions in gravitational wave inference, spatial point process models) confirm that state-of-the-art Langevin samplers decisively outperform basic reversible or local MCMC in mean square error, mixing speed, and effective sample size for equivalent computational budget (Duncan et al., 2015, Duncan et al., 2017, Leviyev et al., 2 Sep 2025, Chen et al., 2024).
References:
- "Variance Reduction using Nonreversible Langevin Samplers" (Duncan et al., 2015)
- "Nonreversible Langevin Samplers: Splitting Schemes, Analysis and Implementation" (Duncan et al., 2017)
- "Variance reduction for irreversible Langevin samplers and diffusion on graphs" (Rey-Bellet et al., 2014)
- "Irreversible Langevin samplers and variance reduction: a large deviation approach" (Rey-Bellet et al., 2014)
- "Analysis of multiscale integrators for multiple attractors and irreversible Langevin samplers" (Lu et al., 2016)
- "Efficient Bayesian Sampling with Langevin Birth-Death Dynamics" (Leviyev et al., 2 Sep 2025)
- "Transport map unadjusted Langevin algorithms: learning and discretizing perturbed samplers" (Zhang et al., 2023)
- "High-accuracy sampling from constrained spaces with the Metropolis-adjusted Preconditioned Langevin Algorithm" (Srinivasan et al., 2024)
- "Adaptive step size selection for Hessian-based manifold Langevin samplers" (Kleppe, 2015)
- "Contractive kinetic Langevin samplers beyond global Lipschitz continuity" (Lytras et al., 15 Sep 2025)
- "Convergence of kinetic Langevin samplers for non-convex potentials" (Schuh et al., 2024)
- "Heavy-tailed Sampling via Transformed Unadjusted Langevin Algorithm" (He et al., 2022)
- "A Langevin-like Sampler for Discrete Distributions" (Zhang et al., 2022)
- "Discrete Langevin Sampler via Wasserstein Gradient Flow" (Sun et al., 2022)
- "Less interaction with forward models in Langevin dynamics" (Eigel et al., 2022)
- "Sequential Controlled Langevin Diffusions" (Chen et al., 2024)
- "Sampling via Stochastic Interpolants by Langevin-based Velocity and Initialization Estimation in Flow ODEs" (Duan et al., 13 Jan 2026)
- "Faster Differentially Private Samplers via Rényi Divergence Analysis of Discretized Langevin MCMC" (Ganesh et al., 2020)
- "Exact Langevin Dynamics with Stochastic Gradients" (Garriga-Alonso et al., 2021)
- "Using Perturbed Underdamped Langevin Dynamics to Efficiently Sample from Probability Distributions" (Duncan et al., 2017)