
Affine Invariant Langevin Dynamics (ALDI)

Updated 8 January 2026
  • Affine Invariant Langevin Dynamics (ALDI) is a framework using interacting particle SDEs with affine invariance to sample high-dimensional, anisotropic distributions effectively.
  • It employs ensemble covariance preconditioning to adapt both drift and diffusion, supporting gradient-based and gradient-free approaches in Bayesian inference and optimal design.
  • The method guarantees ergodicity with robust mixing properties, making it beneficial for inverse problems, rare-event estimation, and efficient experimental design.

Affine Invariant Langevin Dynamics (ALDI) is a class of interacting particle stochastic differential equations (SDEs) designed for sampling from high-dimensional, often anisotropic, target distributions with robust mixing properties and explicit invariance under affine transformations. Incorporating ensemble-based empirical covariance preconditioning, ALDI adapts both drift and diffusion to the evolving geometry of the sampled ensemble. ALDI admits gradient-based and gradient-free (derivative-free) formulations, making the methodology suitable for Bayesian inference, optimal experimental design, inverse problems, and rare-event estimation in settings where gradients of the log-target are unavailable or computationally prohibitive. The theory guarantees ergodicity and invariance under affine mappings provided the ensemble size exceeds the ambient dimension by at least one.

1. Mathematical Formulation

Let $\pi(\theta) \propto \exp(-U(\theta))$ define the target posterior, where $U$ is a (possibly non-convex, non-smooth) potential. Standard overdamped Langevin dynamics employ

$$d\theta_t = -\nabla U(\theta_t)\, dt + \sqrt{2}\, dW_t,$$

which exhibits slow mixing for ill-conditioned or anisotropic targets and requires explicit gradients.

ALDI generalizes this mechanism by evolving an ensemble $\{\theta^{(j)}_t \in \mathbb{R}^d\}_{j=1}^J$ via the empirically estimated ensemble mean and covariance,

$$\overline\theta_t = \frac{1}{J} \sum_{j=1}^J \theta^{(j)}_t, \qquad C_t = \frac{1}{J} \sum_{j=1}^J \big(\theta^{(j)}_t - \overline\theta_t\big)\big(\theta^{(j)}_t - \overline\theta_t\big)^T.$$

The canonical ALDI SDE reads

$$d\theta_t^{(j)} = -C_t\, \nabla U\big(\theta_t^{(j)}\big)\, dt \;+\; \frac{d+1}{J}\big(\theta_t^{(j)} - \overline\theta_t\big)\, dt \;+\; \sqrt{2}\, C_t^{1/2}\, dW_t^{(j)},$$

for independent $d$-dimensional Brownian motions $W_t^{(j)}$. The $(d+1)/J$ “repulsion drift” restores affine invariance and prevents ensemble collapse.

Discrete Euler–Maruyama time-stepping yields

$$\theta_{n+1}^{(j)} = \theta_n^{(j)} - \epsilon\, C_n\, \nabla U\big(\theta_n^{(j)}\big) + \epsilon\, \frac{d+1}{J}\big(\theta_n^{(j)} - \overline\theta_n\big) + \sqrt{2\epsilon}\, C_n^{1/2}\, \xi_n^{(j)},$$

with $\xi_n^{(j)} \sim N(0, I_d)$. The non-symmetric square root $C_n^{1/2}$ may be computed directly from the ensemble deviations, avoiding costly matrix factorizations (Garbuno-Inigo et al., 2019, Gruhlke et al., 17 Apr 2025).
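The deviation-based square root is simple to realize in code. The following sketch (illustrative, not library code) stacks the centered particles into a $d \times J$ matrix and divides by $\sqrt{J}$; the resulting non-symmetric factor reproduces the empirical covariance exactly, with no eigendecomposition or Cholesky factorization.

```python
import numpy as np

rng = np.random.default_rng(0)
J, d = 8, 3                                  # ensemble larger than dimension
theta = rng.standard_normal((J, d))          # ensemble, one particle per row

mean = theta.mean(axis=0)
dev = theta - mean                           # centered particles, shape (J, d)
C = dev.T @ dev / J                          # empirical covariance, shape (d, d)

# Non-symmetric square root assembled directly from the deviations:
# root @ root.T equals C exactly, by construction.
root = dev.T / np.sqrt(J)                    # shape (d, J)
assert np.allclose(root @ root.T, C)
```

Note that with this factor the diffusion term drives each particle with a $J$-dimensional noise vector (`root @ xi` for `xi` in $\mathbb{R}^J$) rather than a $d$-dimensional one.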

Gradient-free ALDI replaces $C_t \nabla U$ with an ensemble Kalman-style cross-covariance update,

$$C_{\theta, G}\, \Gamma^{-1} \big(y - G(\theta)\big),$$

allowing sampling when only the forward model $G(\theta)$ is available (Gruhlke et al., 17 Apr 2025, Garbuno-Inigo et al., 2019, Chakraborty et al., 31 Dec 2025).
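As a hedged illustration of this drift (variable names and the linear test model are assumptions, not from the cited papers): the cross-covariance $C_{\theta,G}$ is assembled from forward-model evaluations alone, and for a linear model $G(\theta) = A\theta$ it reproduces the gradient drift $C\,\nabla U$ of the Gaussian misfit $U(\theta) = \tfrac12 (y - A\theta)^T \Gamma^{-1} (y - A\theta)$ exactly.

```python
import numpy as np

def kalman_drift(theta, G, y, Gamma_inv):
    """Gradient-free drift C_{theta,G} Gamma^{-1} (y - G(theta_j)), one row per particle."""
    J = theta.shape[0]
    Gs = np.array([G(t) for t in theta])                       # (J, k) forward evaluations
    Cxg = (theta - theta.mean(0)).T @ (Gs - Gs.mean(0)) / J    # (d, k) cross-covariance
    return (y - Gs) @ Gamma_inv.T @ Cxg.T                      # (J, d) drift per particle

rng = np.random.default_rng(0)
J, d, k = 10, 3, 2
theta = rng.standard_normal((J, d))
A = rng.standard_normal((k, d))
y = rng.standard_normal(k)
Gamma_inv = np.eye(k)

# For linear G the Kalman drift equals C A^T Gamma^{-1} (y - A theta_j),
# i.e. -C grad U for the Gaussian misfit potential above.
dev = theta - theta.mean(0)
C = dev.T @ dev / J
grad_drift = np.array([C @ A.T @ Gamma_inv @ (y - A @ t) for t in theta])
assert np.allclose(kalman_drift(theta, lambda t: A @ t, y, Gamma_inv), grad_drift)
```

For nonlinear $G$ the identity holds only approximately, which is the statistical-linearization price of going derivative-free.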

2. Affine Invariance Properties

A defining feature of ALDI is invariance under all invertible affine transformations $\theta \mapsto A\theta + b$, $A \in \mathbb{R}^{d \times d}$. The empirical covariance and drift/diffusion terms transform so that the SDE retains its form in the new basis:

$$C_\theta \mapsto A C_\theta A^T, \qquad \nabla_\theta U(\theta) \mapsto A^T \nabla_\vartheta U_A(\vartheta),$$

and both drift and diffusion are automatically re-scaled. The repulsion drift $(d+1)/J\,(\theta - \overline\theta)$ is manifestly affine-covariant.

As a consequence, ALDI achieves mixing rates independent of linear re-scalings, rotations, or anisotropy in $\pi(\theta)$, obviating the manual preconditioner tuning required by standard Langevin or MALA samplers (Gruhlke et al., 17 Apr 2025, Garbuno-Inigo et al., 2019, Beh et al., 25 Jun 2025).
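The covariance transformation rule is easy to verify numerically. This short sketch maps every particle through $\theta \mapsto A\theta + b$ and checks that the empirical covariance transforms to $A C_\theta A^T$ exactly, which is what makes the preconditioned drift and diffusion re-scale automatically.

```python
import numpy as np

def ens_cov(x):
    """Empirical ensemble covariance with 1/J normalization."""
    dev = x - x.mean(axis=0)
    return dev.T @ dev / x.shape[0]

rng = np.random.default_rng(0)
theta = rng.standard_normal((8, 3))      # ensemble: J=8 particles in R^3
A = rng.standard_normal((3, 3))          # generic (invertible) linear part
b = rng.standard_normal(3)               # shift

phi = theta @ A.T + b                    # every particle mapped by theta -> A theta + b

# Covariant transformation: C_phi = A C_theta A^T (exact, not just in expectation).
assert np.allclose(ens_cov(phi), A @ ens_cov(theta) @ A.T)
```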

3. Theoretical Properties and Consistency

Provided the empirical covariance is initially strictly positive-definite and $J > d+1$, ALDI admits unique global strong solutions and its ensemble law converges in total variation to the product target measure $\pi^{\otimes J}$ (Garbuno-Inigo et al., 2019, Beh et al., 25 Jun 2025, Chakraborty et al., 31 Dec 2025). Key results include:

  • Non-degeneracy: The ensemble covariance remains positive-definite for all $t$, enforced by the finite-$J$ correction drift.
  • Ergodicity: Under quadratic growth and smoothness conditions on $U$, the process is ergodic with a uniquely determined stationary distribution.
  • Affine invariance: Proven for both the continuous and the discretized SDE systems.
  • Small-noise consistency: For rare-event Bayesian posteriors, the ALDI-trajectory samples converge to the prior restricted to the failure set as the effective observation noise $R \to 0$, independently of the smoothing parameter $\delta$ (Chakraborty et al., 31 Dec 2025).

In the mean-field limit $J \to \infty$, the ALDI system induces a nonlinear gradient flow of the KL divergence with respect to a Wasserstein-type metric on the space of probability measures:

$$\partial_t \pi_t = \nabla \cdot \left( \pi_t\, C(\pi_t)\, \nabla \frac{\delta\, \mathrm{KL}(\pi_t \mid \pi_*)}{\delta \pi_t} \right),$$

where the covariance operator $C(\pi_t)$ is a function of the current empirical law (Garbuno-Inigo et al., 2019).

4. Algorithmic Implementation

Implementation is based on ensemble propagation via the Euler–Maruyama update, as detailed in the following generic pseudocode:

Initialize θ_0^{(j)} ∼ prior, j = 1,…,J
For n = 0 to N-1:
    Compute ensemble mean θ̄_n and covariance C_n
    For each j = 1,…,J:
        Draw ξ_n^{(j)} ∼ N(0, I_d)
        Update
        θ_{n+1}^{(j)} = θ_n^{(j)}
                        - ε C_n ∇U(θ_n^{(j)})
                        + ε (d+1)/J (θ_n^{(j)} - θ̄_n)
                        + sqrt(2ε) C_n^{1/2} ξ_n^{(j)}
End
Output θ_N^{(j)} as approximate samples from π
For derivative-free applications, substitute the ensemble Kalman cross-covariance formula for $C_n\, \nabla U$.
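A minimal runnable NumPy version of this pseudocode might look as follows. It is an illustrative sketch, not a reference implementation: the target, step size, and ensemble size are assumed values, and it uses the deviation-based square root, so each particle is driven by $J$-dimensional noise.

```python
import numpy as np

def aldi_sample(grad_U, theta0, eps=1e-2, n_steps=2000, seed=0):
    """Euler-Maruyama ALDI: preconditioned gradient, repulsion drift, ensemble noise."""
    rng = np.random.default_rng(seed)
    theta = theta0.copy()                        # (J, d) ensemble
    J, d = theta.shape
    for _ in range(n_steps):
        mean = theta.mean(axis=0)
        dev = theta - mean                       # (J, d) deviations
        C = dev.T @ dev / J                      # (d, d) ensemble covariance
        root = dev.T / np.sqrt(J)                # (d, J): root @ root.T == C
        grads = np.array([grad_U(t) for t in theta])
        drift = -grads @ C + (d + 1) / J * dev   # -C grad U (C symmetric) + repulsion
        xi = rng.standard_normal((J, J))         # one R^J noise vector per particle
        theta = theta + eps * drift + np.sqrt(2 * eps) * xi @ root.T
    return theta

# Hypothetical anisotropic Gaussian target: U(t) = 0.5 t^T Sigma_inv t.
Sigma_inv = np.diag([0.25, 4.0])
grad_U = lambda t: Sigma_inv @ t
theta0 = np.random.default_rng(1).standard_normal((10, 2))   # J=10 > d+1=3
samples = aldi_sample(grad_U, theta0)
```

Because `C` is symmetric, `grads @ C` computes $C\,\nabla U(\theta^{(j)})$ row-wise; the repulsion and noise terms likewise vectorize over the ensemble.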

Computational costs scale as $O(J d^2)$ per ensemble step for covariance estimation, with $O(d^3)$ for square-root computation, assuming no low-rank or diagonal approximation. In practice, $J \sim 2$–$5\,d$ is sufficient (Gruhlke et al., 17 Apr 2025, Eigel et al., 2022). Stability requires $\epsilon \lesssim 1/\lambda_{\max}(C_n\, \mathrm{Hess}\,U)$, but ALDI achieves greater numerical stability than isotropic schemes due to adaptive preconditioning.
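The stability bound can be probed cheaply. This hypothetical two-dimensional example (the Hessian and covariance values are made up for illustration) computes the largest admissible step size from the spectrum of $C_n\, \mathrm{Hess}\,U$.

```python
import numpy as np

# Hypothetical Gaussian target U(t) = 0.5 t^T H t, so Hess U = H everywhere.
H = np.diag([4.0, 0.25])                 # anisotropic curvature (assumed values)
C = np.diag([0.3, 2.0])                  # current ensemble covariance (assumed values)

# eps should stay below 1/lambda_max(C_n Hess U) for the preconditioned
# drift to remain stable under Euler-Maruyama discretization.
lam_max = np.max(np.abs(np.linalg.eigvals(C @ H)))   # C @ H = diag(1.2, 0.5)
eps_max = 1.0 / lam_max                              # = 1/1.2
```

Note how the preconditioning helps: wherever the ensemble covariance approximates the inverse Hessian, $C_n\,\mathrm{Hess}\,U \approx I$ and the bound stops depending on the target's conditioning.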

Adaptive ensemble enrichment strategies (e.g., LIDL) successively enlarge the particle set using random kicks, diffusion-only propagation, or transport maps, reducing costly forward evaluations for Bayesian inverse problems and achieving consistency in linear-Gaussian settings (Eigel et al., 2022).

5. Applications in Bayesian Inverse Problems and Rare Event Estimation

ALDI is extensively applied in Bayesian inference for complex forward models (e.g., PDE-constrained inverse problems) and in rare-event probability estimation, where the optimal importance sampling density is often non-differentiable (Beh et al., 25 Jun 2025, Chakraborty et al., 31 Dec 2025). For demonstration:

  • Bayesian Experimental Design (BOED): ALDI provides efficient, derivative-free posterior sampling for utility estimation, enabling scalable information-driven design in high dimensions (Gruhlke et al., 17 Apr 2025).
  • Rare-event sampling: ALDI is used to sample smoothed zero-variance densities derived by logistic or ramp approximations of limit-state functions, yielding proposal densities for importance sampling and error bounds for rare-event probabilities. Trade-offs in smoothing and discretization bias are rigorously characterized, with ALDI-based samplers outperforming MH or HMC subset simulation frameworks in high-dimensional settings (Beh et al., 25 Jun 2025, Chakraborty et al., 31 Dec 2025).
  • PDE-based Bayesian inverse problems: Both gradient-based and gradient-free ALDI samplers produce comparable bias and spread, with the gradient-free variant requiring substantially fewer gradient calls for large parameter dimensions (Garbuno-Inigo et al., 2019).

6. Numerical Performance and Extensions

Empirical studies confirm ALDI’s robust mixing and accuracy in linear Gaussian, nonlinear, and multimodal posteriors:

| Test Case | Dimensionality | Ensemble Size | Key Finding |
| --- | --- | --- | --- |
| Linear Gaussian | $d \leq 256$ | $J = 50$–$256$ | Posterior recovered; rapid mixing |
| PDE-based Darcy flow | $d = 50, 101$ | $J = 25$–$200$ | Gradient-free ALDI matches bias of gradient-based |
| Rare-event hyperplane | $d = 100$ | $M = 50$ | nRMSE $\approx 0.18$; outperforms MH/HMC-SuS |
| Atmospheric blocking ODE | nonlinear | $J = 10^2$–$10^3$ | Gaussian-mixture IS achieves variance reduction |

Increasing ensemble size improves accuracy and empirical consistency, and key metrics such as the Sinkhorn divergence decay as $J^{-1/2}$. Homotopy extensions and enrichment schemes address multimodality and reduce forward-model costs by up to $50\%$ compared to full ALDI runs (Eigel et al., 2022).

ALDI’s primary limitations are the computational cost of covariance square-root updates in very high dimensions and imperfect ergodicity when $J \leq d+1$ or in gradient-free implementations with complex forward models. Discretization bias must be controlled with small step sizes; mixture-based importance sampling can overfit with small $J$; and gradient-free ergodicity is not yet theoretically established (Chakraborty et al., 31 Dec 2025). ALDI is best suited to moderate-dimensional problems with strong anisotropy and expensive or unavailable gradients, and as a geometry-probing phase for proposal density construction.

Recommended applications include Bayesian experimental design, rare-event sampling near failure manifolds, PDE inference with black-box forward models, and scenarios requiring adaptive proposal distributions conditioned on informative subspaces. In all cases, ALDI’s affine invariance ensures robust performance under challenging geometric conditions.
