Affine-Invariant Ensemble MCMC
- Affine-Invariant Ensemble MCMC is a sampling method that maintains an ensemble of walkers and whose dynamics are invariant under any invertible affine transformation of the target.
- The algorithm employs stretch moves and other variants to efficiently explore high-dimensional, correlated, or anisotropic target distributions.
- It offers robust performance in Bayesian computations and function-space problems without needing gradient or Hessian information.
An affine-invariant ensemble Markov chain Monte Carlo (MCMC) algorithm is a class of MCMC methods that maintains an ensemble of parallel “walkers”, with proposal moves and acceptance mechanisms designed so that the sampler’s behavior is unaffected by any invertible affine transformation of the target density. These methods have become canonical tools for efficiently sampling high-dimensional, strongly correlated, or highly anisotropic probability distributions, particularly when access to gradients or Hessians is unavailable. Their defining property is affine invariance: all proposal statistics and acceptance ratios transform consistently under $x \mapsto Ax + b$ for invertible $A$ and arbitrary vector $b$.
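To see why this property matters, note that an anisotropic Gaussian is just an isotropic one viewed through an affine map, so an affine-invariant sampler performs identically on both. A toy check (the scales $10$ and $1$ are illustrative choices, not from the source):

```python
# Anisotropic Gaussian: std 10 along x0, std 1 along x1 (illustrative scales).
def log_pi(x):
    return -0.5 * ((x[0] / 10.0) ** 2 + x[1] ** 2)

# Whitening affine map A = diag(1/10, 1): y = A x.
def whiten(x):
    return [x[0] / 10.0, x[1]]

# The transformed density pi_A(y) ∝ pi(A^{-1} y) is the standard normal
# (the constant Jacobian factor does not depend on position).
def log_pi_whitened(y):
    return -0.5 * (y[0] ** 2 + y[1] ** 2)

x = [25.0, -0.7]
assert abs(log_pi(x) - log_pi_whitened(whiten(x))) < 1e-12
```

An affine-invariant sampler run on `log_pi` behaves exactly as it would on the whitened standard normal, with no covariance estimation or preconditioning supplied by the user.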
1. Principles of Affine-Invariant Ensemble MCMC
The prototypical affine-invariant ensemble sampler is the Goodman–Weare (GW) “stretch move” (Hou et al., 2011, Coullon et al., 2020, Foreman-Mackey et al., 2012, Foreman-Mackey et al., 2019, Huijser et al., 2015). Suppose the target is a density $\pi$ on $\mathbb{R}^d$. An ensemble of $L$ walkers $(x_1, \dots, x_L)$ is jointly updated so as to preserve the product law $\prod_{k=1}^{L} \pi(x_k)$. For each walker $x_k$:
- Pick another walker $x_j$, $j \neq k$, uniformly at random.
- Draw a stretch factor $z$ from the density $g(z) \propto 1/\sqrt{z}$ on $[1/a, a]$.
- Propose $y = x_j + z\,(x_k - x_j)$.
- Accept with probability $\min\{1,\; z^{d-1}\,\pi(y)/\pi(x_k)\}$.
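The update above can be sketched in a few lines of Python. This is a minimal illustration, not a production sampler; the log-target `log_pi`, ensemble size, and the default stretch scale `a = 2.0` follow the standard GW conventions:

```python
import math
import random

def stretch_move(walkers, log_pi, a=2.0, rng=random):
    """One sweep of the Goodman-Weare stretch move over all walkers.

    walkers: list of points in R^d (each a list of floats); log_pi: log target.
    """
    d = len(walkers[0])
    n = len(walkers)
    for k in range(n):
        # Pick a partner walker j != k uniformly at random.
        j = rng.randrange(n - 1)
        if j >= k:
            j += 1
        xj, xk = walkers[j], walkers[k]
        # Draw z with density g(z) ∝ 1/sqrt(z) on [1/a, a] via inverse CDF.
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
        # Stretch proposal y = xj + z (xk - xj).
        y = [xj_i + z * (xk_i - xj_i) for xj_i, xk_i in zip(xj, xk)]
        # Accept with probability min(1, z^{d-1} pi(y)/pi(xk)), in log space.
        log_ratio = (d - 1) * math.log(z) + log_pi(y) - log_pi(xk)
        if math.log(rng.random() + 1e-300) < log_ratio:
            walkers[k] = y
    return walkers
```

The inverse-CDF formula `((a - 1) * u + 1) ** 2 / a` maps $u \sim U(0,1)$ to the $1/\sqrt{z}$ density on $[1/a, a]$; the $z^{d-1}$ factor in the acceptance ratio accounts for the proposal's volume change along the line through $x_j$ and $x_k$.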
Affine invariance follows because under any invertible , the distribution and dynamics of proposals and acceptances are unchanged; the proposal and acceptance Jacobian terms exactly cancel (Coullon et al., 2020, Hou et al., 2011, Foreman-Mackey et al., 2012, Foreman-Mackey et al., 2019).
2. Algorithmic Variants and Extensions
Multiple affine-invariant ensemble algorithms have been proposed, often differing by choice of move type, adaptation scheme, or by targeting broader problem classes:
- Stretch move (AIES): The original GW algorithm as summarized above (Coullon et al., 2020, Hou et al., 2011, Foreman-Mackey et al., 2012).
- Ensemble Slice Sampler: Proposes walker moves using adaptive, affine-invariant slice sampling directions determined by differences or covariance among walkers (Karamanis et al., 2020). Parallel updates across walker subgroups and length-scale adaptation yield both affine invariance and robust mixing.
- Penalised t-walk: Extends the affine-invariant t-walk with specialized “penalty” moves to cross isolated modes, maintaining affine invariance under arbitrary full-rank affine maps (Medina-Aguayo et al., 2020).
- Second-order and interacting Langevin ensemble dynamics: Preconditioned ensemble-based Langevin samplers (e.g., EKHMC and ALDI) perform covariance-adapted diffusive sampling and are provably affine invariant, leveraging ensemble statistics for both drift and stochastic terms (Liu et al., 2022, Beh et al., 25 Jun 2025).
- Infinite-dimensional generalizations: The functional ensemble sampler (FES) applies the AIES move on a fixed KL-truncated subspace and the pCN move on the infinite-dimensional orthogonal complement, achieving mesh-independent, gradient-free sampling in infinite-dimensional settings (Coullon et al., 2020). Other hybrid methods apply subspace-projected or covariance-inflated proposals to combine affine invariance and dimension-robustness (Dunlop et al., 2022).
3. Affine-Invariance: Theory and Proofs
For a proposal kernel $K_\pi(x, \mathrm{d}y)$ and target density $\pi$, affine invariance requires
$$
K_{\pi_{A,b}}(Ax + b,\, A\,\mathrm{d}y + b) = K_{\pi}(x, \mathrm{d}y)
$$
for any invertible $A$ and vector $b$, with the transformed target $\pi_{A,b}(y) \propto \pi\big(A^{-1}(y - b)\big)$. For the stretch move, the proposal
$$
y = x_j + z\,(x_k - x_j)
$$
transforms under $x \mapsto Ax + b$ as
$$
Ay + b = (Ax_j + b) + z\,\big((Ax_k + b) - (Ax_j + b)\big),
$$
i.e., into a stretch proposal of the same form between the transformed walkers, with the same $z$. Since the proposal and acceptance ratios are unchanged under the affine map, the process is invariant (Coullon et al., 2020, Foreman-Mackey et al., 2019, Hou et al., 2011).
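This equivariance of the stretch proposal can be checked numerically: mapping a proposal through a fixed affine map gives exactly the proposal formed from the mapped walkers with the same $z$. A small sanity check (the 2×2 map `A`, shift `b`, and walker positions are arbitrary illustrative values):

```python
def affine(A, b, x):
    """Apply x -> A x + b for a 2x2 matrix A (list of rows) and vector b."""
    return [A[0][0] * x[0] + A[0][1] * x[1] + b[0],
            A[1][0] * x[0] + A[1][1] * x[1] + b[1]]

def stretch_proposal(xj, xk, z):
    """GW stretch proposal y = xj + z (xk - xj)."""
    return [xj_i + z * (xk_i - xj_i) for xj_i, xk_i in zip(xj, xk)]

# Arbitrary invertible A, shift b, walkers, and stretch factor (illustrative).
A, b = [[2.0, 1.0], [0.5, 3.0]], [1.0, -2.0]
xj, xk, z = [0.3, -1.2], [2.0, 0.7], 1.7

lhs = affine(A, b, stretch_proposal(xj, xk, z))                 # map the proposal
rhs = stretch_proposal(affine(A, b, xj), affine(A, b, xk), z)   # propose from mapped walkers
assert all(abs(l - r) < 1e-12 for l, r in zip(lhs, rhs))
```

Because the map is linear in the walker positions, the identity holds exactly (up to floating-point rounding) for every $A$, $b$, and $z$.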
Generalizations (e.g., to covariance-preconditioned Langevin SDEs) use similar arguments: the drift and noise are adapted by the ensemble covariance $C$, which transforms as $C \mapsto A C A^{\top}$ under $x \mapsto Ax + b$, ensuring affine invariance of the Fokker–Planck operator (Liu et al., 2022, Beh et al., 25 Jun 2025).
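The covariance transformation rule is easy to verify on a small ensemble. A pure-Python sketch (the 2×2 map `A` and the sample points are arbitrary illustrative values):

```python
def cov2(points):
    """Sample covariance (2x2) of a list of 2-D points."""
    n = len(points)
    m = [sum(p[i] for p in points) / n for i in (0, 1)]
    c = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        for i in (0, 1):
            for j in (0, 1):
                c[i][j] += (p[i] - m[i]) * (p[j] - m[j]) / (n - 1)
    return c

def apply_map(A, b, p):
    """Apply p -> A p + b componentwise for a 2x2 matrix A."""
    return [A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1]]

A, b = [[1.5, 0.2], [-0.3, 2.0]], [4.0, -1.0]
pts = [[0.0, 1.0], [2.0, -1.0], [1.0, 3.0], [-2.0, 0.5]]

C = cov2(pts)
C_mapped = cov2([apply_map(A, b, p) for p in pts])
# Expected: A C A^T (the shift b drops out of the covariance).
ACAt = [[sum(A[i][k] * C[k][l] * A[j][l] for k in (0, 1) for l in (0, 1))
         for j in (0, 1)] for i in (0, 1)]
assert all(abs(C_mapped[i][j] - ACAt[i][j]) < 1e-10
           for i in (0, 1) for j in (0, 1))
```

This is why covariance-preconditioned dynamics inherit affine invariance: the preconditioner itself transforms in exactly the way needed to cancel the map.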
4. Applications, Scalability, and Limitations
Affine-invariant ensemble samplers are widely used in astrophysics, Bayesian inverse problems, and high-dimensional data analysis due to their ability to handle strongly anisotropic targets without explicit covariance estimation (Hou et al., 2011, Foreman-Mackey et al., 2012, Coullon et al., 2020).
In infinite-dimensional or function-space problems, the FES algorithm applies AIES to a fixed finite subspace (chosen via Karhunen–Loève expansion) and pCN or other schemes for the complement, ensuring mesh-independent mixing rates and robust performance as resolution increases (Coullon et al., 2020). Subspace-adjusted hybrid samplers extend these ideas to adaptively select the most informative modes for affine-invariant moves (Dunlop et al., 2022).
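The pCN component applied on the complement can be sketched for a single coordinate. The key property is that the proposal $y = \sqrt{1-\beta^2}\,x + \beta\,\xi$ with $\xi \sim N(0, C)$ preserves the Gaussian prior exactly, so the acceptance ratio involves only the likelihood. A minimal scalar illustration (`beta` and `prior_std` are illustrative defaults, not values from the source):

```python
import math
import random

def pcn_step(x, log_like, beta=0.2, prior_std=1.0, rng=random):
    """One pCN update for a scalar coordinate with prior N(0, prior_std^2).

    The proposal preserves the prior, so acceptance uses only the likelihood.
    """
    y = math.sqrt(1.0 - beta ** 2) * x + beta * rng.gauss(0.0, prior_std)
    if math.log(rng.random() + 1e-300) < log_like(y) - log_like(x):
        return y
    return x
```

With a flat likelihood the move accepts always and the chain samples the prior exactly; because the proposal is defined relative to the prior measure rather than the discretization, the acceptance rate does not degenerate as the mesh is refined.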
Empirically, for moderate dimension $d$, ensemble stretch-move samplers offer order-of-magnitude efficiency gains (as measured by integrated autocorrelation time, IAT) over random-walk or non-adaptive slice samplers, and are robust to affine ill-conditioning (Karamanis et al., 2020, Foreman-Mackey et al., 2012). In very high dimensions, however, AIES suffers from ensemble collapse (rapidly shrinking walker variance and inadequate exploration), leading to biased variance estimates and slow convergence, as rigorously characterized by Huijser et al. (2015). Effective ensemble sizes shrink, and long burn-in followed by slow “re-expansion” phases can degrade sampling unless supplemented by external regularization or more sophisticated dynamics.
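The IAT used as the efficiency metric above can be estimated with a simple windowed autocorrelation sum. A basic sketch using Sokal-style windowing (production codes such as emcee use more careful automatic windowing; the cutoff constant `c=5` is a common illustrative choice):

```python
def iat(chain, c=5):
    """Estimate integrated autocorrelation time of a scalar chain:
    tau = 1 + 2 * sum_{t>=1} rho(t), truncated once the lag exceeds c * tau.
    """
    n = len(chain)
    m = sum(chain) / n
    var = sum((x - m) ** 2 for x in chain) / n
    if var == 0:
        return 1.0
    tau = 1.0
    for t in range(1, n):
        # Lag-t sample autocorrelation rho(t).
        rho = sum((chain[i] - m) * (chain[i + t] - m)
                  for i in range(n - t)) / (n * var)
        tau += 2.0 * rho
        if t >= c * tau:  # stop once the window exceeds c * current estimate
            break
    return max(tau, 1.0)
```

For an AR(1) chain with coefficient $\phi$, the true value is $(1+\phi)/(1-\phi)$, which makes a convenient check; an effectively independent chain should give an estimate close to 1.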
5. Computational Structure and Implementation
A distinctive feature is ensemble parallelism: updates to walkers in one subset are independent given the positions of the complementary ensemble, enabling trivially parallel implementations. In practice, packages such as emcee (Foreman-Mackey et al., 2012, Foreman-Mackey et al., 2019) realize these methods by splitting walkers into two groups, alternately updating each half using the current positions of the other, and exposing a modular interface for multiple affine-invariant and ensemble-adapted moves.
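The two-group scheme can be sketched as follows. This is an illustrative pure-Python version, not emcee's implementation: each walker in the active half reads only the frozen half, so the inner loop is embarrassingly parallel:

```python
import math
import random

def half_sweep(active, frozen, log_pi, a=2.0, rng=random):
    """Stretch-update each walker in `active` using partners from `frozen` only."""
    d = len(active[0])
    for k, xk in enumerate(active):
        xj = frozen[rng.randrange(len(frozen))]        # partner from other half
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a  # g(z) ∝ 1/sqrt(z) on [1/a, a]
        y = [xj_i + z * (xk_i - xj_i) for xj_i, xk_i in zip(xj, xk)]
        log_ratio = (d - 1) * math.log(z) + log_pi(y) - log_pi(xk)
        if math.log(rng.random() + 1e-300) < log_ratio:
            active[k] = y

def ensemble_sweep(walkers, log_pi, a=2.0, rng=random):
    """One full sweep: split the ensemble in half and update each half in turn."""
    h = len(walkers) // 2
    first, second = walkers[:h], walkers[h:]
    half_sweep(first, second, log_pi, a, rng)   # update half 1 against frozen half 2
    half_sweep(second, first, log_pi, a, rng)   # then half 2 against updated half 1
    walkers[:h], walkers[h:] = first, second
```

Because every update within `half_sweep` depends only on the frozen complementary group, the per-walker proposals and acceptances can be evaluated concurrently (e.g., with a process pool over the likelihood calls) without breaking detailed balance.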
Tuning is minimal: the principal parameters are the stretch-scale parameter $a$ (usually $a = 2$), the ensemble size $L$ (with $L \gtrsim 2d$ recommended), and the dimension of the low-rank subspace in infinite-dimensional settings (Coullon et al., 2020, Foreman-Mackey et al., 2012). Diagnostics rely on ensemble-wide summaries; acceptance rates between $0.2$ and $0.5$ and comparison of empirical variance across independent runs are standard.
6. Benchmarks and Empirical Behavior
The performance of different affine-invariant ensemble algorithms is problem-dependent. Table 1 summarizes selected results from (Karamanis et al., 2020, Huijser et al., 2015, Coullon et al., 2020).
| Algorithm | Setting | IAT / Efficiency Gain | Notes |
|---|---|---|---|
| AIES / GW stretch | AR(1) Gaussian | IAT ≈ 5×10⁴ | Collapse/re-expansion in high dimension |
| Ensemble Slice | AR(1) Gaussian | IAT ≈ 110 (10–20× gain) | Robust, affine-invariant |
| FES | Advection (M=10 KL) | IAT ≈ 1.5×10³ (≈100× faster) | Infinite-dimensional, mesh-robust |
| FES | Langevin path rec. | IAT(log α) ≈ 1.2×10⁴ | Beats hybrid/adaptive Gaussian RW |
Empirical studies confirm that affine-invariant ensemble algorithms perform best for moderate-dimensional, highly anisotropic or correlated targets, and when gradient information is inaccessible or costly. In strongly multimodal posteriors, extensions such as penalized moves or power/parallel tempering (penalised t-walk) enable improved global exploration (Medina-Aguayo et al., 2020). In high-dimensional or infinite-dimensional inverse problems, subspace-based or hybrid ensemble–pCN methods and FES offer mixing times that do not grow with discretization size (Coullon et al., 2020, Dunlop et al., 2022).
7. Current Directions and Theoretical Developments
Active research continues on addressing high-dimensional pathologies, integrating ensemble affine-invariant dynamics with gradient-informed (Langevin or Hamiltonian) moves (Beh et al., 25 Jun 2025, Liu et al., 2022), and extending affine-invariant adaptation to non-Gaussian priors, non-linear inverse problems, and rare-event or importance sampling regimes.
Theory now firmly establishes the dimension-robustness of affine-invariant moves on fixed-dimensional subspaces and in mean-field limits, but highlights the need for caution with naive application in high dimensions unless coupled with hybridization, annealing, or carefully designed subspace updates (Huijser et al., 2015, Coullon et al., 2020, Dunlop et al., 2022).
Affine-invariant ensemble MCMC remains indispensable for “black-box”, large-scale, and function-space Bayesian computations that require tuning-free, derivative-free, and computationally scalable sampling.