Learning the RMALA proposal covariance G0 via reinforcement learning

Develop reinforcement learning techniques to learn the proposal covariance matrix G0 in the Riemannian Metropolis–adjusted Langevin algorithm (RMALA), potentially through reduced-rank covariance parameterizations, to enable effective adaptive tuning of gradient-based MCMC proposals.

Background

The paper introduces Reinforcement Learning Metropolis–Hastings (RLMH) for adaptive tuning of gradient-based MCMC, focusing on RMALA where the preconditioner is taken as G(x) = ε(x)^{-1} G0 and the position-dependent step size ε(·) is learned via reinforcement learning. In all empirical evaluations, the static matrix G0 was fixed using gold-standard samples to avoid confounding.
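To make the role of G0 concrete, the sketch below shows a simplified RMALA-style proposal under the metric G(x) = ε(x)^{-1} G0, omitting the curvature-correction terms of full RMALA. The names here (`eps_fn` standing in for the RL-learned step-size policy, `grad_log_pi`) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def rmala_proposal(x, grad_log_pi, eps_fn, G0_inv, rng):
    """One simplified RMALA-style proposal (curvature terms omitted).

    With metric G(x) = eps(x)^{-1} G0, the proposal covariance is
    G(x)^{-1} = eps(x) * G0^{-1}. `eps_fn` is a hypothetical stand-in
    for the position-dependent step size learned via RL in RLMH.
    """
    eps = eps_fn(x)                        # position-dependent step size
    cov = eps * G0_inv                     # proposal covariance G(x)^{-1}
    mean = x + 0.5 * cov @ grad_log_pi(x)  # preconditioned Langevin drift
    return rng.multivariate_normal(mean, cov)

# Toy usage: standard Gaussian target, so grad log pi(x) = -x.
rng = np.random.default_rng(0)
G0_inv = np.eye(2)
x_new = rmala_proposal(np.zeros(2), lambda x: -x, lambda x: 0.1, G0_inv, rng)
print(x_new.shape)  # (2,)
```

A Metropolis–Hastings accept/reject step using the corresponding forward and reverse proposal densities would follow; it is omitted here for brevity.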

The authors note that choosing or estimating G0 in practice is nontrivial (e.g., via Hessian approximations or adaptive covariance learning) and highlight that learning the proposal covariance structure within an RL framework remains unresolved. They speculate that reduced-rank covariance approximations could help, but initial attempts were unsuccessful, underscoring the need for methods that can robustly learn G0 online within RLMH or related RL-based adaptive MCMC frameworks.
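One way the speculated reduced-rank approach might be parameterized (an assumption for illustration, not the authors' attempted implementation) is G0 = L Lᵀ + diag(softplus(d)), which cuts the number of learnable parameters from O(dim²) to O(dim·k) while guaranteeing positive definiteness:

```python
import numpy as np

def reduced_rank_cov(L, d_raw):
    """Hypothetical reduced-rank parameterization G0 = L L^T + diag(softplus(d_raw)).

    L has shape (dim, k) with k << dim, so only dim*k + dim parameters
    must be learned instead of dim*(dim+1)/2 for a full covariance.
    The softplus keeps the diagonal strictly positive, so G0 is always
    positive definite.
    """
    diag = np.log1p(np.exp(d_raw))  # softplus: strictly positive diagonal
    return L @ L.T + np.diag(diag)

# Rank-2 factor in 10 dimensions: 10*2 + 10 = 30 free parameters
# versus 10*11/2 = 55 for an unconstrained full covariance.
rng = np.random.default_rng(0)
dim, k = 10, 2
L = rng.normal(size=(dim, k))
G0 = reduced_rank_cov(L, np.zeros(dim))
print(np.linalg.eigvalsh(G0).min() > 0)  # True: positive definite by construction
```

In an RL setting, L and d_raw would become part of the policy parameters updated alongside the step-size function; the paper reports that attempts along these lines were unsuccessful, so this is a sketch of the problem setup only.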

References

"For future research, we highlight that learning the proposal covariance structure (i.e. G0 in the setting of RMALA) is an open challenge for RL; we speculate that reduced-rank covariance matrix approximations may be useful here, enabling the difficulty of the learning task to be reduced, but our attempts to implement this (not shown) were unsuccessful."

Harnessing the Power of Reinforcement Learning for Adaptive MCMC  (2507.00671 - Wang et al., 1 Jul 2025) in Section 5 (Discussion)