Bayesian Linear Inverse Problems

Updated 30 January 2026
  • Bayesian linear inverse problems are inference methods that recover unknown parameters from noisy, indirect measurements using probabilistic models and informative priors.
  • They leverage Gaussian priors, posterior contraction rates, and model-specific regularization to stabilize solutions in ill-posed, high-dimensional settings.
  • Advanced computational strategies such as low-rank samplers, transport map MCMC, and model reduction techniques enable scalable and efficient posterior estimation.

Bayesian linear inverse problems are inference tasks in which unknown variables, fields, or parameters are recovered from indirect, typically noisy measurements via a known or partially known forward model, under a probabilistic framework imposed by Bayesian statistics. Ill-posedness is intrinsic when the forward mapping is not stably invertible, which is frequently the case in high-dimensional scientific applications such as medical imaging, tomography, PDE-constrained inference, and machine learning. Bayesian techniques handle these challenges by regularizing with informative priors, quantifying uncertainty via posterior distributions, and often enabling adaptive algorithms that exploit problem geometry or model reduction.

1. Mathematical Formulation

Let $u$ denote the unknown parameter (vector or function), $A$ the linear forward operator, and $y$ the observed data. The standard measurement model is

$$y = A u + \eta,$$

where $\eta$ is additive noise, typically modeled as Gaussian, $\eta \sim \mathcal{N}(0, \Gamma_{\text{noise}})$. A Gaussian prior is placed on $u$, $u \sim \mathcal{N}(u_0, \Gamma_0)$, where $\Gamma_0$ is the prior covariance. By Bayes' theorem, the posterior distribution is again Gaussian, $p(u \mid y) = \mathcal{N}(u_{\text{post}}, \Gamma_{\text{post}})$, with

$$\Gamma_{\text{post}}^{-1} = \Gamma_0^{-1} + A^T \Gamma_{\text{noise}}^{-1} A, \qquad u_{\text{post}} = \Gamma_{\text{post}} \left( A^T \Gamma_{\text{noise}}^{-1} y + \Gamma_0^{-1} u_0 \right).$$
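In finite dimensions these formulas can be evaluated directly. A minimal NumPy sketch on a small synthetic problem (the sizes, covariances, and random test data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 50                      # data and parameter dimensions (toy sizes)
A = rng.standard_normal((m, n))    # linear forward operator
Gamma_noise = 0.1 * np.eye(m)      # noise covariance
Gamma_0 = np.eye(n)                # prior covariance
u_0 = np.zeros(n)                  # prior mean

u_true = rng.standard_normal(n)
y = A @ u_true + rng.multivariate_normal(np.zeros(m), Gamma_noise)

# Posterior precision: Gamma_post^{-1} = Gamma_0^{-1} + A^T Gamma_noise^{-1} A
Gn_inv = np.linalg.inv(Gamma_noise)
G0_inv = np.linalg.inv(Gamma_0)
Gamma_post = np.linalg.inv(G0_inv + A.T @ Gn_inv @ A)

# Posterior mean: Gamma_post (A^T Gamma_noise^{-1} y + Gamma_0^{-1} u_0)
u_post = Gamma_post @ (A.T @ Gn_inv @ y + G0_inv @ u_0)
```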

Extensions develop these ideas rigorously in function space settings for infinite-dimensional unknowns and forward operators derived from PDEs (Neuberger et al., 21 Jan 2026, Chowdhary et al., 2023, Sun et al., 2021).

2. Regularization, Priors, and Posterior Well-Posedness

Bayesian regularization is achieved through the choice of prior distributions that encode assumptions (smoothness, sparsity, discontinuities) about the unknown $u$. In function space, priors such as Gaussian random fields (Matérn, squared-exponential), total variation (TV), or fractional TV-Gaussian (FTG) hybrids are employed. The well-posedness of the posterior is guaranteed under conditions such as bounded linear $A$, full-rank noise covariance, and lower-semicontinuous regularization functionals growing suitably in the norm of $u$ (Sun et al., 2021).

For example, the FTG prior combines a fractional total variation penalty with a Gaussian reference measure,

$$\pi_{\text{FTG}}(u) \propto \exp[-\alpha\, \mathrm{FTV}_s(u)]\; \mathcal{N}(0, C),$$

where $\mathrm{FTV}_s$ is the fractional total variation. Hierarchical Bayesian models incorporate hyperpriors on regularization parameters, supporting empirical Bayes or fully Bayesian treatment (Zhou et al., 2017, Zhang et al., 2013).
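To illustrate how such a hybrid prior is evaluated pointwise, the following sketch computes an unnormalized TG-type log-density on a uniform 1-D grid, using ordinary total variation as a simpler stand-in for the fractional variant $\mathrm{FTV}_s$ (the function and its arguments are assumptions for illustration):

```python
import numpy as np

def log_tg_prior(u, alpha, C_inv):
    """Unnormalized log-density of a TV-Gaussian hybrid prior on a 1-D grid.

    Ordinary total variation stands in for FTV_s here; the fractional
    variant would replace the first-order differences with a
    fractional-order difference operator.
    """
    tv = np.sum(np.abs(np.diff(u)))    # TV(u) on a uniform grid
    gauss = -0.5 * u @ C_inv @ u       # log-density of N(0, C), up to a constant
    return -alpha * tv + gauss
```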

3. Ill-Posedness, Regularity Scales, and Posterior Contraction

The severity of ill-posedness is determined by the smoothing properties and singular value decay of $A$, which drive the rate at which the posterior contracts around the true solution. Formally, if $A$ smooths by exponent $\gamma$ and the prior (or the truth) belongs to a function space of regularity $s$, then the Bayesian estimator contracts in norm at rate

$$\eta_n \sim n^{-s/(2\gamma + 2s + d)}$$

for dimension $d$ (Gugushvili et al., 2018). Non-conjugate series priors, Gaussian priors, and mixtures yield adaptive rates; undersmoothing or oversmoothing prior choices affect frequentist coverage and bias. In models with unknown forward operators or parameters, contraction rates familiar from direct inversion are recovered through empirical Bayes or Lepski-type adaptation (Trabs, 2018).
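As a concrete instance, a mildly smoothing operator ($\gamma = 1$), truth regularity $s = 2$, and dimension $d = 1$ give exponent $s/(2\gamma + 2s + d) = 2/7$; a one-line check:

```python
def contraction_rate(n, s, gamma, d):
    """Posterior contraction rate eta_n = n^(-s / (2*gamma + 2*s + d))."""
    return n ** (-s / (2 * gamma + 2 * s + d))

print(contraction_rate(n=10_000, s=2, gamma=1, d=1))  # ~0.072, i.e. n^(-2/7)
```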

4. Dimension Reduction and Model Reduction Techniques

High-dimensional inverse problems often admit intrinsic low-dimensional structure because the data inform only specific, data-driven subspaces of the parameter space. Likelihood-informed subspace (LIS) techniques identify the prior-preconditioned Fisher information operator

$$H = \Gamma_{\text{pr}}^{1/2}\, G^T \Gamma_{\text{obs}}^{-1} G\, \Gamma_{\text{pr}}^{1/2}$$

and construct rank-$r$ LIS bases by solving the associated generalized eigenproblems (König et al., 30 Jun 2025, Spantini et al., 2014). Posterior distributions can then be projected and approximated optimally in covariance and mean, in the sense of minimizing Riemannian, KL, or Bayes-risk distances.
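In the linear-Gaussian case, the LIS construction reduces to an eigendecomposition of $H$, whose top-$r$ eigenpairs yield a low-rank negative update of the prior covariance of the form derived by Spantini et al. A sketch, assuming a symmetric square root of the prior covariance is available (function and variable names are illustrative):

```python
import numpy as np

def low_rank_posterior_cov(G, Gamma_obs_inv, Gamma_pr_half, r):
    """Rank-r approximation of the posterior covariance in the
    linear-Gaussian case, via the prior-preconditioned Fisher
    information H; Gamma_pr_half is a symmetric square root of
    the prior covariance."""
    # H = Gamma_pr^{1/2} G^T Gamma_obs^{-1} G Gamma_pr^{1/2}
    H = Gamma_pr_half @ G.T @ Gamma_obs_inv @ G @ Gamma_pr_half
    lam, V = np.linalg.eigh(H)                  # eigenvalues in ascending order
    lam, V = lam[::-1][:r], V[:, ::-1][:, :r]   # keep the top-r eigenpairs (LIS basis)

    # Gamma_post ~= Gamma_pr - sum_i lam_i/(1+lam_i) w_i w_i^T,  w_i = Gamma_pr^{1/2} v_i
    W = Gamma_pr_half @ V
    Gamma_pr = Gamma_pr_half @ Gamma_pr_half
    return Gamma_pr - W @ np.diag(lam / (1.0 + lam)) @ W.T
```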

For linear dynamical systems, balanced truncation (BT) and its prior-driven variants yield reduced-order models with theoretical guarantees on the output error and on the accuracy of the resulting posterior covariance (König et al., 30 Jun 2025). Such model reduction dramatically accelerates sampling and estimation in large-scale settings.
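For reference, a sketch of standard square-root balanced truncation for a stable system $\dot{x} = Ax + Bw$, $y = Cx$ (the prior-driven variants modify the Gramians, which is not shown here):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, r):
    """Standard square-root balanced truncation of a stable LTI system."""
    # Gramians: A P + P A^T = -B B^T and A^T Q + Q A = -C^T C
    P = solve_continuous_lyapunov(A, -B @ B.T)
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)

    # Square-root method: SVD of the product of Cholesky factors
    Lp = cholesky(P, lower=True)
    Lq = cholesky(Q, lower=True)
    U, s, Vt = svd(Lq.T @ Lp)           # s holds the Hankel singular values

    scale = s[:r] ** -0.5
    W = Lq @ U[:, :r] * scale           # left projection basis
    V = Lp @ Vt[:r].T * scale           # right projection basis
    return W.T @ A @ V, W.T @ B, C @ V  # reduced (A_r, B_r, C_r)
```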

5. Computational Algorithms and Scalability

Bayesian linear inverse problems motivate efficient algorithms for posterior computation:

  • Low-rank Independence Samplers: Propose samples from approximated Gaussian posteriors based on low-rank truncation of the prior-preconditioned Hessian, attaining high acceptance rates and mesh-independent mixing efficiency (Brown et al., 2016); a generic sketch of the sampler loop follows this list.
  • Transport Map MCMC: Use polynomial, triangular, or diagonal transport maps to develop mesh-independent proposals for non-Gaussian posteriors (e.g., TG or FTG), enabling scalability and rapid mixing (Sun et al., 2021).
  • Projected Newton and Subspace Projection Regularization: Solve for the regularized MAP estimator via iterative projection onto Krylov subspaces, employing generalized Golub-Kahan bidiagonalization to avoid large matrix inversions and using early stopping of the iteration as an implicit regularization mechanism (Li, 2024, Li, 2023).
  • Approximate Empirical Bayes: Evaluate marginal likelihoods using low-rank surrogate updates and randomized SVD, supporting fast hyperparameter adaptation and hierarchical modeling in large-scale problems (Zhou et al., 2017).
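A minimal sketch of the independence-sampler loop referenced in the first item above, with the low-rank Gaussian approximation supplied as the proposal (the callables passed in are placeholders):

```python
import numpy as np

def independence_sampler(log_post, sample_prop, log_prop, n_steps, rng):
    """Metropolis-Hastings independence sampler: sample_prop/log_prop draw
    from and evaluate the (low-rank) Gaussian approximation; log_post
    evaluates the exact unnormalized log-posterior. Acceptance rates are
    high when the proposal captures the data-informed directions."""
    u = sample_prop(rng)
    log_w = log_post(u) - log_prop(u)
    samples, accepted = [u], 0
    for _ in range(n_steps):
        v = sample_prop(rng)
        log_w_new = log_post(v) - log_prop(v)
        if np.log(rng.uniform()) < log_w_new - log_w:  # MH accept/reject
            u, log_w = v, log_w_new
            accepted += 1
        samples.append(u)
    return np.array(samples), accepted / n_steps
```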

These techniques rely on matrix-free implementations, randomized solvers, and analytic formulas for linear Gaussian models, often leveraging closed-form expressions and low-rank decompositions for both covariance and mean computation.

6. Modern Extensions and Applications

Bayesian linear inverse methodology extends to several advanced directions:

  • Score-Based Generative Priors and Diffusion Models: Sequential Monte Carlo guided diffusion algorithms (MCGdiff) leverage pretrained denoising-diffusion models as informative priors, with provable consistency for posterior sampling in high-dimensional ill-posed tasks such as superresolution and deblurring (Cardoso et al., 2023).
  • Goal-Oriented Inference and Experimental Design: Optimal approximations for quantities of interest via low-rank updates, efficient calculation of posterior uncertainties in linear functionals, and path-wise experimental design for mobile sensors in PDE-governed domains (Spantini et al., 2016, Neuberger et al., 21 Jan 2026); a short closed-form sketch follows this list.
  • Bayesian Model Parameter Learning: Bayesian Approximation Error (BAE) methods address uncertainties in model parameters by learning error subspaces and jointly estimating sources and induced approximation errors, as demonstrated in EEG source imaging (Koulouri et al., 7 Jan 2025).
  • Variational Gaussian Processes: Variational Bayes methods using spectral inducing features attain optimal contraction rates for GP priors in mildly and severely ill-posed inverse problems, accelerating inference for large datasets (Randrianarisoa et al., 2023).
  • Sensitivity Analysis: Derivative-based procedures quantify the impact of auxiliary parameters on the information gain in infinite-dimensional inverse problems, combining low-rank eigenvalue sensitivity, adjoint solves, and post-optimal analysis (Chowdhary et al., 2023).
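For the goal-oriented setting above, the posterior law of a scalar linear quantity of interest $z = w^T u$ follows in closed form from the Gaussian posterior; a minimal sketch (names are illustrative):

```python
import numpy as np

def qoi_posterior(w, u_post, Gamma_post):
    """Posterior mean and variance of a linear quantity of interest z = w^T u,
    given the Gaussian posterior N(u_post, Gamma_post)."""
    return w @ u_post, w @ Gamma_post @ w
```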

Canonical application areas encompass medical and geophysical imaging, PDE-constrained parameter estimation, source localization, and scientific computing, integrating rigorous statistical uncertainty quantification with scalable, structure-exploiting algorithms (Zhang et al., 2013).

7. Bernstein-von Mises, Efficiency, and Frequentist Validity

Bayesian inference in ill-posed linear inverse problems often exhibits semi-parametric efficiency: under regularity and contraction conditions, the marginal posterior for scalar (or finite-dimensional) parameters such as thermal diffusivity or blur-location converges, in a Bernstein-von Mises sense, to a Gaussian distribution at the optimal Cramér-Rao rate (Magra et al., 2023, Bochkina et al., 2011). Critical conditions include the correct matching of prior smoothness to truth regularity and the insensitivity of the prior to least-favorable directions. Consequently, Bayesian credible intervals for scalar parameters possess asymptotic frequentist validity, even when the full inverse problem remains ill-posed and nonregular.


Bayesian linear inverse problems represent a rich and well-developed field, integrating functional analysis, statistical inference, numerical linear algebra, control theory, and machine learning to deliver principled regularization, quantifiable uncertainty, and computational tractability for high-dimensional and ill-posed inference tasks.
