Zero-Shot Diffusion Posterior Sampling
- Zero-shot diffusion posterior sampling is a framework that uses pretrained diffusion models to solve Bayesian inverse problems without retraining.
- It combines measurement guidance with diffusion priors via plug-and-play, deep unfolding, and MAP-based surrogates to achieve adaptable and efficient posterior sampling.
- Empirical evaluations report significant speed-ups and state-of-the-art fidelity across tasks such as imaging, protein design, and compressed sensing.
Zero-shot diffusion posterior sampling is a family of computational methodologies that use pretrained unconditional diffusion models to solve Bayesian inverse problems, perform conditional generation, or sample from posterior distributions, without any retraining or fine-tuning for the specific forward model or measurement. These frameworks have gained broad traction in computational imaging, protein design, compressed sensing, and scientific simulation, exploiting the high expressiveness and flexibility of diffusion priors to address a wide range of inverse and conditional inference tasks.
1. Bayesian Formulation and Foundations
Zero-shot diffusion posterior sampling addresses the task of drawing samples from a posterior $p(x \mid y)$, where $x$ is the unknown signal (e.g., image, molecular structure) and $y$ is a measurement arising via a stochastic or deterministic forward model $\mathcal{A}$. The generative prior $p(x)$ is only given implicitly by a pretrained diffusion model, typically a denoising score-based model or a Denoising Diffusion Probabilistic Model (DDPM).
The fundamental Bayesian equation is:

$$p(x \mid y) \propto p(y \mid x)\, p(x),$$

where $p(y \mid x)$ encodes data fidelity or task-specific constraints, and the prior $p(x)$ is enforced via the diffusion model. The goal is to construct practical Markov Chain Monte Carlo (MCMC), Langevin, or non-Markovian algorithms that produce samples (or optimizers) for this posterior, typically with minimal neural function evaluations (NFEs) and full adaptability to the forward model $\mathcal{A}$ at test time (Mbakam et al., 3 Jul 2025).
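As a concrete illustration, the Bayesian relation can be checked numerically on a toy one-dimensional linear-Gaussian model, where the posterior is available in closed form. All quantities here (the scalar operator `a`, noise level `sigma`, observation `y_obs`) are illustrative assumptions, not drawn from any cited method:

```python
import numpy as np

# Toy 1-D check of p(x | y) ∝ p(y | x) p(x) on a grid.
# Prior: x ~ N(0, 1); hypothetical forward model: y = a*x + noise, noise ~ N(0, sigma^2).
grid = np.linspace(-5.0, 5.0, 2001)
dx = grid[1] - grid[0]
prior = np.exp(-0.5 * grid**2)                      # unnormalized N(0, 1) density
y_obs, a, sigma = 1.0, 2.0, 0.5
likelihood = np.exp(-0.5 * ((y_obs - a * grid) / sigma) ** 2)

posterior = prior * likelihood                      # Bayes rule, up to a constant
posterior /= posterior.sum() * dx                   # normalize on the grid

# Linear-Gaussian model => the posterior is Gaussian with known moments.
post_var = 1.0 / (1.0 + a**2 / sigma**2)            # 1 / (prior prec. + likelihood prec.)
post_mean = post_var * (a * y_obs / sigma**2)
grid_mean = (grid * posterior).sum() * dx           # grid estimate matches post_mean
print(grid_mean, post_mean)
```

Zero-shot samplers target exactly this product of likelihood and prior, but with the prior accessible only through a diffusion model's score rather than an explicit density.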
2. Posterior Sampling Algorithms: Strategies and Deep Unfolding
The archetypal approach—Plug-and-Play (PnP) or Diffusion Posterior Sampling (DPS)—augments pretrained unconditional reverse processes with explicit measurement guidance. The core iteration (in discrete time) is:
- Proximal or gradient-adjusted step: Compute an update that moves in the direction of increased posterior density, via an approximation to the posterior score:

$$\nabla_{x_t} \log p_t(x_t \mid y) \approx \nabla_{x_t} \log p_t(x_t) + \nabla_{x_t} \log p_t(y \mid x_t),$$

where the prior term $\nabla_{x_t} \log p_t(x_t)$ is the diffusion model score, and the likelihood (measurement) term may be handled with a Jacobian chain rule or direct MAP/gradient-ascent strategies (Mbakam et al., 3 Jul 2025, Xu et al., 31 Jan 2025, Li et al., 13 Mar 2025, Ahmed et al., 24 Nov 2025).
- Unfolding and distillation: Recent work unfolds a Markov chain MCMC sampler (e.g., LATINO Langevin (Mbakam et al., 3 Jul 2025)) into a feed-forward deep network, allowing a fixed number of highly optimized conditional denoising steps. Model distillation, using a combination of pixel-wise, perceptual, and adversarial losses, enables these few-step samplers to nearly match the accuracy of fully conditional, supervised-trained models while retaining adaptability to the forward model at test time.
- MAP-based surrogates and non-backprop-based methods: Several works highlight that standard DPS is not true posterior sampling but effectively a (possibly projected, multi-step) MAP optimizer. Algorithms such as DMAP (Xu et al., 31 Jan 2025), RePS (Ahmed et al., 24 Nov 2025), and efficient DDIM-MAP hybrids (Li et al., 13 Mar 2025) replace expensive backpropagation through the diffusion model with explicit MAP solves (quadratic for linear problems), dramatically reducing inference cost.
- Sequential Monte Carlo and particle filtering: For tasks such as protein motif scaffolding, SMC and particle-based guides correct zero-shot diffusion posteriors, allowing for multi-modal posterior sampling, especially in cases where simple score guidance yields poor sample diversity (Young et al., 2024).
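The score-guidance idea behind these methods can be sketched in a self-contained way by replacing the pretrained diffusion score with the analytic score of a $N(0, 1)$ prior, adding the likelihood score of a linear-Gaussian measurement, and running unadjusted Langevin steps; the chain then samples the closed-form posterior. The parameters (`a`, `sigma`, step size `eps`) are illustrative assumptions:

```python
import numpy as np

# Score-guided Langevin posterior sampling, with the pretrained diffusion
# score replaced by the analytic score of an N(0, 1) prior so the sketch is
# self-contained. Hypothetical forward model: y = a*x + noise, noise ~ N(0, sigma^2).
rng = np.random.default_rng(0)
a, sigma, y = 2.0, 0.5, 1.0

def prior_score(x):                 # stand-in for the diffusion model score
    return -x                       # grad_x log N(x; 0, 1)

def likelihood_score(x):            # grad_x log p(y | x), linear-Gaussian case
    return a * (y - a * x) / sigma**2

eps, n_steps = 1e-3, 5000
x = rng.standard_normal(5000)       # 5000 independent chains, run in parallel
for _ in range(n_steps):
    score = prior_score(x) + likelihood_score(x)            # posterior score
    x = x + eps * score + np.sqrt(2 * eps) * rng.standard_normal(x.size)

# Closed-form posterior: mean = 8/17 ~ 0.471, variance = 1/17 ~ 0.059.
print(x.mean(), x.var())
```

In the real algorithms the prior score comes from a large pretrained network evaluated along a noise schedule, which is what makes the likelihood-term approximations discussed above necessary.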
3. Algorithmic and Architectural Variants
Zero-shot diffusion posterior sampling admits a spectrum of algorithmic forms, including but not limited to:
- DDRM-style spectral/posterior projections: At every reverse step, project the sample onto the data-consistent manifold using the pseudo-inverse, suitable for linear problems (Elata et al., 2024, Elata et al., 2024).
- Jacobian and Hessian approximations: ZAPS (Alçalar et al., 2024) uses diagonalized Hessian approximations and step-specific learned weights to make adjustments tractable under tight computational budgets.
- Decoupled likelihood surrogates: DING (Moufad et al., 20 Dec 2025) proposes closed-form Gaussian surrogate transitions, sidestepping vector-Jacobian products through the denoiser, leading to lower memory and runtime for pixel-conditional inpainting.
- Restart and non-Markovian updates: RePS (Ahmed et al., 24 Nov 2025) periodically re-injects noise via a restart strategy, contracting error accumulation in approximate posterior ODE solvers.
- Amortized variational inference: LAVPS (Zheng et al., 6 Feb 2026) amortizes variational inner loops with learnable inference models, providing explicit likelihood-guided posterior correction while achieving inference speed-ups.
- Direct modifications of the denoiser network: InvFussion (Elata et al., 2 Apr 2025) integrates the degradation operator into attention and feature blocks, enabling computational gains and flexible operation as a posterior sampler, MMSE predictor, or NPPC extractor.
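The DDRM-style data-consistency projection from the list above can be sketched in a few lines for a generic linear forward model $y = Ax$: replace the components of $x$ seen by $A$ with those implied by the measurement, $x \leftarrow x + A^{+}(y - Ax)$, leaving the null space untouched. The operator `A` here is a random placeholder, not from any cited work:

```python
import numpy as np

# Sketch of a pseudo-inverse data-consistency projection for y = A x.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 8))          # hypothetical wide linear operator
x_true = rng.standard_normal(8)
y = A @ x_true                           # noiseless measurement

A_pinv = np.linalg.pinv(A)               # Moore-Penrose pseudo-inverse

def project_data_consistent(x):
    # Correct only the measured subspace; the null-space content of x survives.
    return x + A_pinv @ (y - A @ x)

x = rng.standard_normal(8)               # e.g. the current reverse-diffusion iterate
x_proj = project_data_consistent(x)
print(np.allclose(A @ x_proj, y))        # the projected iterate matches the data
```

In DDRM-type samplers this projection is applied at every reverse step, so the diffusion prior fills in the null space while the measurement fixes the observed subspace.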
4. Empirical Performance, Efficiency, and Adaptability
Posterior samplers based on zero-shot diffusion are extensively validated across imaging tasks (e.g., deblurring, inpainting, super-resolution), scientific measurements (e.g., seismic inverse problems), protein design, statistical downscaling, and compressed sensing applications (Mbakam et al., 3 Jul 2025, Young et al., 2024, Tie et al., 29 Jan 2026, Elata et al., 2024, Elata et al., 2024).
Key empirical findings include:
- Speed: Few-step unfolded samplers require far fewer NFEs than iterative baselines, achieving speed-ups of $10\times$ or more over thousand-step DDPM sampling, while maintaining state-of-the-art performance in PSNR, LPIPS, and FID (Mbakam et al., 3 Jul 2025, Ahmed et al., 24 Nov 2025, Li et al., 13 Mar 2025).
- Adaptivity: Zero-shot methods generalize across unseen noise levels and forward operators without any retraining, outperforming fine-tuned conditional models in OOD contexts (e.g., mismatch in degradation or operator) (Mbakam et al., 3 Jul 2025).
- Flexibility: PSC (Elata et al., 2024) leverages posterior samples from diffusion priors to enable bit-rate and distortion/perception trade-off selection at inference time in image compression, unlike fixed CNN-based compressors.
- Posterior diversity and fidelity trade-offs: While vanilla DPS and many plug-and-play methods converge to low-diversity, MAP-like solutions, hybrid schemes (DMAP, ZAPS, SMC) restore sample diversity and improve faithfulness to multimodal or non-Gaussian posteriors (Xu et al., 31 Jan 2025, Young et al., 2024).
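For reference, PSNR, the distortion metric quoted throughout these evaluations, is simple to compute; a minimal implementation for images scaled to $[0, 1]$:

```python
import numpy as np

# Peak signal-to-noise ratio (dB) between a reference and a reconstruction.
def psnr(ref, rec, peak=1.0):
    mse = np.mean((ref - rec) ** 2)      # mean squared error
    return 10.0 * np.log10(peak**2 / mse)

ref = np.zeros((8, 8))
rec = np.full((8, 8), 0.1)               # uniform error of 0.1 everywhere
print(round(psnr(ref, rec), 1))          # mse = 0.01 -> 20.0 dB
```

Higher PSNR indicates lower distortion, whereas LPIPS and FID capture perceptual quality; zero-shot samplers are typically benchmarked on all three because distortion and perception trade off against each other.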
5. Limitations, Open Directions, and Recent Theoretical Insights
Despite their efficacy, zero-shot diffusion posterior samplers face several limitations:
- Computational resource demands: Large pretrained DMs (hundreds of millions of weights) impose high GPU memory and energy requirements, even if the number of neural evaluations per sample is small (Mbakam et al., 3 Jul 2025).
- Score approximation error: Standard DPS guidance diverges from the true posterior score, often implementing a form of MAP optimization rather than genuine posterior sampling, leading to low sample diversity and biased score statistics (Xu et al., 31 Jan 2025).
- Task generality: Universal samplers tend to underperform task-specific models for heavily out-of-distribution or high-precision requirements, suggesting the need for richer architectures or hybrid training strategies (cf. RAM, InvFussion) (Mbakam et al., 3 Jul 2025, Elata et al., 2 Apr 2025).
- Requirement for explicit linear operators: Many sampling formulations (DDRM, MAP-based schemes) are practical only for linear (or efficiently invertible) forward models; extending them to fully nonlinear settings requires additional architectural or algorithmic innovations (Li et al., 13 Mar 2025, Ahmed et al., 24 Nov 2025).
- Prompt and measurement sensitivity: For problems involving large uncertainty, text-prompted fills, or scientific downscaling, prompt ambiguity or under-specified measurement functions can degrade correlation and sample realism (Moufad et al., 20 Dec 2025, Tie et al., 29 Jan 2026).
- Inner optimization cost: Some advanced samplers (MGDM, SMC) require solving inner optimization, sampling, or variational inference steps at each diffusion iteration, leading to increased resource burden unless amortized (as in LAVPS) (Zheng et al., 6 Feb 2026).
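The explicit quadratic MAP solve that these schemes substitute for backpropagation through the denoiser has a closed form for linear operators: minimizing $\|y - Ax\|^2 / (2\sigma^2) + (\lambda/2)\|x - \hat{x}\|^2$, with $\hat{x}$ the denoiser's current estimate, gives $x^* = (A^\top A/\sigma^2 + \lambda I)^{-1}(A^\top y/\sigma^2 + \lambda \hat{x})$. The operator, noise level, and regularization weight below are illustrative assumptions:

```python
import numpy as np

# Closed-form quadratic MAP solve for a linear forward model y = A x.
rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))          # hypothetical linear operator
x_hat = rng.standard_normal(4)           # placeholder for the denoiser output
y = A @ rng.standard_normal(4)           # synthetic measurement
sigma, lam = 0.3, 1.5                    # noise level and regularization weight

H = A.T @ A / sigma**2 + lam * np.eye(4)             # quadratic form
b = A.T @ y / sigma**2 + lam * x_hat                 # linear term
x_map = np.linalg.solve(H, b)                        # one linear solve, no backprop

# Verify first-order optimality: the objective's gradient vanishes at x_map.
grad = A.T @ (A @ x_map - y) / sigma**2 + lam * (x_map - x_hat)
print(np.allclose(grad, 0))
```

For nonlinear forward models no such closed form exists, which is why the linear-operator restriction above is a genuine limitation rather than an implementation detail.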
6. Application Domains and Representative Metrics
Zero-shot diffusion posterior sampling is applied in:
| Domain | Task Example | Key Metric(s) |
|---|---|---|
| Computational Imaging | Deblurring, super-resolution, inpainting | PSNR, LPIPS, FID |
| Scientific Simulation | Seismic/CT/MRI image reconstruction | SNR, SSIM |
| Protein Design | Motif scaffolding, symmetry constraint | RMSD, scRMSD, pLDDT, diversity |
| Compressed Sensing | Adaptive acquisition/compression | Bit-rate, PSNR, FID |
| Downscaling | Statistical climate model mapping | 99th/95th-percentile MAE/RMSE |
| Image Editing | Masked/caption-driven edits | LPIPS, CLIP score |
Empirically, zero-shot samplers (e.g., UDM (Mbakam et al., 3 Jul 2025), DING (Moufad et al., 20 Dec 2025), DMPS (Meng et al., 15 Jun 2025)) attain superior or state-of-the-art fidelity at substantially lower inference cost, and exhibit robust generalization in high-noise or ill-posed settings, even with adaptive or learned guidance.
7. Future Prospects
Continued development focuses on:
- Extension to latent-space diffusion architectures (e.g., Stable Diffusion), reducing memory and compute cost (Mbakam et al., 3 Jul 2025).
- Enabling principled guidance for arbitrary linear and nonlinear forward models, including complex scientific and high-resolution applications (Moufad et al., 20 Dec 2025, Ahmed et al., 24 Nov 2025).
- Architectures supporting universal sampling across broad task classes with minimal loss in task-specific performance (Mbakam et al., 3 Jul 2025, Elata et al., 2 Apr 2025).
- Systematic benchmarking of sample fidelity, diversity, and computational budget (e.g., clinical imaging, geoscientific simulation, high-res generative compression) (Elata et al., 2024, Tie et al., 29 Jan 2026).
- Unification of supervised and zero-shot paradigms to optimize both performance and adaptability (e.g., InvFussion (Elata et al., 2 Apr 2025), deep unrolling (Mbakam et al., 3 Jul 2025)).
Zero-shot diffusion posterior sampling thus provides a flexible, computationally scalable, and increasingly robust framework for conditional sampling under arbitrary inference constraints—advancing practical posterior inference across scientific, engineering, and data-centric domains.