Bidirectional Monte Carlo (BDMC)
- Bidirectional Monte Carlo (BDMC) is a technique that uses forward and reverse AIS to generate stochastic sandwich bounds on the log marginal likelihood.
- It provides a reliable diagnostic: the computed forward–reverse gap upper-bounds the divergence between the approximate and true posterior distributions.
- BDMC underpins practical methods like the BREAD protocol, which validates inference quality on both simulated and real data for robust model calibration.
Bidirectional Monte Carlo (BDMC) is a technique for obtaining accurate stochastic upper and lower bounds on the log marginal likelihood (log-ML) or, more generally, for bounding the divergence between approximate and true posterior distributions in probabilistic models. BDMC operates by running annealed importance sampling (AIS) or sequential Monte Carlo (SMC) both in the forward (prior to posterior) and reverse (posterior to prior) directions, capitalizing on the reversibility of these algorithms when initialized with exact samples from the target distribution. This framework yields sandwich bounds on the log marginal likelihood, facilitates precise model evidence estimation, and underpins key diagnostics for Markov chain Monte Carlo (MCMC) inference—especially within the context of simulated data where exact posterior samples are available (Grosse et al., 2016, Grosse et al., 2015).
1. Foundations: Annealed Importance Sampling and Marginal Likelihood Estimation
Marginal likelihood estimation is central to Bayesian model selection but typically involves intractable integrals over latent variables and parameters. Annealed importance sampling (AIS) addresses this challenge by defining a sequence of unnormalized densities $f_0, f_1, \ldots, f_K$, each associated with a normalized distribution $p_t(x) = f_t(x)/\mathcal{Z}_t$. The sequence interpolates from an initial distribution $p_0$ (e.g., the prior, with known $\mathcal{Z}_0$) to the posterior $p_K(x) = p(x \mid \mathbf{y})$, commonly along the geometric path $f_t(x) = f_0(x)^{1-\beta_t} f_K(x)^{\beta_t}$ with $0 = \beta_0 < \beta_1 < \cdots < \beta_K = 1$ (Grosse et al., 2015). The process uses reversible MCMC kernels $\mathcal{T}_t$, each leaving $p_t$ invariant, to transition between successive distributions.
An AIS run produces a nonnegative, unbiased estimate $\hat{\mathcal{Z}}$ of the partition function ratio $\mathcal{Z}_K/\mathcal{Z}_0$, together with an approximate sample from the terminal distribution. The estimator has two key properties:
- $\mathbb{E}[\hat{\mathcal{Z}}] = \mathcal{Z}_K/\mathcal{Z}_0$, so by Jensen's inequality $\mathbb{E}[\log \hat{\mathcal{Z}}] \le \log(\mathcal{Z}_K/\mathcal{Z}_0)$; thus $\log \hat{\mathcal{Z}}$ is a stochastic lower bound on the log partition function ratio (equivalently, on $\log p(\mathbf{y})$ when $\mathcal{Z}_0 = 1$).
- The marginal distribution $q_{\mathrm{fwd}}$ of the final AIS sample does not generally coincide with the posterior $p(x \mid \mathbf{y})$, owing to imperfect mixing of the transition kernels (Grosse et al., 2016).
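As a concrete illustration (a minimal sketch, not code from the cited papers), forward AIS on a one-dimensional conjugate Gaussian model, where the true log-ML is available in closed form, exhibits the stochastic lower bound directly; all variable names and tuning choices here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
y = 1.5                                   # single observed datum
K = 200                                   # number of annealing steps
betas = np.linspace(0.0, 1.0, K + 1)      # linear schedule for simplicity

def log_prior(x):                         # x ~ N(0, 1)
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def log_lik(x):                           # y | x ~ N(x, 1)
    return -0.5 * (y - x)**2 - 0.5 * np.log(2 * np.pi)

def log_f(x, beta):                       # geometric path: prior * likelihood^beta
    return log_prior(x) + beta * log_lik(x)

def mh_step(x, beta, scale=1.0):
    """One Metropolis-Hastings step leaving f_beta invariant."""
    prop = x + scale * rng.normal(size=x.shape)
    accept = np.log(rng.uniform(size=x.shape)) < log_f(prop, beta) - log_f(x, beta)
    return np.where(accept, prop, x)

n_chains = 1000
x = rng.normal(size=n_chains)             # exact samples from p_0 (the prior)
log_w = np.zeros(n_chains)
for t in range(1, K + 1):
    log_w += log_f(x, betas[t]) - log_f(x, betas[t - 1])   # weight update
    x = mh_step(x, betas[t])                               # transition kernel

# Closed form: marginally y ~ N(0, 2), so log p(y) is known exactly
true_log_ml = -0.5 * y**2 / 2.0 - 0.5 * np.log(2 * np.pi * 2.0)
print(f"E[log w] = {log_w.mean():.3f}  (true log-ML = {true_log_ml:.3f})")
```

With this many steps in one dimension the averaged log-weight sits just below the true value, as the lower-bound property predicts.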
2. The BDMC Method: Forward and Reverse AIS
BDMC extends AIS by running the annealing chain in both directions: forward (prior to posterior) and reverse (posterior to prior). It exploits the fact that, given an exact posterior sample $x_K \sim p(x \mid \mathbf{y})$, the same sequence can be run backward to obtain a nonnegative, unbiased estimate $\hat{\mathcal{Z}}_{\mathrm{rev}}$ of the reciprocal ratio $\mathcal{Z}_0/\mathcal{Z}_K$ (Grosse et al., 2015).
Outline of both passes (annealing schedule $0 = \beta_0 < \cdots < \beta_K = 1$, with $\log w$ initialized to 0):
| Step | Forward AIS | Reverse AIS |
|---|---|---|
| Initialization | $x_0 \sim p_0$ (exact prior sample) | $x_K \sim p_K$ (exact posterior sample) |
| Weight update | $\log w \mathrel{+}= \log f_t(x_{t-1}) - \log f_{t-1}(x_{t-1})$ | $\log w \mathrel{+}= \log f_{t-1}(x_t) - \log f_t(x_t)$ |
| State transition | $x_t \sim \mathcal{T}_t(\cdot \mid x_{t-1})$ | $x_{t-1} \sim \mathcal{T}_{t-1}(\cdot \mid x_t)$ |
| Output | $\log \hat{\mathcal{Z}}_{\mathrm{fwd}}$ (lower bound) | $-\log \hat{\mathcal{Z}}_{\mathrm{rev}}$ (upper bound) |
With these, BDMC provides:
- $\mathbb{E}[\log \hat{\mathcal{Z}}_{\mathrm{fwd}}] \le \log(\mathcal{Z}_K/\mathcal{Z}_0)$ (stochastic lower bound)
- $\mathbb{E}[-\log \hat{\mathcal{Z}}_{\mathrm{rev}}] \ge \log(\mathcal{Z}_K/\mathcal{Z}_0)$ (stochastic upper bound)
- As $K \to \infty$ and the kernels mix, both bounds converge to $\log(\mathcal{Z}_K/\mathcal{Z}_0) = \log p(\mathbf{y})$ (Grosse et al., 2015, Grosse et al., 2016).
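The sandwich can be seen end-to-end on a conjugate Gaussian toy model, where conjugacy supplies the exact posterior sample that the reverse pass needs (an illustrative sketch; model and names are not from the papers):

```python
import numpy as np

rng = np.random.default_rng(1)
y, K, n = 1.5, 200, 1000
betas = np.linspace(0.0, 1.0, K + 1)

def log_f(x, b):           # geometric path: N(0,1) prior, N(y | x, 1) likelihood
    return -0.5 * x**2 + b * (-0.5 * (y - x)**2) - (1 + b) * 0.5 * np.log(2 * np.pi)

def mh_step(x, b):
    prop = x + rng.normal(size=x.shape)
    acc = np.log(rng.uniform(size=x.shape)) < log_f(prop, b) - log_f(x, b)
    return np.where(acc, prop, x)

# Forward AIS: prior -> posterior, accumulating a stochastic lower bound
x = rng.normal(size=n)
lw_fwd = np.zeros(n)
for t in range(1, K + 1):
    lw_fwd += log_f(x, betas[t]) - log_f(x, betas[t - 1])
    x = mh_step(x, betas[t])

# Reverse AIS: exact posterior samples (conjugacy: N(y/2, 1/2)) -> prior
x = y / 2 + np.sqrt(0.5) * rng.normal(size=n)
lw_rev = np.zeros(n)
for t in range(K, 0, -1):
    lw_rev += log_f(x, betas[t - 1]) - log_f(x, betas[t])
    x = mh_step(x, betas[t - 1])

lower = lw_fwd.mean()                     # E[log Z_fwd]  <= log p(y)
upper = -lw_rev.mean()                    # -E[log Z_rev] >= log p(y)
true_log_ml = -0.25 * y**2 - 0.5 * np.log(4 * np.pi)   # y ~ N(0, 2) marginally
print(f"{lower:.3f} <= {true_log_ml:.3f} <= {upper:.3f}")
```

In this well-mixing one-dimensional setting the gap between the two bounds is a small fraction of a nat.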
3. Divergence Bounding and Theoretical Guarantees
The gap $\hat{\mathcal{B}} = -\log \hat{\mathcal{Z}}_{\mathrm{rev}} - \log \hat{\mathcal{Z}}_{\mathrm{fwd}}$ serves not only as a diagnostic for marginal likelihood estimation but also upper-bounds, in expectation, the symmetrized Kullback–Leibler (Jeffreys) divergence between the distribution $q_{\mathrm{fwd}}$ of AIS samples and the true posterior $p$. Formally,

$$D_{\mathrm{KL}}(q_{\mathrm{fwd}} \,\|\, p) + D_{\mathrm{KL}}(p \,\|\, q_{\mathrm{fwd}}) \;\le\; \mathbb{E}[\hat{\mathcal{B}}].$$

Thus $\mathbb{E}[\hat{\mathcal{B}}]$ is an upper bound on the Jeffreys divergence, making BDMC a tool not only for model evidence calibration but also for posterior quality diagnostics (Grosse et al., 2016).
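One way to see where the bound comes from is a sketch of the extended-state-space argument of Grosse et al. (2016), where $q_{\mathrm{fwd}}$ and $q_{\mathrm{rev}}$ denote the forward and reverse distributions over full annealing trajectories $x_{0:K}$:

```latex
\begin{align*}
\mathbb{E}\big[\log \hat{\mathcal{Z}}_{\mathrm{fwd}}\big]
  &= \log\frac{\mathcal{Z}_K}{\mathcal{Z}_0}
     - D_{\mathrm{KL}}\big(q_{\mathrm{fwd}}(x_{0:K}) \,\big\|\, q_{\mathrm{rev}}(x_{0:K})\big), \\
\mathbb{E}\big[-\log \hat{\mathcal{Z}}_{\mathrm{rev}}\big]
  &= \log\frac{\mathcal{Z}_K}{\mathcal{Z}_0}
     + D_{\mathrm{KL}}\big(q_{\mathrm{rev}}(x_{0:K}) \,\big\|\, q_{\mathrm{fwd}}(x_{0:K})\big).
\end{align*}
```

Subtracting, the expected gap equals the Jeffreys divergence between the two trajectory distributions; since the terminal marginal of $q_{\mathrm{rev}}$ is exactly $p$ and KL divergence can only decrease under marginalization, the expected gap upper-bounds the Jeffreys divergence between the terminal marginal of $q_{\mathrm{fwd}}$ and $p$.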
4. The BREAD Protocol for Real-Data Validation
Exact posterior samples are available for simulated data but not for real datasets. The BREAD (Bounding Divergences with REverse Annealing) protocol addresses this by:
- Running the chosen inference method (e.g., in Stan or WebPPL) on the real data to estimate hyperparameters $\hat{\eta}$.
- Simulating synthetic parameters, latents, and data $(\theta^{\star}, \mathbf{z}^{\star}, \mathbf{y}^{\star})$ from the model with hyperparameters fixed to $\hat{\eta}$; by construction, $(\theta^{\star}, \mathbf{z}^{\star})$ is an exact posterior sample given $\mathbf{y}^{\star}$.
- Performing BDMC on $\mathbf{y}^{\star}$ to compute the lower and upper bounds $\hat{\mathcal{L}}$ and $\hat{\mathcal{U}}$.
- Comparing convergence curves of the log-ML estimates on the real versus simulated data across numbers of annealing steps $K$. If the curves behave similarly, one can trust the gap $\hat{\mathcal{U}} - \hat{\mathcal{L}}$ measured on simulated data as a proxy for the true Jeffreys divergence on the real data.
- In hierarchical models, a brief MCMC warm-up initialized at the simulated values can approximate posterior draws for the reverse AIS pass; empirically, a small number of steps suffices (Grosse et al., 2016).
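The steps above can be compressed into a self-contained sketch on a toy conjugate model (all model choices, estimators, and names here are illustrative, not the paper's code); the key point is that the simulated latent is itself an exact posterior sample for the reverse pass:

```python
import numpy as np

rng = np.random.default_rng(2)
m, K, n_chains = 5, 100, 500
betas = np.linspace(0.0, 1.0, K + 1)

# Toy model with hyperparameter eta: x ~ N(eta, 1); y_i | x ~ N(x, 1), i = 1..m
y_real = np.array([1.2, 0.8, 1.5, 1.1, 0.9])      # stand-in for real data

# Step 1: point-estimate the hyperparameter on real data (simple moment estimate)
eta_hat = y_real.mean()

# Step 2: simulate synthetic data; by construction x_star is an exact
# posterior sample given y_star, which is exactly what reverse AIS needs
x_star = eta_hat + rng.normal()
y_star = x_star + rng.normal(size=m)

def log_f(x, b):                                   # tempered unnormalized density
    lp = -0.5 * (x - eta_hat)**2 - 0.5 * np.log(2 * np.pi)
    ll = -0.5 * ((y_star[None, :] - x[:, None])**2).sum(axis=1) \
         - 0.5 * m * np.log(2 * np.pi)
    return lp + b * ll

def mh_step(x, b, scale=0.5):
    prop = x + scale * rng.normal(size=x.shape)
    acc = np.log(rng.uniform(size=x.shape)) < log_f(prop, b) - log_f(x, b)
    return np.where(acc, prop, x)

# Step 3: BDMC on the simulated data
x = eta_hat + rng.normal(size=n_chains)            # forward: start from the prior
lw_f = np.zeros(n_chains)
for t in range(1, K + 1):
    lw_f += log_f(x, betas[t]) - log_f(x, betas[t - 1])
    x = mh_step(x, betas[t])

x = np.full(n_chains, x_star)                      # reverse: start from the exact sample
lw_r = np.zeros(n_chains)
for t in range(K, 0, -1):
    lw_r += log_f(x, betas[t - 1]) - log_f(x, betas[t])
    x = mh_step(x, betas[t - 1])

lower, upper = lw_f.mean(), -lw_r.mean()
gap = upper - lower                                # proxy for inference quality
print(f"log-ML in [{lower:.3f}, {upper:.3f}], gap = {gap:.3f} nats")
```

In the full protocol this BDMC run on $\mathbf{y}^{\star}$ would then be compared against the inference method's convergence behavior on the real data.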
5. Implementation in Probabilistic Programming Systems
BDMC and BREAD require several system-level features:
- Tempering hooks: evaluating unnormalized log-densities at arbitrary inverse temperatures $\beta \in [0, 1]$. Stan required a power-posterior API; WebPPL involved trace-based evaluation with a $\beta$ exponent on likelihood terms.
- Reversible kernels: a choice of Metropolis–Hastings (WebPPL) or Hamiltonian Monte Carlo/No-U-Turn Sampler (HMC/NUTS in Stan), each leaving the corresponding intermediate distribution $p_t$ invariant.
- AIS weight tracking: inference engine modifications to return the accumulated log-weight $\log w$ and to allow reverse-path execution.
- Orchestration scripts: Automating the BREAD protocol—hyperparameter estimation, synthetic dataset simulation, BDMC runs, and curve comparison (Grosse et al., 2016).
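What a tempering hook amounts to can be sketched as a minimal interface (a hypothetical illustration, not Stan's or WebPPL's actual API; the class and parameter names are invented):

```python
class TemperedModel:
    """Exposes the geometric-path density log f_beta(x) = log p(x) + beta * log p(y | x)."""

    def __init__(self, log_prior, log_lik):
        self.log_prior = log_prior
        self.log_lik = log_lik

    def log_density(self, x, beta):
        if not 0.0 <= beta <= 1.0:
            raise ValueError("inverse temperature must lie in [0, 1]")
        return self.log_prior(x) + beta * self.log_lik(x)

# beta = 0 recovers the prior; beta = 1 the (unnormalized) posterior
model = TemperedModel(log_prior=lambda x: -0.5 * x**2,
                      log_lik=lambda x: -0.5 * (1.5 - x)**2)
print(model.log_density(0.0, 0.0))   # prior term only: 0.0
print(model.log_density(0.0, 1.0))   # adds the full likelihood term: -1.125
```

An AIS driver only needs this one entry point plus a kernel that targets `log_density(·, beta)` at each step, which is why retrofitting existing systems is feasible.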
Empirical studies demonstrated that the representation of latent-variable models (collapsed vs. uncollapsed) can have significant effects on convergence rates and computational cost, motivating BDMC/BREAD as practical design tools.
6. Empirical Performance and Comparative Findings
Application of BDMC to small-dimension latent-variable models (mixtures, matrix factorization, binary-attribute linear-Gaussian) showed the following:
- The forward–reverse gap can routinely be driven below 1 nat, yielding near-oracle accuracy for the log marginal likelihood.
- The upper bound on the Jeffreys divergence is within 10–30% of the exact value, even with poor mixing.
- Standard estimators (BIC, simple SIS, the harmonic mean estimator, VB) were often inaccurate compared to BDMC, with only AIS, single-particle SMC, and nested sampling efficiently achieving RMSE within 10 nats (Grosse et al., 2015).
- BDMC identified implementation bugs (e.g., in WebPPL's multivariate-Gaussian sampler) by detecting reversals of the expected inequality, underscoring its diagnostic value (Grosse et al., 2016).
7. Recommendations and Practical Guidelines
Effective use of BDMC requires:
- Always simulating a small synthetic dataset from the model to obtain exact posterior samples for reverse AIS.
- Monitoring the forward–reverse gap and driving it below a practical threshold (typically 1 nat).
- Favoring sigmoidal annealing schedules for $\beta_t$ to allocate more annealing steps near the prior and posterior endpoints, e.g., $\beta_t \propto \sigma\!\left(\delta\left(\tfrac{2t}{K} - 1\right)\right)$, linearly rescaled so that $\beta_0 = 0$ and $\beta_K = 1$.
- Using the ground-truth log marginal likelihoods from BDMC to benchmark new estimators by mean squared error.
- Recognizing that the gap $\hat{\mathcal{B}}$ also serves as a posterior quality diagnostic, upper-bounding the symmetrized KL (Jeffreys) divergence (Grosse et al., 2015, Grosse et al., 2016).
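A sigmoidal schedule of the kind recommended above can be implemented in a few lines (the parameter name `delta` and its default are illustrative choices, not values from the papers):

```python
import numpy as np

def sigmoid_schedule(K, delta=4.0):
    """Sigmoidal annealing schedule: concentrates the beta_t near the
    endpoints 0 and 1, linearly rescaled so beta_0 = 0 and beta_K = 1."""
    t = np.linspace(-delta, delta, K + 1)
    s = 1.0 / (1.0 + np.exp(-t))          # logistic sigmoid
    return (s - s[0]) / (s[-1] - s[0])

betas = sigmoid_schedule(1000)
# Steps near the endpoints are finer than in the middle of the schedule
print(betas[1] - betas[0], betas[501] - betas[500])
```

Larger `delta` concentrates more steps at the endpoints; `delta -> 0` recovers a linear schedule.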
BDMC thus provides both a method for accurately sandwiching marginal likelihood calculations and an empirical tool for measuring and improving posterior inference quality, particularly in the context of MCMC-based or probabilistic programming workflows.