gDDIM: Generalized Denoising Diffusion Implicit Models

Updated 12 February 2026

gDDIM is a generalized framework for denoising diffusion models that adapts integration schemes to arbitrary linear processes for accelerated sampling.
It employs a reparameterized score function and exponential integrators to produce deterministic or controlled stochastic sample paths.
gDDIM preserves forward process marginals while offering a tunable trade-off between sample diversity and deterministic quality in generative modeling.

Generalized Denoising Diffusion Implicit Models (gDDIM) are a flexible class of accelerated generative samplers extending the Denoising Diffusion Implicit Model (DDIM) framework to cover arbitrary linear, continuous-time diffusion processes. While DDIMs yield deterministic or low-stochasticity sample trajectories for isotropic (homogeneous) diffusions, gDDIM enables exact or approximate fast sampling for general, including non-isotropic, diffusions by adapting the parameterization and numerical integration schemes. This approach achieves accelerated high-fidelity generative modeling in settings where traditional DDIM methods are inapplicable, such as blurring diffusion and critically damped Langevin systems, while also providing a principled trade-off between diversity and deterministic sample quality (Zhang et al., 2022, Han, 2024, Sheng et al., 12 Oct 2025).

1. Core Principles and Problem Setting

A denoising diffusion model consists of a forward noising process, typically a stochastic differential equation (SDE) $dx_t = f_t(x_t) dt + g_t dW_t$, that transforms data $x_0$ into highly noisy samples $x_T$ (e.g., standard normal). Generation requires simulating a reverse-time SDE (typically intractable in closed form) that starts from noise and recovers data. DDIM accelerated inference for isotropic models by replacing the stochastic reverse process with a deterministic ODE (the “probability-flow ODE”) and providing an exact one-step integration under certain conditions. However, many practical diffusions, including blurring, coupled, or non-diagonal noise models, do not satisfy the isotropy requirement. gDDIM generalizes the DDIM method to arbitrary linear diffusion models by means of a diffusion-aware score parameterization and integration scheme, enabling deterministic or controlled-stochastic sample paths that remain consistent with the marginals of the forward process (Zhang et al., 2022, Han, 2024).
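To make the isotropy issue concrete, the sketch below integrates the covariance equation $\dot\Sigma_t = F \Sigma_t + \Sigma_t F^\top + g g^\top$ of a two-dimensional coupled linear SDE started from a point mass. The drift matrix `F` and noise matrix `GGT` are hypothetical values chosen to resemble a critically damped position-velocity system; they are not the settings of any cited paper. The resulting covariance has a nonzero off-diagonal entry, so no single scalar noise level describes it. The Cholesky factor computed at the end is one possible (non-unique) matrix square root of that covariance.

```python
import math

F = [[0.0, 1.0], [-1.0, -2.0]]   # hypothetical critically damped drift matrix
GGT = [[0.0, 0.0], [0.0, 4.0]]   # g g^T: noise enters the velocity channel only

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def marginal_cov(t, dt=1e-4):
    """Euler integration of dSigma/dt = F Sigma + Sigma F^T + g g^T, Sigma_0 = 0."""
    S = [[0.0, 0.0], [0.0, 0.0]]
    Ft = transpose(F)
    for _ in range(round(t / dt)):
        FS, SFt = matmul(F, S), matmul(S, Ft)
        S = [[S[i][j] + dt * (FS[i][j] + SFt[i][j] + GGT[i][j])
              for j in range(2)] for i in range(2)]
    return S

def cholesky2(S):
    """Lower-triangular K with K K^T = S for a 2x2 positive-definite S."""
    k00 = math.sqrt(S[0][0])
    k10 = S[1][0] / k00
    return [[k00, 0.0], [k10, math.sqrt(S[1][1] - k10 * k10)]]

S = marginal_cov(0.5)
K = cholesky2(S)
print("Sigma_t =", S)   # off-diagonal entry is clearly nonzero
print("K_t     =", K)
```

For this particular choice of `F` and `GGT` the covariance relaxes to the identity at large times, while at intermediate times position and velocity are correlated.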

2. Theoretical Framework and Generalization

The forward SDEs for arbitrary linear DMs can be summarized as:

$dx_t = f_t(x_t) dt + g_t dW_t$

where $f_t$ and $g_t$ are, in general, time-dependent and possibly non-diagonal. The corresponding probability-flow ODE for sample generation is:

$dx_t = [f_t(x_t) - \tfrac{1}{2} g_t g_t^\top s_\theta(x_t, t)]\, dt$

with $s_\theta(x_t, t)$ denoting a neural approximation of the score function $\nabla_x \log p_t(x)$. For general (non-isotropic) $g_t$, gDDIM introduces a matrix $K_t$ satisfying $K_t K_t^\top = \Sigma_t$ (the marginal covariance at time $t$), reparameterizes the score network as $s_\theta(x_t, t) = -K_t^{-\top} \epsilon_\theta(x_t, t)$, and implements implicit integration via an exponential integrator. For deterministic sampling, the update

$x_{t-\Delta t} = \Psi(t-\Delta t, t)\, x_t + \left[ \int_t^{t-\Delta t} \tfrac{1}{2} \Psi(t-\Delta t, \tau)\, g_\tau g_\tau^\top K_\tau^{-\top}\, d\tau \right] \epsilon_\theta(x_t, t)$

replaces the isotropic DDIM update, where $\Psi(t, s)$ is the ODE transition matrix of the linear drift $f_t(x_t) = F_t x_t$ and all terms follow from the linear SDE structure. For stochastic sampling, an additional noise term parameterized by $\lambda$ enables continuous interpolation between deterministic (ODE) and stochastic (SDE/ancestral) regimes. This construction supports preservation of the forward process marginals for arbitrary noise levels and model structures (Zhang et al., 2022, Han, 2024, Sheng et al., 12 Oct 2025).
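A minimal scalar sketch of the deterministic update, $x_s = \Psi(s, t) x_t + \big[\int_t^s \tfrac{1}{2} \Psi(s, \tau) g_\tau^2 K_\tau^{-1} d\tau\big] \epsilon$, may help fix ideas. It assumes a variance-preserving SDE with a constant rate `BETA` (so $F = -\beta/2$, $g^2 = \beta$) and a point-mass data distribution, for which the noise prediction is exactly constant along the trajectory. The names `psi`, `K`, `eps_coeff`, and `gddim_step` are illustrative, and the coefficient integral is evaluated with a simple midpoint rule rather than any scheme from the cited papers.

```python
import math

BETA = 2.0  # assumed constant noise rate

def psi(s, t):
    """Transition 'matrix' of dx = F x dt with F = -BETA / 2, from time t to s."""
    return math.exp(-0.5 * BETA * (s - t))

def K(t):
    """Scalar factor with K_t^2 = Sigma_t = 1 - exp(-BETA * t)."""
    return math.sqrt(1.0 - math.exp(-BETA * t))

def eps_coeff(s, t, n=20000):
    """Midpoint-rule value of int_t^s 0.5 * psi(s, tau) * g^2 / K(tau) dtau."""
    h = (s - t) / n           # negative when stepping backward in time
    total = 0.0
    for i in range(n):
        tau = t + (i + 0.5) * h
        total += 0.5 * psi(s, tau) * BETA / K(tau)
    return total * h

def gddim_step(x_t, eps, s, t):
    """One exponential-integrator step from time t to time s < t."""
    return psi(s, t) * x_t + eps_coeff(s, t) * eps

# Point-mass data x0 and a fixed noise draw eps define the whole trajectory.
x0, eps, t, s = 1.5, -0.7, 1.0, 0.1
x_t = psi(t, 0.0) * x0 + K(t) * eps
x_s_exact = psi(s, 0.0) * x0 + K(s) * eps
print(gddim_step(x_t, eps, s, t), x_s_exact)  # agree up to quadrature error
```

In this setting a single large step lands on the exact trajectory because the integral equals $K_s - \Psi(s, t) K_t$; with a learned, state-dependent $\epsilon_\theta$ the step is only approximate.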

3. Algorithmic Implementation

gDDIM sampling is formulated as an explicit multi-step predictor–corrector exponential integrator. The steps are:

Precompute transition matrices $\Psi(t_{i-1}, t_i)$ and noise-scaling matrices $K_{t_i}$ for all time steps.
Execute, for each reverse timestep $t_i \to t_{i-1}$:
- Predictor: Produce a preliminary $\bar{x}_{t_{i-1}}$ using a polynomial fit over past $\epsilon_\theta$ evaluations weighted by precomputed integrals.
- Corrector: Refine $x_{t_{i-1}}$ using time-interpolated $\epsilon_\theta$ values.
- For stochastic variants, inject noise with scale matched to the desired stochasticity parameter ($\lambda$ or $\eta$).
Repeat until reaching $t = 0$.
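The steps above can be sketched for a scalar model as follows. This is an illustrative two-point version only: the predictor extrapolates the noise prediction linearly from the two most recent evaluations, and the corrector interpolates between the current evaluation and one taken at the predicted point. The constant rate `BETA`, the helper names, and the midpoint quadrature for the weights are all assumptions of this sketch, not details confirmed by the cited papers.

```python
import math

BETA = 2.0  # assumed constant noise rate of a scalar VP-type SDE

def psi(s, t):
    return math.exp(-0.5 * BETA * (s - t))

def K(t):
    return math.sqrt(1.0 - math.exp(-BETA * t))

def lagrange(j, tau, nodes):
    """j-th Lagrange basis polynomial over `nodes`, evaluated at tau."""
    out = 1.0
    for m, node in enumerate(nodes):
        if m != j:
            out *= (tau - node) / (nodes[j] - node)
    return out

def weights(s, t, nodes, n=4000):
    """C_j = int_t^s 0.5 * psi(s, tau) * g^2 / K(tau) * L_j(tau) dtau (midpoint rule)."""
    h = (s - t) / n
    out = [0.0] * len(nodes)
    for i in range(n):
        tau = t + (i + 0.5) * h
        w = 0.5 * psi(s, tau) * BETA / K(tau) * h
        for j in range(len(nodes)):
            out[j] += w * lagrange(j, tau, nodes)
    return out

def pc_step(x_t, eps_t, eps_prev, eps_fn, s, t, t_prev):
    """One predictor-corrector step from t to s < t, given eps at t and t_prev > t."""
    # Predictor: extrapolate eps linearly from the two most recent evaluations.
    c = weights(s, t, [t, t_prev])
    x_pred = psi(s, t) * x_t + c[0] * eps_t + c[1] * eps_prev
    # Corrector: interpolate between eps at t and eps at the predicted point.
    eps_s = eps_fn(x_pred, s)
    c = weights(s, t, [s, t])
    return psi(s, t) * x_t + c[0] * eps_s + c[1] * eps_t, eps_s
```

In practice the weights depend only on the time grid, so they would be computed once and reused for every sample; they are recomputed inside `pc_step` here only to keep the sketch short.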

For standard score-based models, the stochasticity parameter (e.g., $\lambda$ or $\eta$) can be scheduled or fixed, with $\eta = 0$ yielding DDIM, $\eta = 1$ recovering DDPM, and $\eta > 1$ providing “super-stochastic” paths (Sheng et al., 12 Oct 2025).
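For the isotropic case, the role of the stochasticity parameter can be illustrated with the familiar discrete-time DDIM step. The sketch assumes a variance-preserving schedule described by cumulative products `abar_t` and `abar_prev` (the usual $\bar\alpha_t$ and $\bar\alpha_{t-1}$); these names and the scalar setting are illustrative choices, not notation from the cited papers.

```python
import math

def ddim_sigma(abar_t, abar_prev, eta):
    """Noise scale of the step; eta = 0 is deterministic, eta = 1 is DDPM-like."""
    return (eta * math.sqrt((1 - abar_prev) / (1 - abar_t))
                * math.sqrt(1 - abar_t / abar_prev))

def ddim_step(x_t, eps, abar_t, abar_prev, eta, z):
    """One reverse step given a noise prediction eps and a fresh normal draw z."""
    sigma = ddim_sigma(abar_t, abar_prev, eta)
    x0_hat = (x_t - math.sqrt(1 - abar_t) * eps) / math.sqrt(abar_t)
    # Clamp guards against very large eta, where sigma^2 would exceed 1 - abar_prev;
    # in that regime the marginal is no longer matched by this formula.
    dir_coeff = math.sqrt(max(1 - abar_prev - sigma ** 2, 0.0))
    return math.sqrt(abar_prev) * x0_hat + dir_coeff * eps + sigma * z

abar_t, abar_prev = 0.5, 0.8
for eta in (0.0, 0.5, 1.0):
    print(eta, ddim_sigma(abar_t, abar_prev, eta))
```

At `eta = 1` the squared noise scale equals the DDPM posterior variance $\frac{1-\bar\alpha_{t-1}}{1-\bar\alpha_t}\big(1 - \frac{\bar\alpha_t}{\bar\alpha_{t-1}}\big)$, and at `eta = 0` the fresh noise `z` has no effect.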

4. Marginal Preservation, Variance Control, and Diversity-Speed Trade-off

A principal property of gDDIM, formalized in (Sheng et al., 12 Oct 2025, Han, 2024), is the preservation of marginals for any value of the stochasticity parameter. For each reverse step, the transition kernel is constructed to ensure that the distribution of $x_{t-\Delta t}$ given $x_0$ matches the corresponding forward marginal, allowing for controlled stochasticity without introducing bias. The stochasticity parameter provides an explicit, tunable trade-off: increasing it enhances exploration and sample diversity at the cost of speed and determinism; decreasing it yields faster, high-fidelity, less-diverse samples.
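In the standard isotropic setting the marginal-preservation claim can be checked in a few lines. The notation $\bar\alpha_t$ and $\sigma_t$ and the assumption of an exact noise prediction are illustrative simplifications, not the general matrix-valued argument of the cited papers.

```latex
\begin{align*}
x_t &= \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,
  \qquad \epsilon \sim \mathcal{N}(0, I), \\
x_{t-1} &= \sqrt{\bar\alpha_{t-1}}\, x_0
  + \sqrt{1-\bar\alpha_{t-1}-\sigma_t^2}\,\epsilon + \sigma_t z,
  \qquad z \sim \mathcal{N}(0, I) \text{ independent of } \epsilon, \\
\operatorname{Cov}(x_{t-1} \mid x_0)
  &= \left(1-\bar\alpha_{t-1}-\sigma_t^2\right) I + \sigma_t^2 I
   = \left(1-\bar\alpha_{t-1}\right) I .
\end{align*}
```

Hence $x_{t-1} \mid x_0 \sim \mathcal{N}\big(\sqrt{\bar\alpha_{t-1}}\, x_0, (1-\bar\alpha_{t-1}) I\big)$ for every $\sigma_t$ with $\sigma_t^2 \le 1-\bar\alpha_{t-1}$, which is exactly the forward marginal: the noise level changes the path, not the per-step distribution.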

In RLHF-driven fine-tuning applications, the “reward gap” between samples generated by stochastic (SDE) and deterministic (ODE/DDIM) samplers is theoretically bounded and empirically converges to zero as the number of denoising steps increases. For Gaussian Variance Exploding (VE) and Variance Preserving (VP) models, analytic expressions show the gap vanishes as the number of denoising steps $N \to \infty$, supporting the common ODE-inference practice after stochastic fine-tuning (Sheng et al., 12 Oct 2025).

5. Extensions: Mixture Kernels and Principal-Axis Schemes

Recent work further generalizes gDDIM by introducing mixture-of-Gaussian reverse kernels (GMM-gDDIM) (Gabbur, 2023) and principal-axis DDIM (paDDIM) (Han, 2024):

- In GMM-gDDIM, the reverse transition is modeled as a mixture $\sum_{m=1}^{M} \pi_m\, \mathcal{N}(\mu_m, \Sigma_m)$, constrained to exactly match first and second moments of the DDPM marginals for enhanced performance in fast (few-step) settings.
- paDDIM decomposes the diffusion operator along individual principal axes of the data covariance, allowing adaptive step sizes and noise allocation along each direction.
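The moment-matching constraint on a mixture kernel can be illustrated in one dimension. The construction below is a generic sketch, not the procedure of Gabbur (2023): it takes a target mean and variance, user-chosen mixture weights and component offsets (all hypothetical inputs), centres the offsets so the mixture mean is unchanged, and assigns every component the same variance so that the total variance matches the target.

```python
def matched_gmm(mu, var, weights, offsets):
    """Return (weights, means, shared_var) of a 1-D mixture with mean mu, variance var."""
    total = sum(weights)
    w = [wi / total for wi in weights]
    shift = sum(wi * di for wi, di in zip(w, offsets))
    d = [di - shift for di in offsets]           # centred offsets: sum_k w_k d_k = 0
    spread = sum(wi * di * di for wi, di in zip(w, d))
    shared_var = var - spread                    # remaining within-component variance
    if shared_var <= 0.0:
        raise ValueError("offsets too large for the target variance")
    return w, [mu + di for di in d], shared_var

def gmm_moments(w, means, shared_var):
    """Mean and variance of a mixture whose components share one variance."""
    mean = sum(wi * mi for wi, mi in zip(w, means))
    var = shared_var + sum(wi * (mi - mean) ** 2 for wi, mi in zip(w, means))
    return mean, var

w, means, shared_var = matched_gmm(0.7, 2.0, [0.2, 0.5, 0.3], [-1.0, 0.3, 0.8])
print(gmm_moments(w, means, shared_var))  # mean and variance match the targets
```

The design choice here is that the first two moments are fixed by the target while the offsets and weights remain free, which is what leaves room for a mixture kernel to differ from a single Gaussian.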

Empirical studies demonstrate that mixture-based gDDIM achieves lower FID and higher IS at minimal computational cost increase when the number of sampling steps is small, while principal-axis scheduling can further accelerate convergence and fine-tune the fidelity-diversity balance when the data distribution is low-rank (Gabbur, 2023, Han, 2024).

6. Empirical Results and Practical Considerations

Validated on non-isotropic models such as Blurring Diffusion Models (BDM) and Critically Damped Langevin Diffusion (CLD) on CIFAR-10, deterministic gDDIM achieves an FID of 2.49 with only 50 steps (ca. 20× speedup) for BDM, and FID of 2.26/2.86 with 50/27 steps (40–80× reduction in score function evaluations) for CLD, matching or surpassing high-step-count stochastic and ODE integrators. Network architectures generally follow adaptive UNet backbones with standard normalization and ResBlocks; all key ODE coefficients are precomputed for efficient GPU implementation (Zhang et al., 2022). The overall wall-clock time increases moderately (linearly in mixture size for GMM-gDDIM), but the sample quality improvements are substantial for small step regimes (Gabbur, 2023).

7. Applications, Limitations, and Future Directions

gDDIM provides a uniform framework for fast, high-quality sampling in general diffusion models, admitting user-controlled tuning of sample diversity and determinism, and preserving statistical consistency in both reward-driven (RLHF) and classic generative modeling domains (Zhang et al., 2022, Sheng et al., 12 Oct 2025). Limitations arise in high-dimensional scenarios requiring expensive matrix factorization or regression in the mixture or principal-axis extensions, and in integrating auxiliary guidance signals (e.g., classifier-free guidance) without incurring perceptible computational overhead (Shah et al., 2024). Promising future directions include hybrid stochastic-deterministic schedulers, distilled implicit guidance networks, and adaptive variance allocation along principal data modes (paDDIM) (Han, 2024). The framework remains directly extensible to non-equilibrium settings and admits principled application to new physically-motivated or structured diffusion models.

References

"gDDIM: Generalized denoising diffusion implicit models" (Zhang et al., 2022)
"DDIM Redux: Mathematical Foundation and Some Extension" (Han, 2024)
"Understanding Sampler Stochasticity in Training Diffusion Models for RLHF" (Sheng et al., 12 Oct 2025)
"Improved DDIM Sampling with Moment Matching Gaussian Mixtures" (Gabbur, 2023)
"Enhancing Diffusion Models for High-Quality Image Generation" (Shah et al., 2024)