
Energy-Tweedie: Score meets Score, Energy meets Energy

Published 29 Dec 2025 in stat.ML and cs.LG (arXiv:2512.23818v1)

Abstract: Denoising and score estimation have long been known to be linked via the classical Tweedie's formula. In this work, we first extend the latter to a wider range of distributions often called "energy models" and denoted elliptical distributions in this work. Next, we examine an alternative view: we consider the denoising posterior $P(X|Y)$ as the optimizer of the energy score (a scoring rule) and derive a fundamental identity that connects the (path-) derivative of a (possibly) non-Euclidean energy score to the score of the noisy marginal. This identity can be seen as an analog of Tweedie's identity for the energy score, and allows for several interesting applications; for example, score estimation, noise distribution parameter estimation, as well as using energy score models in the context of "traditional" diffusion model samplers with a wider array of noising distributions.

Summary

  • The paper extends the classical Tweedie formula to elliptical noise, linking Stein scores with the path-derivative of energy scores.
  • It introduces an algebraic framework that unifies denoising, score estimation, and robust generative modeling under varying noise geometries.
  • Empirical validation on multimodal benchmarks confirms improved calibration, noise-adaptivity, and performance over traditional Gaussian models.

Energy-Tweedie: Extending Tweedie's Identity to Elliptical Noise and Energy Score Connections

Introduction

This paper introduces a generalization of the classical Tweedie’s formula, connecting score estimation and denoising in the presence of a broad class of noise distributions—specifically, elliptical distributions, also known as energy models. The main contribution is a set of identities that link the Stein score of the noisy marginal not only with the conditional expectation (as in the standard Tweedie formula) but with the path-derivative of energy-scoring rules, offering a general and algebraically grounded framework. The identities support strict properness requirements and parameterization via the Mahalanobis distance, extending existing theoretical apparatus and providing tools for estimation, calibration, and generative modeling under heavy-tailed or anisotropic noise.

Mathematical Background and Generalized Tweedie Formula

The core denoising context involves estimating a clean variable $X$ from noisy observations $Y = X + \epsilon$, with $\epsilon \sim q$ for some location-equivariant noise family $q$. Traditionally, for Gaussian noise, Tweedie's formula links the posterior mean to the gradient of the log marginal likelihood, $\mathbb{E}[X \mid Y=y] = y + \sigma^2 \nabla_y \log m(y)$, and methods such as score matching and denoising autoencoders exploit this result.
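As a concrete check of the classical Gaussian case, the sketch below compares Tweedie's formula $\mathbb{E}[X \mid Y=y] = y + \sigma^2 \nabla_y \log m(y)$ against the closed-form posterior mean for a toy two-point prior $X \in \{-1, +1\}$, with the marginal score taken by finite differences. The prior and all constants are illustrative choices, not taken from the paper.

```python
import math

sigma = 0.7  # noise scale (illustrative)

def log_marginal(y):
    # m(y) = 0.5 N(y; +1, sigma^2) + 0.5 N(y; -1, sigma^2) for the
    # two-point prior X in {-1, +1} under additive Gaussian noise.
    def phi(z):
        return math.exp(-z * z / (2.0 * sigma**2)) / math.sqrt(2.0 * math.pi * sigma**2)
    return math.log(0.5 * phi(y - 1.0) + 0.5 * phi(y + 1.0))

y, h = 0.3, 1e-5
score = (log_marginal(y + h) - log_marginal(y - h)) / (2.0 * h)  # finite-difference score

tweedie_mean = y + sigma**2 * score        # Tweedie: E[X | Y = y]
posterior_mean = math.tanh(y / sigma**2)   # closed form for this particular prior
```

The two quantities agree to numerical precision, which is exactly the identity the paper generalizes beyond the Gaussian case.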

The paper extends this machinery by:

  1. Generalizing Tweedie’s identity to elliptical distributions (i.e., energy-based models parameterized by centrally symmetric potentials over the Mahalanobis norm).
  2. Demonstrating that, for elliptical $q$, the Stein score $s_m(y)$ of the noisy marginal $m(y) = (p * q)(y)$ equals the negative path-derivative, with respect to $y$, of the posterior expectation of the noise potential:

$s_m(y) = -\nabla^{PD}_y\,\mathbb{E}_{X \mid Y=y}\big[V(\|y-X\|_{\Sigma^{-1}})\big]$

This result recovers, as special cases, the Gaussian mean-seeking regime ($\beta=2$) and the geometric-median-seeking regime for Laplace-like noise ($\beta=1$).

Additionally, for generalized Gaussian noise (parameterized by shape $\beta$ and scale $\lambda$), the score formula incorporates a nonlinearity in the residual, interpolating between mean and median behavior and supporting heavy-tailed or robust denoising objectives.
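To make the interpolation concrete, here is a small illustration (not from the paper) of how a generalized-Gaussian potential $V(r) = \lambda |r|^{\beta}$ weights residuals: the per-sample pull $\lambda \beta |r|^{\beta-1}\,\mathrm{sign}(r)$ is linear in the residual at $\beta=2$ (mean-seeking) and has constant magnitude at $\beta=1$ (median-seeking).

```python
def residual_pull(r, beta, lam=1.0):
    """Derivative of V(r) = lam * |r|**beta with respect to the residual r."""
    return lam * beta * abs(r) ** (beta - 1.0) * (1.0 if r >= 0 else -1.0)

residuals = [0.1, 0.5, 1.0, 2.0]

linear = [residual_pull(r, beta=2.0) for r in residuals]  # 2*lam*r: grows with r
robust = [residual_pull(r, beta=1.0) for r in residuals]  # constant magnitude lam
middle = [residual_pull(r, beta=1.5) for r in residuals]  # interpolates between the two
```

Large residuals (outliers) thus dominate the $\beta=2$ objective but contribute no more than any other sample at $\beta=1$, which is the source of the robustness.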

Energy Score Identity and Its Consequences

A primary result is the identification of a new energy-score-based identity, which holds for generalized Gaussian (and more generally, for elliptical) noise:

$s_m(y) = -\frac{\lambda}{\beta}\,\nabla^{PD}_y\,\mathrm{ES}_{\Sigma^{-1}, \beta}\big(P(X \mid Y=y),\, y\big)$

where $\mathrm{ES}_{\Sigma^{-1},\beta}$ is an energy score with Mahalanobis geometry and shape parameter $\beta$. This extends the classical connection between the marginal score and reconstruction-loss gradients beyond MSE to a much broader set of strictly proper scoring rules. For the Gaussian case, this recovers the original Tweedie formula, as the energy score reduces to the MSE.
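A minimal sketch of this score-estimation route in the Gaussian ($\beta=2$) special case, using a hypothetical two-point prior whose posterior is available in closed form so the Monte Carlo estimate can be checked against the analytic marginal score; all constants are illustrative, not the paper's setup.

```python
import math
import random

random.seed(0)
sigma, y = 1.0, 0.5

# Two-point prior X in {-1, +1}; the denoising posterior P(X = +1 | Y = y)
# is a logistic function of y under Gaussian noise.
p_plus = 1.0 / (1.0 + math.exp(-2.0 * y / sigma**2))

# Analytic marginal (Stein) score for this prior, for comparison.
true_score = (math.tanh(y / sigma**2) - y) / sigma**2

# Monte Carlo path-derivative: sample the posterior, average
# grad_y V(y - X) with V(r) = r^2 / (2 sigma^2), and negate.
n = 200_000
acc = 0.0
for _ in range(n):
    x = 1.0 if random.random() < p_plus else -1.0
    acc += (y - x) / sigma**2
mc_score = -acc / n
```

In practice the posterior samples would come from a trained denoising model rather than a closed-form posterior; only samples are needed, not densities, which is the point of the identity.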

This relation has strong implications:

  • Score Estimation: Provides a consistent method for estimating the Stein score via energy-score gradients evaluated on samples from any calibrated or strictly proper-trained denoising posterior model, obviating the need for closed-form densities.
  • Parameter Estimation and Calibration: The algebraic identity allows for the estimation of noise parameters $(\beta, \lambda, \Sigma)$ by minimizing calibration error between independently estimated scores.
  • Noise-Adaptivity: The identity holds for arbitrary choices of noise parameters, enabling denoiser adaptation and flexible calibration for changing noise distributions.
  • Geometry: The framework illuminates the relationship between the noise distribution's geometry (induced by $\Sigma$) and the energy landscape that guides the denoising vector fields.

Figure 1: Score fields at a fixed early denoising stage ($\sigma=0.8$) for Gaussian (mean-seeking) and generalized Gaussian (median-seeking) noise, showcasing the influence of the residual weighting.

Denoising and Generative Modeling Applications

The established identities unify and generalize denoising functionals. The optimal denoisers under the MSE (Gaussian) and MAE-like (Laplace/generalized Gaussian) regimes are shown to be conservative vector fields generated by gradients of potential functions that depend on the energy score. These fields are self-adjoint in the appropriate geometry (e.g., Mahalanobis).

Critically, the results enable the development of energy-score diffusion models:

  • Training: Models can be trained via matched (possibly heavy-tailed, anisotropic) energy score objectives at each time step, accommodating varying noise distributions.
  • Generation: The sampling procedure leverages Monte Carlo gradient approximations of the energy score to compute the Stein score for each step, which is then used in standard SDE/ODE-based generative pipelines, analogous to denoising score matching.
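The sampling step can be illustrated with a minimal annealed Langevin loop (a hedged sketch, not the paper's algorithm): here the Stein score is available in closed form for a toy two-point prior under Gaussian smoothing, standing in for the Monte Carlo energy-score gradient a trained model would supply.

```python
import math
import random

random.seed(1)

def smoothed_score(y, sigma):
    # Analytic Stein score of a two-point prior {-1, +1} convolved
    # with Gaussian noise N(0, sigma^2) -- a stand-in for a learned estimator.
    return (math.tanh(y / sigma**2) - y) / sigma**2

sigmas = [3.0, 2.0, 1.4, 1.0, 0.7, 0.5, 0.3, 0.2, 0.1]  # annealing schedule
n_steps, eps0 = 100, 0.1
particles = [random.gauss(0.0, sigmas[0]) for _ in range(500)]

for sigma in sigmas:
    eps = eps0 * sigma**2  # step size scaled to the current noise level
    for _ in range(n_steps):
        particles = [
            y + eps * smoothed_score(y, sigma)
            + math.sqrt(2.0 * eps) * random.gauss(0.0, 1.0)
            for y in particles
        ]

# After annealing, particles concentrate near the data modes at +/- 1.
mean_abs = sum(abs(y) for y in particles) / len(particles)
```

Swapping the analytic `smoothed_score` for a Monte Carlo energy-score gradient is what lets non-Gaussian noise schedules plug into this standard pipeline.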

As a result, diffusion models can now formally support non-Gaussian or robust noise schedules (with explicit score computation), and posterior samplers such as Engression or Distributional Principal Autoencoders can supply the denoising conditional, even when trained using energy or distributional losses.

Figure 2: Diffusion progress for both the Gaussian (mean-seeking, left) and generalized Gaussian (median-seeking, right) noise models, tracking sample convergence from highly noised states back to the target distribution.

Figure 3: Mean energy distance to the clean data through denoising progress, comparing Gaussian and generalized Gaussian models, showing distinctive convergence characteristics reflective of each noise model’s target statistic (mean/median).

Experimental Validation

Empirical evaluation is conducted on the Eight Gaussians dataset, a canonical multimodal benchmark. The energy-score identity is empirically validated by comparing analytic and Monte Carlo-based scores across noise levels for both Gaussian and generalized Gaussian noise. The experimental setup includes:

  • Trained conditional models for each noise parameterization.
  • Annealed Langevin sampling using (estimated) energy score gradients.
  • Quantitative evaluation (MSE, cosine similarity) of the score field estimators, showing high directional accuracy across varying noise levels.
  • Visualization of denoising vector fields illustrating the geometrical and statistical differences between the mean- and median-seeking regimes.

Figure 4: Comparison of estimated Stein score fields and ground truth for both noise models, highlighting the accuracy of MC-based estimation via energy-score differentiation.
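The quantitative comparison above can be reproduced with simple field-level metrics; the sketch below (illustrative, not the paper's evaluation code) computes mean cosine similarity and MSE between a hypothetical estimated and reference 2-D score field sampled at matching grid points.

```python
import math

def field_metrics(est, ref):
    """Mean cosine similarity and MSE between two 2-D vector fields,
    given as lists of (vx, vy) pairs at matching grid points."""
    cos_sum, mse_sum = 0.0, 0.0
    for (ex, ey), (rx, ry) in zip(est, ref):
        cos_sum += (ex * rx + ey * ry) / (math.hypot(ex, ey) * math.hypot(rx, ry))
        mse_sum += (ex - rx) ** 2 + (ey - ry) ** 2
    return cos_sum / len(ref), mse_sum / len(ref)

# Hypothetical estimated vs. reference score vectors at three grid points.
ref = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
est = [(0.9, 0.1), (0.1, 1.1), (-1.0, -0.1)]
cos, mse = field_metrics(est, ref)
```

Cosine similarity isolates directional accuracy of the score field, which matters most for sampler dynamics, while MSE also penalizes magnitude errors.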

Relation to Prior Work and Theoretical Outlook

This work unifies and extends classical Tweedie/Fisher identities from exponential-family/score-matching-based denoising and generative modeling to a more general class of noise, geometries, and scoring losses. New connections are established with modern developments in generative models (energy-based, distributional autoencoders), robust regression, and adaptive denoising. The derived identities go beyond previous approaches by enabling both analytic and sample-based score computations for arbitrary strictly proper scoring rules associated with the noise model, and by supporting parameter inference and calibration.

Potential future extensions include:

  • Further development of heavy-tailed or robust generative models using energy score diffusion under arbitrary elliptical noise.
  • Application to inverse problems, uncertainty quantification, or test-time adaptation under unknown or shifting noise.
  • Exploration of connections to learned Riemannian metrics and geometric flows for data manifolds.

Conclusion

The paper delivers a significant analytical generalization of Tweedie’s formula, connecting denoising, Stein scores, and energy scoring rules under elliptical noise. The results provide tools for theory and practice, covering estimation, generative modeling, and calibration in challenging noise regimes—enabling, for the first time, principled score computation and diffusion modeling for non-Gaussian, robust, and geometrically structured noise.

These theoretical advancements are substantiated by experiments demonstrating both the accuracy and practical advantages of the proposed MC-based score estimation and noise-adaptive generative modeling—the latter showing distinctive behaviors for mean- versus median-seeking denoising and generative processes. The unified perspective developed here stands to inform future developments in robust, adaptive, and geometry-aware generative modeling.
