Joint Compression–Denoising Models
- Joint compression–denoising models are frameworks that merge noise suppression and bit-rate minimization by optimizing a composite rate–distortion objective.
- They employ architectures such as autoencoders, transformers, and vector quantization to allocate coding resources to signal content while discarding noise.
- Recent advances demonstrate improved rate–distortion performance, computational efficiency, and robustness across domains like image, signal, and prompt coding.
Joint compression–denoising models fuse noise suppression and bit-rate minimization into unified optimization frameworks, addressing the inherent inefficiency and error propagation of traditional sequential pipelines for image, signal, and prompt coding. These methods exploit the fact that noise is intrinsically incompressible, learning representations that allocate rate preferentially to signal content while discarding or attenuating noise; in specific contexts (e.g., raw sensor data, multi-layer representations, LLM prompts) they achieve superior rate–distortion tradeoffs, computational efficiency, and robustness.
1. Conceptual Principles and Motivation
Joint compression–denoising paradigms are predicated on the observation that noise, while present in real-world signals and images, is both undesirable and intrinsically hard to encode efficiently. Sequential systems that denoise first and then compress, or vice versa, waste bits on representing noise and suffer information loss between stages (Brummer et al., 15 Jan 2025, Brummer et al., 2023, Zhang et al., 2024). These models instead minimize a composite rate–distortion objective, typically of the form

$$
\mathcal{L} = R(\hat{y}) + \lambda \, D(x, \hat{x}),
$$

where $y$ is a noisy observation, $x$ the clean ground truth, $\hat{y}$ the quantized latent code, $\hat{x}$ the reconstruction, $D(\cdot,\cdot)$ a distortion metric (e.g., multi-scale SSIM, MSE), $R(\hat{y})$ the coded rate under a learned prior, and $\lambda$ balances perceptual fidelity versus bit-rate (Brummer et al., 15 Jan 2025, Xie et al., 2024, Cheng et al., 2022).
Critical insights include:
- Rate–distortion learning: By integrating denoising into the representation space subject to rate constraints, models "spend" entropy only on signal, implicitly denoising as part of compression (Cai et al., 2024).
- Cross-domain generalization: Domain adaptation, e.g., sensor calibration and color-space normalization, enables joint models to generalize to unseen sensor types (Brummer et al., 15 Jan 2025, Brummer et al., 2023).
- Adaptive quantization and context modeling: Content-adaptive quantizers and hierarchical entropy models further enhance both coding efficiency and denoising (Zhang et al., 2024).
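The composite objective above can be sketched in a few lines; here the discrete latent prior `latent_probs`, the λ value, and the MSE distortion are illustrative stand-ins for the learned entropy models and perceptual metrics in the cited works:

```python
import numpy as np

def joint_rd_loss(latent_probs, clean, recon, lam):
    """Composite rate-distortion objective L = R + lambda * D.

    latent_probs: prior probability assigned to each quantized latent symbol
    clean, recon: ground-truth clean image and decoder output
    lam:          trade-off between rate (bits) and distortion
    """
    rate_bits = -np.sum(np.log2(latent_probs))  # entropy-coded rate estimate
    distortion = np.mean((clean - recon) ** 2)  # MSE distortion proxy
    return rate_bits + lam * distortion

# Toy usage: a 4-symbol latent under a uniform prior, perfect reconstruction.
probs = np.full(4, 0.25)
clean = np.zeros((2, 2))
recon = np.zeros((2, 2))
loss = joint_rd_loss(probs, clean, recon, lam=100.0)
# rate = -4 * log2(0.25) = 8 bits, distortion = 0, so loss = 8
```

Because the rate term charges bits only for latent symbols the prior finds surprising, representations that encode noise pay a rate penalty, which is exactly the mechanism the bullet points above describe.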
2. Representative Architectures and Methodologies
The field encompasses multiple classes of joint models, differentiated by operational domain (raw sensor, RGB, feature-space), architectural primitives (autoencoders, transformers, operational neural networks, vector quantization), and optimization mechanics.
Table: Key Joint Compression–Denoising Model Classes
| Model Type | Key Mechanism | Domain |
|---|---|---|
| Raw-Domain JDDC (Brummer et al., 15 Jan 2025) | AE+hyperprior, PixelShuffle | Bayer mosaic, camRGB |
| Contrastive Self-ONN (Xie et al., 2024) | Contrastive, Self-ONNs | RGB, feature-space |
| FLLIC (Zhang et al., 2024) | Denoising AE, CA quant | RGB, noise-aware |
| Residual Quantization (Ferdowsi et al., 2017) | VQ cascade, regularization | global vector space |
| SNR-aware transformer (Cai et al., 2024) | Local/non-local fusion | RGB, SNR-mapped |
| Latent scalability (Alvar et al., 2022) | Latent splitting, AE | RGB, base/enhancement |
| Prompt denoising (You et al., 2024) | SLM + DRL, multi-stage | LLM prompt text |
Autoencoder+hyperprior frameworks (e.g., JDDC (Brummer et al., 15 Jan 2025), FLLIC (Zhang et al., 2024), JDC (Brummer et al., 2023), SNR-aware (Cai et al., 2024)) leverage learned probability models in latent space to drive both denoising and entropy coding. Feature-level denoising modules (Self-ONNs, CBAM, plug-in residual blocks) and contrastive losses that differentiate signal from noise are common (Xie et al., 2024, Cheng et al., 2022).
Vector quantization cascades (RRQ (Ferdowsi et al., 2017)) exploit codebooks with reverse-water-filling regularization trained on clean data, effecting denoising by projecting noisy inputs toward clean-image manifolds during reconstruction.
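A toy sketch of such a cascade (the fixed codebooks below are illustrative; RRQ learns them with reverse-water-filling regularization): each stage quantizes the residual left by the previous one, so reconstructions snap toward the codebook manifold rather than reproducing noise:

```python
import numpy as np

def rq_encode(x, codebooks):
    """Quantize x with a cascade of codebooks; each stage codes the residual."""
    residual = x.astype(float)
    indices, recon = [], np.zeros_like(residual)
    for cb in codebooks:                              # cb: (num_codewords, dim)
        d = np.linalg.norm(residual - cb, axis=1)     # distance to each codeword
        i = int(np.argmin(d))
        indices.append(i)
        recon += cb[i]
        residual -= cb[i]                             # pass residual onward
    return indices, recon

# Two-stage toy cascade in 2-D: a coarse codebook, then a fine one.
cb1 = np.array([[0.0, 0.0], [10.0, 10.0]])
cb2 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
idx, recon = rq_encode(np.array([10.8, 10.1]), [cb1, cb2])
# Stage 1 picks [10, 10]; stage 2 quantizes the residual [0.8, 0.1] to [1, 0].
```

Small perturbations of the input that stay within a quantization cell map to the same codewords, which is the sense in which the cascade denoises as it compresses.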
Latent scalability frameworks (JICD (Alvar et al., 2022)) enable selective decoding of denoised or full noisy images by separating latent channels into base and enhancement layers.
Transformer-based approaches incorporate latent refinement modules and prompt generators to adapt decoders to denoising tasks with minimal additional computational overhead (Chen et al., 2024). SNR-aware methods allocate representational and denoising capacity regionally, guided by local SNR maps (Cai et al., 2024).
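As a rough illustration of SNR-guided allocation, a per-pixel SNR map can be estimated from local statistics; the windowed mean/residual estimator below is an assumption for illustration, not the exact formulation of Cai et al.:

```python
import numpy as np

def local_snr_map(noisy, win=3, eps=1e-6):
    """Estimate per-pixel SNR from local statistics of a 2-D image.

    A crude denoised proxy (the local mean) serves as the signal estimate;
    the pixel's deviation from it serves as the noise estimate.
    """
    h, w = noisy.shape
    pad = win // 2
    padded = np.pad(noisy, pad, mode="reflect")
    snr = np.empty_like(noisy, dtype=float)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + win, j:j + win]
            signal = patch.mean()
            noise = np.abs(noisy[i, j] - signal)
            snr[i, j] = np.abs(signal) / (noise + eps)
    return snr

# A flat region with one noisy outlier pixel.
img = np.full((5, 5), 10.0)
img[2, 2] = 20.0                  # spike treated as noise
snr = local_snr_map(img)
# The spiky pixel scores much lower SNR than its flat neighbours.
```

A decoder conditioned on such a map can devote non-local (context-heavy) processing to low-SNR regions and cheaper local processing elsewhere, which is the regional-allocation idea described above.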
3. Training Regimens and Dataset Strategies
Effective joint models require training on both noisy/clean pairs and clean-only examples, sometimes stratified by noise level. Techniques include pair alignment by L1 minimization, patch sampling, and diversification via sensor/color-matrix augmentation (Brummer et al., 15 Jan 2025, Brummer et al., 2023). Feature guidance branches, contrastive losses against clean features, and data balancing are used to stabilize training and maximize generalization.
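The L1-minimization alignment step for noisy/clean pairs can be sketched as a brute-force search over small integer shifts (the search window and toy data here are illustrative):

```python
import numpy as np

def align_by_l1(noisy, clean, max_shift=2):
    """Return the (dy, dx) integer shift of `noisy` minimizing mean L1 error
    against `clean`, searched over a small window of candidate shifts."""
    best, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(noisy, dy, axis=0), dx, axis=1)
            err = np.mean(np.abs(shifted - clean))
            if err < best:
                best, best_shift = err, (dy, dx)
    return best_shift

# Toy check: a clean ramp, and a copy rolled by one column as the "noisy" capture.
clean = np.arange(64, dtype=float).reshape(8, 8)
noisy = np.roll(clean, 1, axis=1)       # misaligned by one column
shift = align_by_l1(noisy, clean)       # should recover (0, -1)
```

Real pipelines would use subpixel registration and crop wrap-around borders; the point here is only that L1 (rather than L2) alignment is robust to the heavy-tailed residuals of real sensor noise.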
- RawNIND (Brummer et al., 15 Jan 2025): 310 scenes, multiple sensor types, including withheld "unknown-sensor" test. Patch augmentation and conversion to Rec. 2020 enable cross-sensor generalization (≤0.01 MS-SSIM drop).
- NIND (Brummer et al., 2023): Noisy/clean ISO pairs, stratified into mild/moderate/strong noise; supplemental clean-only images to avoid over-denoising artifacts.
- Flicker2W, SIDD, CLIC: Used for high-volume training/testing under synthetic or real noise, with fine-tuning and ablation on guidance/SNR branches (Xie et al., 2024, Cai et al., 2024, Cheng et al., 2022, Zhang et al., 2024).
4. Quantitative Performance and Computational Efficiency
Joint models consistently yield superior rate–distortion curves compared to both standard codecs and sequential denoise-then-compress pipelines across synthetic and real noise regimes.
- Raw-domain JDDC (Brummer et al., 15 Jan 2025): Bayer JDDC nets outperform linear RGB JDC by 0.02–0.05 MS-SSIM at the same bpp and achieve 3.5× total complexity reduction over sequential pipelines. On unknown sensors, MS-SSIM remains high (~0.876).
- Contrastive Self-ONN (Xie et al., 2024): BD-rate savings reach −23.8% (Kodak, PSNR level 4), denoising performance holds across noise levels, with encoding time reductions of 3–6%.
- FLLIC (Zhang et al., 2024): Achieves best-in-class bpp vs. PSNR, cutting bit rate by ≈1 at matched quality and inference time by an order of magnitude over cascades.
- JDC (Brummer et al., 2023): 15–30% BD-rate savings, up to 1 order-of-magnitude lower GMac operations than cascade.
- SNR-aware (Cai et al., 2024): Network outperforms sequential, prior joint, and pure compression baselines on Kodak/CLIC/SIDD. Ablation shows both guidance and SNR fusion materially boost final PSNR, especially at high noise.
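The BD-rate figures quoted above follow the standard Bjøntegaard calculation: fit a cubic to each RD curve in (PSNR, log-rate) space and compare the integrals over the overlapping quality range. A compact sketch with synthetic RD points:

```python
import numpy as np

def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test):
    """Bjontegaard delta-rate: average % bitrate change of `test` vs `ref`
    at matched quality. Negative values mean the test codec saves bits."""
    lr_ref, lr_test = np.log(rates_ref), np.log(rates_test)
    p_ref = np.polyfit(psnr_ref, lr_ref, 3)      # log-rate as a cubic in PSNR
    p_test = np.polyfit(psnr_test, lr_test, 3)
    lo = max(min(psnr_ref), min(psnr_test))      # overlapping quality range
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_diff = (int_test - int_ref) / (hi - lo)  # mean log-rate gap
    return (np.exp(avg_diff) - 1) * 100.0

# Synthetic curves: a codec needing half the rate at every quality level
# should report a BD-rate of -50%.
psnr = np.array([30.0, 33.0, 36.0, 39.0])
r_ref = np.array([0.2, 0.4, 0.8, 1.6])
r_test = r_ref / 2
bd = bd_rate(r_ref, psnr, r_test, psnr)
```

The RD points and the PSNR anchor values are made up for the demonstration; the quoted papers compute the same metric over their measured curves.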
5. Special Algorithms, Losses, and Theoretical Insights
Several models formalize compression–denoising as a joint Lagrangian problem trading off rate, distortion, and fidelity, e.g., minimizing an objective of the form $J = R + \lambda D$ for contour string denoising/compression (Zheng et al., 2017). Total Suffix Trees for context modeling enable dynamic programming approaches that scale optimally.
In information-theoretic settings, the rate–distortion objective is matched to channel log-likelihoods, yielding denoisers whose risk equals the expected loss of independent posterior samples, often approaching Bayes-optimal risk for MSE and Hamming loss (Song et al., 16 Dec 2025).
Contrastive learning, multi-scale Self-ONNs, CBAM attention blocks, and context-adaptive quantization are recurrent features enabling disentangling of noise from critical signal components, further improving both rate and perceptual quality (Xie et al., 2024, Cheng et al., 2022, Zhang et al., 2024).
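One way such a contrastive objective can be instantiated is an InfoNCE-style loss (a sketch; the exact losses in the cited works may differ), pulling a noisy-input feature toward its clean counterpart and away from noise features:

```python
import numpy as np

def contrastive_loss(anchor, positive, negatives, temp=0.1):
    """InfoNCE-style loss: the anchor (a feature from the noisy input) should
    be closer to the positive (clean feature) than to any negative (noise)."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / temp
    logits -= logits.max()                        # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                      # positive sits at index 0

# An anchor aligned with the clean feature incurs low loss;
# one aligned with the noise direction incurs high loss.
clean = np.array([1.0, 0.0])
noise = [np.array([0.0, 1.0])]
low = contrastive_loss(np.array([0.9, 0.1]), clean, noise)
high = contrastive_loss(np.array([0.1, 0.9]), clean, noise)
```

Minimizing this loss encourages the encoder to place signal and noise in separable feature directions, which is the disentangling effect the surrounding text attributes to contrastive training.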
6. Extensions, Practical Applications, and Limitations
Joint compression–denoising models extend beyond imagery into signal processing and LLM prompt optimization (You et al., 2024). Multi-step, denoising-inspired compression improves transmission efficiency and fidelity for LLM services, leveraging reinforcement learning and domain-specific transformers.
Common limitations include the need for explicit per-noise-level training (FLLIC), potential overhead in low-noise regimes, computational/memory constraints for large models (multi-scale Self-ONNs), and challenges generalizing to non-standard distortions or imaging modalities (Zhang et al., 2024, Xie et al., 2024).
Further research targets domain adaptation, scalable architectures for multi-noise and multi-modality, lightweight deployments, and unified frameworks that capture both forensics and high-fidelity requirements.
7. Current Trends and Future Directions
The evolution of joint compression–denoising models highlights several directions:
- Raw-domain pipelines: Increasing computational and rate–distortion efficiency by operating directly on sensor-native inputs (Brummer et al., 15 Jan 2025).
- Contrastive and feature-space denoising: Enhanced separation of noise and structure, with generalization across synthetic and real noise.
- Adaptive, spatially-aware encoding: SNR-guided local/non-local fusion architectures for robust performance under highly heterogeneous noise.
- Latent scalability: Flexible transmission/decode workflows allowing selective recovery of clean or noisy signal data (Alvar et al., 2022).
- Hybrid multimodal systems: Prompt compression and transmission in LLMs with DRL-based optimization and multi-step denoising (You et al., 2024).
Ongoing research is focused on improving generalization to unseen domains, integrating with next-generation sensor hardware, and harmonizing rate–distortion and task-based metrics for application-driven coding efficiency.
The literature demonstrates that joint models for compression and denoising, when designed with principled rate–distortion objectives, hierarchical priors, and domain adaptation strategies, present a robust solution to the inefficient coding of noisy data, achieving marked gains in both bandwidth reduction and restoration quality across applications and modalities (Brummer et al., 15 Jan 2025, Xie et al., 2024, Zhang et al., 2024, Brummer et al., 2023, Ferdowsi et al., 2017, Alvar et al., 2022, Cai et al., 2024, Song et al., 16 Dec 2025, You et al., 2024, Cheng et al., 2022).