Coordinate-Conditioned Denoising Diffusion Model
- CCDDPM is a generative framework that produces high-fidelity, scene-consistent radio environment maps conditioned on transmitter coordinates.
- It fuses a coordinate-derived Gaussian prior with Gaussian noise in a two-channel U-Net to accurately emulate dynamic 6G vehicular radio conditions.
- The model outperforms traditional methods by delivering real-time REM synthesis with low uncertainty and preserved spatial details essential for C-V2X communications.
A Coordinate-Conditioned Denoising Diffusion Probabilistic Model (CCDDPM) is a generative framework designed to synthesize high-fidelity, scene-consistent radio environment maps (REMs) for 6G Cellular Vehicle-to-Everything (C-V2X) communications as a function of arbitrary transmitter vehicle coordinates. In this architecture, conditional generative diffusion processes are leveraged to predict dense two-dimensional received-power fields, enabling rapid and accurate emulation of radio propagation environments under dynamic vehicular transmitter locations, without recourse to exhaustive measurement campaigns or computationally expensive ray tracing. CCDDPM achieves this by fusing a smooth, coordinate-derived Gaussian prior with Gaussian noise inside a lightweight, two-channel conditional U-Net, which is trained end-to-end to reverse the noise process and recover credible REM samples conditioned on spatial context (Cao et al., 27 Dec 2025).
1. Motivation and Background
The 6G C-V2X paradigm requires dynamic, fine-grained REMs, denoted $x_0$, to support real-time communication reliability, handover, and scheduling as vehicles traverse urban environments. Traditional REM acquisition is labor-intensive, relying on active drive tests or computationally intensive ray-tracing models. Radio propagation is spatially non-IID, driven by continuous vehicular movement and complex urban architectures, which leaves classical generative strategies (e.g., VAEs, GANs, normalizing flows) insufficient for capturing both global patterns and transmitter-centric local effects. CCDDPM addresses this challenge by learning the conditional distribution $p_\theta(x_0 \mid c)$, where $c$ denotes the transmitter coordinate, enabling synthesis of REMs for any queried transmitter location in a given region. This approach provides statistical and structural fidelity while supporting real-time deployment on edge hardware (Cao et al., 27 Dec 2025).
2. Mathematical Foundations
The CCDDPM framework generalizes the denoising diffusion probabilistic model to a coordinate-conditioned setting:
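- Forward (Noising) Process
- Starting from a clean REM $x_0$, Gaussian noise is added over $T$ steps:
$$q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right),$$
- where $\{\beta_t\}_{t=1}^{T}$ is a predetermined noise schedule; $\alpha_t = 1-\beta_t$; cumulative $\bar{\alpha}_t = \prod_{s=1}^{t}\alpha_s$.
- The process admits closed-form sampling at any step:
$$x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I).$$
- Reverse (Denoising) Process
- A parameterized conditional model $p_\theta(x_{t-1} \mid x_t, c)$, with $c$ the transmitter coordinate, takes Gaussian form:
$$p_\theta(x_{t-1} \mid x_t, c) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t, c),\ \sigma_t^2 I\right).$$
- The mean prediction is cast in terms of the injected noise, estimated by $\epsilon_\theta(x_t, t, c)$:
$$\mu_\theta(x_t, t, c) = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(x_t, t, c)\right).$$
- Optimization Objective
- The evidence lower bound (ELBO) on $\log p_\theta(x_0 \mid c)$ is decomposed into denoising sub-losses, written here in the standard noise-prediction form:
$$L_t = \mathbb{E}_{x_0,\,\epsilon}\left[\left\lVert \epsilon - \epsilon_\theta(x_t, t, c) \right\rVert^2\right].$$
- The overall training loss averages over $t$:
$$L = \mathbb{E}_{t \sim \mathcal{U}\{1,\dots,T\}}\left[L_t\right].$$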
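The forward closed form, the noise-parameterized reverse mean, and the noise-prediction loss described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear schedule and its endpoints, $T = 1000$, the 0-based step index, the random 64×64 array standing in for a normalized REM, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule (an assumed choice) and the derived alpha terms."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative product of alpha_s up to each step
    return betas, alphas, alpha_bars

def q_sample(x0, t, alpha_bars, eps):
    """Closed-form forward sample x_t for a 0-based step index t."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

def reverse_mean(x_t, t, eps_hat, betas, alphas, alpha_bars):
    """Reverse-step mean expressed through the predicted noise eps_hat."""
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    return (x_t - coef * eps_hat) / np.sqrt(alphas[t])

def noise_mse(eps, eps_hat):
    """Noise-prediction loss for one sample (mean squared error)."""
    return float(np.mean((eps - eps_hat) ** 2))

rng = np.random.default_rng(0)
betas, alphas, alpha_bars = make_schedule()
x0 = rng.uniform(-1.0, 1.0, size=(64, 64))   # stand-in for a normalized REM
eps = rng.standard_normal(x0.shape)
t = 500
x_t = q_sample(x0, t, alpha_bars, eps)

# With the true noise known, the closed form can be inverted exactly.
x0_rec = (x_t - np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alpha_bars[t])
mu = reverse_mean(x_t, t, eps, betas, alphas, alpha_bars)
```

Inverting the closed form with the true noise recovers `x0`, which is why predicting the noise is sufficient to define the reverse mean.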
3. Coordinate Conditioning Mechanism
The transmitter coordinate, $c = (c_x, c_y)$, is represented as a 2D Gaussian heatmap over the REM grid:
$$G_c(i, j) = \exp\!\left(-\frac{(i - c_x)^2 + (j - c_y)^2}{2\sigma^2}\right),$$
with $\sigma$ controlling Gaussian spread. This spatial prior is normalized to a fixed range, then concatenated with the noisy REM $x_t$, yielding a two-channel input $[x_t, G_c]$ for the denoising network. This explicit fusion anchors the generative process to the physical transmitter position throughout all denoising stages.
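The heatmap construction and channel concatenation can be illustrated with a short NumPy sketch. Several details here are assumptions rather than facts from the source: the 256×256 grid, the spread value `sigma=8.0`, the (row, column) coordinate convention, the rescaling of the heatmap to $[-1, 1]$, and the channel-first layout. The coordinate (108, 178) is borrowed from the evaluation section purely as a sample input.

```python
import numpy as np

def coordinate_heatmap(c, shape, sigma=8.0):
    """2D Gaussian bump centered on c = (row, col); the peak value is 1."""
    rows, cols = np.indices(shape)
    d2 = (rows - c[0]) ** 2 + (cols - c[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def build_input(x_t, heatmap):
    """Rescale the heatmap to [-1, 1] (assumed range) and stack it with x_t.

    Returns a channel-first array of shape (2, H, W).
    """
    g = 2.0 * heatmap - 1.0
    return np.stack([x_t, g], axis=0)

rng = np.random.default_rng(0)
grid_shape = (256, 256)                      # assumed REM grid size
heat = coordinate_heatmap((108, 178), grid_shape)
x_t = rng.standard_normal(grid_shape)        # stand-in for a noisy REM
net_in = build_input(x_t, heat)
peak = np.unravel_index(np.argmax(heat), heat.shape)
```

Because the heatmap is rebuilt from the coordinate alone, the same conditioning channel can be reused unchanged at every denoising step.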
4. Model Architecture
The denoiser adopts a lightweight two-channel U-Net architecture with temporal and conditional modulation:
- Inputs: Two-channel tensor $[x_t, G_c]$ (noisy REM and coordinate heatmap), timestep $t$
- Time Embedding: A sinusoidal encoding of $t$ is mapped via an MLP to an embedding $e_t$; at each block, $e_t$ modulates the feature map $h$ in FiLM (Feature-wise Linear Modulation) format, with per-channel scale $\gamma$ and shift $\delta$:
$$h \leftarrow \gamma(e_t) \odot h + \delta(e_t)$$
- Encoder: Repeated Conv–GroupNorm–SiLU blocks with FiLM-modulated residuals and downsampling between stages; channels follow a stage-wise progression; one multi-head self-attention block at a single resolution.
- Bottleneck: Two residual blocks interleaved with self-attention.
- Decoder: Upsampling by 2, concatenating encoder features via skip connections, followed by FiLM-modulated residuals, mirroring the encoder to reach the single-channel output.
- Output: A final convolution predicts the noise $\epsilon_\theta(x_t, t, c)$.
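The time-embedding and FiLM steps listed above can be sketched as follows. This is an illustrative NumPy version, not the paper's network: the embedding width of 128, the `max_period` constant, the single linear map standing in for the MLP, and the plain scale-and-shift form are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def sinusoidal_embedding(t, dim=128, max_period=10000.0):
    """Sinusoidal encoding of a scalar timestep: [sin(t*f_k), cos(t*f_k)]."""
    half = dim // 2
    freqs = np.exp(-np.log(max_period) * np.arange(half) / half)
    args = t * freqs
    return np.concatenate([np.sin(args), np.cos(args)])

def film(h, e_t, W, b):
    """Feature-wise linear modulation of a (C, H, W) feature map.

    A linear layer (W: (2C, D), b: (2C,)) maps the embedding e_t: (D,)
    to one scale and one shift per channel.
    """
    C = h.shape[0]
    scale_shift = W @ e_t + b
    gamma, delta = scale_shift[:C], scale_shift[C:]
    return gamma[:, None, None] * h + delta[:, None, None]

rng = np.random.default_rng(0)
e_t = sinusoidal_embedding(10)
h = rng.standard_normal((32, 16, 16))

# Zero weights with bias [1,...,1, 0,...,0] give gamma = 1, delta = 0,
# so the block starts out as an identity mapping.
W_id = np.zeros((64, 128))
b_id = np.concatenate([np.ones(32), np.zeros(32)])
h_same = film(h, e_t, W_id, b_id)
e_0 = sinusoidal_embedding(0)
```

Initializing the modulation near identity, as in the usage lines, is one common design choice; whether the original model does so is not stated in the source.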
5. Training and Inference Procedures
The CCDDPM training and sampling pipelines proceed as follows:
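| Stage | Procedure | Output |
|---|---|---|
| Training | 1. Sample REM mini-batch $x_0$, coordinates $c$ | $x_0$, $c$ |
| | 2. Build and normalize heatmap $G_c$ | $G_c$ |
| | 3. Sample $t \sim \mathcal{U}\{1,\dots,T\}$, $\epsilon \sim \mathcal{N}(0, I)$ | $t$, $\epsilon$ |
| | 4. Form $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$ | $x_t$ |
| | 5. Predict $\hat{\epsilon} = \epsilon_\theta([x_t, G_c], t)$ | $\hat{\epsilon}$ |
| | 6. Compute loss $\lVert \epsilon - \hat{\epsilon} \rVert^2$, update $\theta$ | model update |
| Inference | 1. Given $c$, build $G_c$ | $G_c$ |
| | 2. Initialize $x_T \sim \mathcal{N}(0, I)$ | $x_T$ |
| | 3. For $t = T, \dots, 1$: | |
| | (a) $\hat{\epsilon} = \epsilon_\theta([x_t, G_c], t)$ | |
| | (b) $\mu_\theta = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,\hat{\epsilon}\right)$ | $\mu_\theta$ |
| | (c) $z \sim \mathcal{N}(0, I)$ if $t > 1$, else $z = 0$; $\sigma_t$ from the schedule; $x_{t-1} = \mu_\theta + \sigma_t z$ | $x_{t-1}$ |
| | 4. Return $x_0$, remapped to original signal-strength scale | Synthesized REM |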
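The training and inference steps in the table can be sketched in NumPy with a stand-in denoiser. This is a hedged illustration only: `dummy_eps_model` is a hypothetical placeholder for the trained U-Net, no gradient update is performed, the 50-step linear schedule and 32×32 grid are arbitrary, indices are 0-based, and the choice $\sigma_t^2 = \beta_t$ is one common option that the source does not confirm.

```python
import numpy as np

def training_loss(eps_model, x0, g, alpha_bars, rng):
    """One Monte Carlo estimate of the noise-prediction loss for a single REM."""
    t = int(rng.integers(0, len(alpha_bars)))          # uniform timestep
    eps = rng.standard_normal(x0.shape)                # injected noise
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    eps_hat = eps_model(np.stack([x_t, g]), t)         # two-channel input
    return float(np.mean((eps - eps_hat) ** 2))

def ancestral_sample(eps_model, g, betas, rng):
    """Step-by-step reverse sampling, conditioned on the heatmap g."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(g.shape)                   # start from pure noise
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = eps_model(np.stack([x, g]), t)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])
        # No noise is added on the final step.
        z = rng.standard_normal(x.shape) if t > 0 else np.zeros_like(x)
        x = mean + np.sqrt(betas[t]) * z               # assumes sigma_t^2 = beta_t
    return x

def dummy_eps_model(x_in, t):
    """Hypothetical stand-in for the U-Net: always predicts zero noise."""
    return np.zeros_like(x_in[0])

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 2e-2, 50)
alpha_bars = np.cumprod(1.0 - betas)
g = np.zeros((32, 32))                                 # placeholder heatmap channel
x0 = rng.uniform(-1.0, 1.0, size=(32, 32))
loss = training_loss(dummy_eps_model, x0, g, alpha_bars, rng)
sample = ancestral_sample(dummy_eps_model, g, betas, rng)
```

In a real pipeline the loss would be backpropagated through the network and the returned sample would be mapped back to the original signal-strength scale.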
This process supports few-step samplers (e.g., DDIM or DPM-Solver with a reduced number of sampling steps) for efficient, real-time REM synthesis on edge GPUs (Cao et al., 27 Dec 2025).
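A few-step sampler of the kind mentioned above can be sketched with a deterministic DDIM update (the $\eta = 0$ case). This is a generic illustration rather than the configuration used in the paper: the evenly spaced step selection, the 10-step count, the linear schedule, and the oracle denoiser (which knows the target map and exists only to check the update rule) are all assumptions.

```python
import numpy as np

def ddim_sample(eps_model, g, alpha_bars, num_steps, rng):
    """Deterministic DDIM sampling over an evenly spaced subset of timesteps."""
    T = len(alpha_bars)
    steps = np.linspace(T - 1, 0, num_steps).round().astype(int)
    x = rng.standard_normal(g.shape)
    for i, t in enumerate(steps):
        ab_t = alpha_bars[t]
        # After the last selected step, jump to the clean sample (alpha_bar = 1).
        ab_prev = alpha_bars[steps[i + 1]] if i + 1 < num_steps else 1.0
        eps_hat = eps_model(np.stack([x, g]), int(t))
        x0_hat = (x - np.sqrt(1.0 - ab_t) * eps_hat) / np.sqrt(ab_t)
        x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat
    return x

def make_oracle(x0_true, alpha_bars):
    """Hypothetical perfect denoiser that knows the target map (testing only)."""
    def eps_model(x_in, t):
        return (x_in[0] - np.sqrt(alpha_bars[t]) * x0_true) / np.sqrt(1.0 - alpha_bars[t])
    return eps_model

rng = np.random.default_rng(0)
alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 2e-2, 1000))
target = rng.uniform(-1.0, 1.0, size=(32, 32))
g = np.zeros_like(target)                      # placeholder heatmap channel
oracle = make_oracle(target, alpha_bars)
out = ddim_sample(oracle, g, alpha_bars, num_steps=10, rng=rng)
```

With a perfect noise estimate the 10-step trajectory lands on the target map, so any loss of quality at low step counts comes from the learned denoiser's error rather than from the update rule itself.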
6. Experimental Evaluation
Evaluation was conducted on 900 training and 100 test REMs (6 m resolution):
- Distributional Fidelity: At coordinate (108,178), the CDF of signal intensity from 100 generated REMs matches the empirical CDF closely, surpassing normalizing flows, GAN, and VAE.
- Sampling Variance: The standard-deviation envelope over 100 samples is smallest for CCDDPM, indicating stable generation.
- Line-slice RMSE: Along the two evaluated line slices, CCDDPM attains the lowest mean RMSE and the narrowest error bars, especially at building edges and rapid spatial transitions.
- Qualitative Structure: REMs synthesized via CCDDPM preserve hotspots, shadowing, and structural features (e.g., building outlines); alternatives over-smooth, introduce artifacts, or drift in contrast.
- Runtime: Benefiting from lightweight U-Net and compatibility with fast samplers, CCDDPM enables real-time REM generation on edge GPUs.
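The three quantitative checks listed above (empirical CDF, sampling spread, and line-slice RMSE) can be computed along the following lines. The sketch uses synthetic data and hypothetical function names; the exact protocol in the source is not specified, so treating a line slice as one grid row, pooling values for the CDF, and the 64×64 grid are assumptions.

```python
import numpy as np

def empirical_cdf(values):
    """Sorted values and their cumulative probabilities."""
    x = np.sort(np.ravel(values))
    p = np.arange(1, x.size + 1) / x.size
    return x, p

def pixelwise_std(samples):
    """Per-pixel standard deviation over a stack of samples (N, H, W)."""
    return np.std(samples, axis=0)

def line_slice_rmse(samples, reference, row):
    """Mean and spread of per-sample RMSE along one grid row."""
    err = samples[:, row, :] - reference[row, :]
    per_sample = np.sqrt(np.mean(err ** 2, axis=1))
    return float(per_sample.mean()), float(per_sample.std())

rng = np.random.default_rng(0)
reference = rng.uniform(-1.0, 1.0, size=(64, 64))       # synthetic ground truth
samples = reference[None] + 0.1 * rng.standard_normal((100, 64, 64))

x, p = empirical_cdf(samples[:, 10, 20])                # values at one pixel
spread = pixelwise_std(samples)
mean_rmse, std_rmse = line_slice_rmse(samples, reference, row=32)
```

Comparing the generated CDF against the one computed from held-out measurements, and the RMSE error bars across methods, mirrors the comparisons reported in the list above.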
7. Broader Implications and Impact
Conditional Gaussian priors on transmitter coordinates allow CCDDPM to accurately anchor localized high-power zones and model both global and local radio propagation structures. Key outcomes include:
- High-fidelity REM Prediction: Preservation of spatial fine structure critical for radio planning and adaptive vehicular communication.
- Consistent, Low-Uncertainty Samples: Repeatability and stability facilitate targeted active re-measurement in regions of highest epistemic uncertainty.
- Operational Utility: On-the-fly REM inference supports enhanced PHY reliability, adaptive scheduling, and efficient handover for 6G C-V2X, eliminating the need for pervasive measurements.
A plausible implication is that this approach generalizes to other spatially conditioned generative tasks involving continuous context variables and complex scene structures. Coordinate conditioning thus converts generic diffusion into an efficient, scene-consistent REM predictor underpinning robust radio-map services for next-generation vehicular networks (Cao et al., 27 Dec 2025).