BeamCKMDiff: Beam-Aware CKM Generation
- BeamCKMDiff is a generative framework that leverages a diffusion process in a VAE latent space to synthesize continuous, beam-aware channel knowledge maps.
- It employs a Diffusion Transformer backbone with adaptive layer normalization to integrate beam and environmental context into the generative process.
- The method achieves state-of-the-art NMSE performance with sub-second inference, advancing environment-aware channel mapping for scalable 6G network planning.
BeamCKMDiff is a generative framework for constructing high-fidelity, beam-aware channel knowledge maps (CKMs) from environmental context and continuous beamforming vectors in wireless communication scenarios. It is designed to address the limitations of conventional CKM construction methods, which typically rely on sparse sampling measurements, omnidirectional map assumptions, or discrete codebook representations. BeamCKMDiff enables the generation of channel knowledge maps conditioned on arbitrary continuous beamforming vectors without requiring site-specific measurement data, which is critical for the realization of environment-aware 6G networks (Zhao et al., 15 Jan 2026).
1. Conditional Diffusion Process in VAE-Latent Space
BeamCKMDiff synthesizes channel knowledge maps by running a conditional diffusion process in the latent space of a variational autoencoder (VAE). The target variable is the VAE latent $z_0$, which encodes a normalized map of site-specific radio signal strengths for a given beam.
Forward (Noising) Process
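The forward process progressively adds Gaussian noise to the clean latent $z_0$ over time steps $t = 1$ to $T$:
$$q(z_t \mid z_{t-1}) = \mathcal{N}\!\left(z_t;\ \sqrt{1-\beta_t}\, z_{t-1},\ \beta_t \mathbf{I}\right),$$
where $\beta_t \in (0, 1)$ is the noise variance at step $t$.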
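Because Gaussian noising steps compose, $z_t$ can be drawn from $z_0$ in a single step using the cumulative product $\bar{\alpha}_t = \prod_{s=1}^{t}(1-\beta_s)$. The following is a minimal NumPy sketch of that closed-form sampler. The step count, the schedule endpoints, and the latent shape are illustrative assumptions, not values taken from the source.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start, beta_end):
    """Linearly spaced noise variances beta_1 .. beta_T."""
    return np.linspace(beta_start, beta_end, num_steps)

def q_sample(z0, t, alpha_bar, rng):
    """Draw z_t ~ q(z_t | z_0) in closed form; t is 1-indexed."""
    eps = rng.standard_normal(z0.shape)
    a = alpha_bar[t - 1]
    z_t = np.sqrt(a) * z0 + np.sqrt(1.0 - a) * eps
    return z_t, eps

# Assumed hyperparameters (common DDPM defaults, not from the source).
T, BETA_1, BETA_T = 1000, 1e-4, 2e-2
betas = linear_beta_schedule(T, BETA_1, BETA_T)
alpha_bar = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
z0 = rng.standard_normal((4, 32, 32))      # hypothetical latent shape
z_mid, _ = q_sample(z0, T // 2, alpha_bar, rng)
z_end, _ = q_sample(z0, T, alpha_bar, rng)  # almost pure noise
```

With these assumed endpoints, $\bar{\alpha}_T$ is close to zero, so $z_T$ retains almost no information about $z_0$.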
Reverse Process and Score Network
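The reverse (denoising) process is parameterized by a neural network $\epsilon_\theta$ that predicts the noise at each step, conditioned on the environment context $\mathbf{E}$ and the beamforming vector $\mathbf{w}$:
$$p_\theta(z_{t-1} \mid z_t, \mathbf{E}, \mathbf{w}) = \mathcal{N}\!\left(z_{t-1};\ \mu_\theta(z_t, t, \mathbf{E}, \mathbf{w}),\ \sigma_t^2 \mathbf{I}\right),$$
with mean
$$\mu_\theta(z_t, t, \mathbf{E}, \mathbf{w}) = \frac{1}{\sqrt{\alpha_t}}\left(z_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\, \epsilon_\theta(z_t, t, \mathbf{E}, \mathbf{w})\right),$$
where $\alpha_t = 1 - \beta_t$ for noise variance $\beta_t$, and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$.
Training Objective
BeamCKMDiff adopts the denoising score-matching loss typical of diffusion models:
$$\mathcal{L} = \mathbb{E}_{z_0,\, \epsilon,\, t}\left[\left\lVert \epsilon - \epsilon_\theta(z_t, t, \mathbf{E}, \mathbf{w}) \right\rVert_2^2\right],$$
with $z_t$ sampled as
$$z_t = \sqrt{\bar{\alpha}_t}\, z_0 + \sqrt{1-\bar{\alpha}_t}\, \epsilon, \qquad \epsilon \sim \mathcal{N}(\mathbf{0}, \mathbf{I}).$$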
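The noise-prediction objective and the reverse-step mean can be sketched as follows, assuming the standard DDPM parameterization. The names `eps_model`, `oracle`, and `zero_model` are hypothetical stand-ins for the conditioned denoiser, and the schedule values and latent shape are assumptions rather than settings from the source.

```python
import numpy as np

def make_schedule(num_steps, beta_start, beta_end):
    """Return (betas, alpha_bar) for a linear schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return betas, np.cumprod(1.0 - betas)

def noise_prediction_loss(eps_model, z0, cond, t, alpha_bar, rng):
    """Single-sample estimate of E||eps - eps_model(z_t, t, cond)||^2."""
    eps = rng.standard_normal(z0.shape)
    a = alpha_bar[t - 1]
    z_t = np.sqrt(a) * z0 + np.sqrt(1.0 - a) * eps
    return float(np.mean((eps - eps_model(z_t, t, cond)) ** 2))

def reverse_mean(z_t, eps_hat, t, betas, alpha_bar):
    """Mean of p(z_{t-1} | z_t) given a noise estimate eps_hat."""
    beta_t = betas[t - 1]
    scale = beta_t / np.sqrt(1.0 - alpha_bar[t - 1])
    return (z_t - scale * eps_hat) / np.sqrt(1.0 - beta_t)

betas, alpha_bar = make_schedule(1000, 1e-4, 2e-2)  # assumed values
rng = np.random.default_rng(0)
z0 = rng.standard_normal((4, 16, 16))               # hypothetical latent

def oracle(z_t, t, cond):
    """Cheating denoiser that knows z0; used only to check the algebra."""
    a = alpha_bar[t - 1]
    return (z_t - np.sqrt(a) * z0) / np.sqrt(1.0 - a)

def zero_model(z_t, t, cond):
    """Untrained baseline that always predicts zero noise."""
    return np.zeros_like(z_t)

loss_oracle = noise_prediction_loss(oracle, z0, None, 500, alpha_bar, rng)
loss_zero = noise_prediction_loss(zero_model, z0, None, 500, alpha_bar, rng)

# At t = 1 the reverse mean with the true noise recovers z0.
eps = rng.standard_normal(z0.shape)
z1 = np.sqrt(alpha_bar[0]) * z0 + np.sqrt(1.0 - alpha_bar[0]) * eps
z0_rec = reverse_mean(z1, eps, 1, betas, alpha_bar)
```

The oracle drives the loss to numerical zero, while the zero predictor leaves a loss near 1, the variance of the injected noise. In an actual training loop `eps_model` would be the conditioned DiT and `cond` would carry the environment features and beam embedding.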
2. Diffusion Transformer Backbone and Conditioning Mechanism
BeamCKMDiff utilizes a variant of the Diffusion Transformer (DiT) as its score network, specifically adapted for spatial-conditional map synthesis and continuous beam embedding.
Architecture Overview
- Transformer blocks: 12 layers, input length 256 ($16 \times 16$ grid tokens), embedding dimension $d$.
- Attention: 8 heads per block.
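- Patch Embedding: The input tensor is processed by a Conv2D patch embedder into a $16 \times 16$ grid of tokens, then flattened.
- Beam Embedding: The continuous beamforming vector $\mathbf{w} \in \mathbb{C}^{N}$, with $N = 16$, is split into real and imaginary parts (dimension 32) and projected by an MLP to a beam embedding $\mathbf{e}_w$.
- Temporal Embedding: The diffusion step $t$ is similarly embedded by an MLP to $\mathbf{e}_t$.
The fused conditional embedding $\mathbf{c}$, formed from $\mathbf{e}_w$ and $\mathbf{e}_t$, acts as a global control token.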
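The conditioning path can be illustrated with a small NumPy sketch. It rests on several assumptions that the source does not confirm: a two-layer ReLU MLP, sinusoidal timestep features ahead of the timestep MLP, fusion by addition, a model width `D_MODEL = 64`, and a half-wavelength ULA steering vector as the example beam.

```python
import numpy as np

def complex_to_real_features(w):
    """Stack real and imaginary parts: C^N -> R^(2N)."""
    w = np.asarray(w, dtype=np.complex128)
    return np.concatenate([w.real, w.imag])

def init_mlp(rng, d_in, d_hidden, d_out):
    """Random weights for a two-layer MLP (illustrative initialization)."""
    return {
        "w1": rng.standard_normal((d_in, d_hidden)) / np.sqrt(d_in),
        "b1": np.zeros(d_hidden),
        "w2": rng.standard_normal((d_hidden, d_out)) / np.sqrt(d_hidden),
        "b2": np.zeros(d_out),
    }

def mlp(params, x):
    """Two-layer MLP with ReLU (the activation is an assumption)."""
    hidden = np.maximum(x @ params["w1"] + params["b1"], 0.0)
    return hidden @ params["w2"] + params["b2"]

def timestep_features(t, dim):
    """Sinusoidal features of the diffusion step (assumed front end)."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    return np.concatenate([np.sin(t * freqs), np.cos(t * freqs)])

N_ANT, D_MODEL = 16, 64          # D_MODEL is a hypothetical width
rng = np.random.default_rng(0)

# Example continuous beam: unit-norm ULA steering vector at 25 degrees.
theta = np.deg2rad(25.0)
w = np.exp(1j * np.pi * np.arange(N_ANT) * np.sin(theta)) / np.sqrt(N_ANT)

x = complex_to_real_features(w)                     # shape (32,)
beam_mlp = init_mlp(rng, 2 * N_ANT, D_MODEL, D_MODEL)
time_mlp = init_mlp(rng, D_MODEL, D_MODEL, D_MODEL)

e_w = mlp(beam_mlp, x)
e_t = mlp(time_mlp, timestep_features(250, D_MODEL))
c = e_w + e_t                                       # assumed fusion by sum
```

Because the beam enters as a real-valued vector rather than a codebook index, any unit-norm $\mathbf{w}$ can be queried at inference time without retraining an embedding table.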
Adaptive Layer Normalization (adaLN)
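Conditioning is injected into every DiT block via adaLN, which modulates the scale ($\boldsymbol{\gamma}$) and shift ($\boldsymbol{\beta}$) of standard LayerNorm by an affine transformation of the fused conditional embedding $\mathbf{c}$:
$$\mathrm{adaLN}(\mathbf{h}, \mathbf{c}) = \boldsymbol{\gamma}(\mathbf{c}) \odot \mathrm{LayerNorm}(\mathbf{h}) + \boldsymbol{\beta}(\mathbf{c}), \qquad [\boldsymbol{\gamma}(\mathbf{c}),\ \boldsymbol{\beta}(\mathbf{c})] = \mathbf{W}\mathbf{c} + \mathbf{b}.$$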
All Multi-Head Attention and MLP sublayers in the DiT blocks use adaLN, enabling global steering of the generative process by the beamforming condition.
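A minimal NumPy sketch of adaptive layer normalization is given below. The modulation weights `w_mod` and `b_mod` are hypothetical parameter names, and the exact parameterization used by BeamCKMDiff is not given in the source; some DiT implementations, for instance, scale by $1 + \gamma$ rather than $\gamma$.

```python
import numpy as np

def layer_norm(h, eps=1e-6):
    """LayerNorm over the feature axis, without learned affine terms."""
    mu = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mu) / np.sqrt(var + eps)

def ada_ln(h, c, w_mod, b_mod):
    """Scale and shift LayerNorm(h) with values regressed from c.

    h: (tokens, d) activations, c: (d_c,) condition embedding,
    w_mod: (d_c, 2 * d), b_mod: (2 * d,).
    """
    gamma, beta = np.split(c @ w_mod + b_mod, 2)
    return gamma * layer_norm(h) + beta

rng = np.random.default_rng(0)
tokens, d, d_c = 256, 64, 64          # d and d_c are assumed widths
h = rng.standard_normal((tokens, d))
c = rng.standard_normal(d_c)

# Zero weights with bias [1, 0] reduce adaLN to plain LayerNorm.
w_identity = np.zeros((d_c, 2 * d))
b_identity = np.concatenate([np.ones(d), np.zeros(d)])
out_identity = ada_ln(h, c, w_identity, b_identity)

# Random weights give a condition-dependent scale and shift.
w_mod = rng.standard_normal((d_c, 2 * d)) / np.sqrt(d_c)
out_cond = ada_ln(h, c, w_mod, b_identity)
```

The same $\boldsymbol{\gamma}$ and $\boldsymbol{\beta}$ are broadcast across all 256 tokens, which is what makes the condition act globally rather than per location.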
3. Data Representations and Preprocessing
BeamCKMDiff operates on spatial, beam, and environmental representations.
- Ground-truth CKM: real-valued map (in dB) of the received signal strength at each location for a given beam. CKMs are normalized to the $(0, 1)$ range of the VAE's sigmoid output and encoded into the latent $z_0$.
- Environment context: Building-height maps and transmitter masks are stacked and processed by a ResNet encoder to yield environment features.
- Tokenization: The spatial tensor is patch embedded and flattened to a sequence of 256 tokens for Transformer input.
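The tokenization step can be sketched as a patchify operation. A Conv2D whose kernel size equals its stride is equivalent to cutting the tensor into non-overlapping patches and applying one shared linear projection to each. The channel count, spatial size, patch size, and model width below are assumptions chosen only so that the grid comes out as $16 \times 16$.

```python
import numpy as np

def patchify(x, patch):
    """Split (C, H, W) into (H/p * W/p, C * p * p) row-major patch tokens."""
    c, h, w = x.shape
    if h % patch or w % patch:
        raise ValueError("spatial size must be divisible by the patch size")
    gh, gw = h // patch, w // patch
    x = x.reshape(c, gh, patch, gw, patch)
    x = x.transpose(1, 3, 0, 2, 4)          # (gh, gw, C, p, p)
    return x.reshape(gh * gw, c * patch * patch)

rng = np.random.default_rng(0)
C, H, W, PATCH, D_MODEL = 4, 32, 32, 2, 64  # all hypothetical sizes
x = rng.standard_normal((C, H, W))

patches = patchify(x, PATCH)                # (256, 16)
w_proj = rng.standard_normal((C * PATCH * PATCH, D_MODEL)) * 0.02
tokens = patches @ w_proj                   # (256, D_MODEL)
```

Each of the 256 rows of `tokens` corresponds to one cell of the $16 \times 16$ grid, in row-major order.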
4. Training Protocol and Optimization
Dataset Generation
- 30 geo-referenced urban scenes (OpenStreetMap), each at pixel resolution.
- For each scene: 10 random GBS locations ( ULA, height), each considered under 10 random continuous beamforming vectors.
- Ground-truth maps computed using NVIDIA Sionna ray tracing with rays, up to 3 reflections/diffractions, at .
VAE Pretraining
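The encoder and decoder are trained to minimize a VAE objective that combines a map reconstruction loss with a KL-divergence regularizer on the latent.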
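As an illustration only, a generic VAE objective with a diagonal-Gaussian posterior can be written as below. The mean-squared reconstruction term and the `kl_weight` value are assumptions; the source does not state which reconstruction loss or weighting BeamCKMDiff uses.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), averaged over elements."""
    return float(-0.5 * np.mean(1.0 + logvar - mu ** 2 - np.exp(logvar)))

def vae_loss(x, x_hat, mu, logvar, kl_weight):
    """Reconstruction error plus weighted KL regularizer."""
    recon = float(np.mean((x - x_hat) ** 2))  # assumed MSE reconstruction
    return recon + kl_weight * gaussian_kl(mu, logvar)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(1, 64, 64))   # normalized CKM in (0, 1)
mu = 0.1 * rng.standard_normal((4, 16, 16))   # hypothetical latent shape
logvar = np.zeros_like(mu)

perfect = vae_loss(x, x, np.zeros_like(mu), logvar, kl_weight=1e-3)
typical = vae_loss(x, x + 0.05, mu, logvar, kl_weight=1e-3)
```

The KL term vanishes only when the posterior equals the standard normal prior, so a small weight trades reconstruction fidelity against latent regularity.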
Diffusion Model Training
- $T$ noise steps, with $\beta_t$ increasing linearly from $\beta_1$ to $\beta_T$ (linear schedule).
- Frozen VAE; jointly optimize condition encoder and DiT denoiser using Adam optimizer ( learning rate) for several hundred epochs.
5. Evaluation, Baselines, and Quantitative Results
BeamCKMDiff is benchmarked against state-of-the-art approaches for CKM construction:
- RadioUNet: deterministic U-Net CKM regressor
- TransUNet: Transformer backbone with discrete beam-index embedding
- RadioDiff-UNet: U-Net diffusion model without adaptive normalization
Two evaluation protocols are used:
- Unseen beams: test on new beamforming vectors at previously seen GBS locations.
- Unseen locations: test on new transmitter locations and new beams.
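The main metric is NMSE (dB), computed as
$$\mathrm{NMSE} = 10 \log_{10} \frac{\left\lVert \mathbf{M} \odot (\hat{\mathbf{G}} - \mathbf{G}) \right\rVert_F^2}{\left\lVert \mathbf{M} \odot \mathbf{G} \right\rVert_F^2},$$
where $\hat{\mathbf{G}}$ is the predicted CKM, $\mathbf{G}$ the ground truth, and $\mathbf{M}$ the building mask, applied elementwise so that pixels inside buildings do not contribute.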
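A masked NMSE of this kind can be computed as in the sketch below. It assumes the mask is 1 on pixels that should be scored (outside buildings) and 0 elsewhere, and that prediction and ground truth are on the same scale; both conventions are assumptions rather than details confirmed by the source.

```python
import numpy as np

def nmse_db(pred, truth, valid_mask):
    """Masked normalized mean squared error in dB."""
    m = np.asarray(valid_mask, dtype=bool)
    err = np.sum((pred[m] - truth[m]) ** 2)
    ref = np.sum(truth[m] ** 2)
    return float(10.0 * np.log10(err / ref))

truth = np.ones((4, 4))
pred = 1.1 * truth                 # uniform 10 % error
mask = np.ones((4, 4), dtype=bool)

# A large error inside a "building" pixel is ignored once masked out.
pred_with_building = pred.copy()
pred_with_building[0, 0] = 100.0
mask_with_building = mask.copy()
mask_with_building[0, 0] = False

score_full = nmse_db(pred, truth, mask)
score_masked = nmse_db(pred_with_building, truth, mask_with_building)
```

A uniform 10 % amplitude error corresponds to a squared-error ratio of 0.01, that is, about −20 dB, which gives a sense of scale for the figures in the table.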
| Method | Unseen beams | Unseen locations | Inference time (s) |
|---|---|---|---|
| RadioUNet | −16.35 dB | −16.34 dB | 0.231 |
| TransUNet | −18.91 dB | −17.34 dB | 0.234 |
| RadioDiff-UNet | −19.49 dB | −19.13 dB | 0.373 |
| BeamCKMDiff | −21.24 dB | −20.68 dB | 0.688 |
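BeamCKMDiff achieves the lowest NMSE (highest accuracy) in both settings, improving on the strongest baseline, RadioDiff-UNet, by 1.75 dB on unseen beams and 1.55 dB on unseen locations. This comes at the cost of the longest inference time in the comparison (0.688 s versus 0.373 s for RadioDiff-UNet), which nonetheless remains below one second. Visual inspection confirms that it accurately reconstructs both main-lobe and side-lobe structures under arbitrary continuous beam queries, outperforming baselines that either blur or misalign these salient features (Zhao et al., 15 Jan 2026).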
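Since NMSE is reported in dB, differences between rows translate into ratios of normalized squared error. The short sketch below converts the tabulated values; the dictionary simply restates the table, and the helper names are illustrative.

```python
# NMSE in dB from the table: (unseen beams, unseen locations).
NMSE_DB = {
    "RadioUNet": (-16.35, -16.34),
    "TransUNet": (-18.91, -17.34),
    "RadioDiff-UNet": (-19.49, -19.13),
    "BeamCKMDiff": (-21.24, -20.68),
}

def gap_db(method, baseline, column):
    """NMSE difference in dB (negative means `method` is better)."""
    return NMSE_DB[method][column] - NMSE_DB[baseline][column]

def error_ratio(method, baseline, column):
    """Ratio of normalized squared errors implied by the dB gap."""
    return 10.0 ** (gap_db(method, baseline, column) / 10.0)

beams_gap = gap_db("BeamCKMDiff", "RadioDiff-UNet", 0)
locs_gap = gap_db("BeamCKMDiff", "RadioDiff-UNet", 1)
beams_ratio = error_ratio("BeamCKMDiff", "RadioDiff-UNet", 0)
locs_ratio = error_ratio("BeamCKMDiff", "RadioDiff-UNet", 1)
```

Relative to RadioDiff-UNet, the gaps correspond to roughly 33 % lower normalized squared error on unseen beams and roughly 30 % lower on unseen locations.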
6. Architectural and Methodological Significance
BeamCKMDiff introduces architectural and methodological advances:
- Continuous beam generalization: Rather than restricting to a pre-defined codebook, BeamCKMDiff is conditioned on arbitrary beamforming vectors, increasing flexibility for real-world deployments.
- adaLN-based global control: The adaptive layer normalization mechanism allows fine-grained, block-level modulation of the Transformer’s activations by the beam embedding, which is critical for capturing the non-trivial coupling between directionality and site-specific channel propagation.
- Diffusion process in learned latent space: Operating in a VAE-derived latent space both regularizes the generative process and enables use of compact, information-dense features.
- Sample and runtime efficiency: BeamCKMDiff achieves sub-second inference for full map prediction (0.688 s in the reported comparison) and can synthesize CKMs for new beams or sites without additional measurement sampling.
A plausible implication is that these elements jointly support reliable beam-conditioned map generation at high fidelity and scale, although the reported comparisons do not by themselves establish that each one is necessary.
7. Context, Applications, and Integration
BeamCKMDiff is positioned as a tool for 6G network planning, environment-aware beam management, and site-specific channel database construction. It is directly applicable where dense measurement-driven CKMs are infeasible to acquire or where rapid adaptation to new beamforming vectors is required. By explicitly incorporating environmental context and continuous beam control, BeamCKMDiff provides more granular, flexible, and accurate channel state information, which is critical for optimizing spatial reuse, beam selection, and CSI feedback compression in future wireless networks (Zhao et al., 15 Jan 2026).
In related simulation environments, such as those for precision interferometry and diffraction (for example, the beam decomposition and propagation methods discussed in Zhao et al., 2022), analogous principles of combining continuous parameter control with generative map construction may inform broader diffraction toolkits. BeamCKMDiff differs in its focus on environmental radio propagation conditioned on arbitrary continuous beam directions, modeled with generative diffusion.