
CSI Adaptive Network (CsiAdaNet)

Updated 23 January 2026
  • CSI Adaptive Network (CsiAdaNet) is a deep learning framework that enhances CSI acquisition, quantization, and feedback in massive MIMO systems by leveraging spatial, temporal, and contextual correlations.
  • It employs a hierarchical architecture with spatial-frequency encoding, temporal aggregation via GRU, and multi-head quantized predictors to optimize precoder estimation and beamforming.
  • The framework achieves improved NMSE and reduced CSI-RS overhead through methods like port-cycling and adaptive quantization, making it scalable for 6G and XL-MIMO applications.

The CSI Adaptive Network (CsiAdaNet) refers to a class of deep learning-based frameworks designed to improve the acquisition, quantization, feedback, and reconstruction of Channel State Information (CSI) in large-scale and environment-adaptive MIMO wireless systems. By exploiting spatial, temporal, and contextual correlations, such networks enable significant reductions in resource overhead for pilot transmission and CSI feedback, while maintaining high precoder estimation accuracy, adaptability across environments, and scalability to future 6G massive MIMO architectures.

1. Motivation and System Framework

Accurate CSI acquisition is critical for spectral efficiency in massive MIMO systems, yet the resource overhead for CSI reference signals (CSI-RS) scales linearly with the number of transmit ports $N_t$. As future wireless systems increase $N_t$, transmitting CSI-RS for all ports at every time-frequency location becomes prohibitive, limiting data throughput. CsiAdaNet was introduced as a DL-based reconstruction and feedback architecture that seeks to reduce instantaneous CSI-RS and feedback overhead, preserve channel observability over time, and enable reliable precoder estimation for advanced beamforming and spatial multiplexing (Arun et al., 21 Jan 2026).

A central methodology is port-cycling: the $N_t$ ports are partitioned into $p$ sub-arrays, each sounded in sequence over $p$ pilot instants, yielding an overhead fraction of $1/p$ per instant (e.g., for $p=4$, overhead is reduced to $25\%$). This is complemented by sparse sampling, codebook-based beamforming, and temporal aggregation.
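The round-robin sounding schedule and its per-instant overhead can be sketched as follows; `port_cycling_schedule` is a hypothetical helper written for illustration, not code from the cited paper, and it assumes the ports divide evenly into sub-arrays:

```python
import numpy as np

def port_cycling_schedule(n_t: int, p: int) -> list:
    """Partition n_t ports into p equal sub-arrays sounded round-robin:
    sub-array i is sounded at pilot instants i, i+p, i+2p, ...
    Returns a list of p port-index arrays."""
    assert n_t % p == 0, "ports must divide evenly into sub-arrays"
    return np.split(np.arange(n_t), p)

# Per pilot instant only n_t/p of the n_t ports are sounded,
# so the instantaneous CSI-RS overhead fraction is 1/p.
schedule = port_cycling_schedule(n_t=128, p=4)
overhead = len(schedule[0]) / 128  # -> 0.25 for p = 4
```

After $p$ pilot instants every port has been sounded once, which is what lets the temporal aggregator recover full-array observability from sparse measurements.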

2. CsiAdaNet DL Architecture

CsiAdaNet employs a hierarchical deep neural architecture tailored for sparse port-cycled measurements and temporal channel evolution. The system ingests $p$ sparse CSI tensors (real and imaginary parts stacked), one per sounded sub-array, with dimensions $N_x' \times N_y' \times N_r \times N_{sc}$.

  • Spatial-frequency encoder: three 3D convolutional layers with 16, 32, and 64 filters (kernel size $3\times3\times3$), followed by global average pooling and a dense layer with LayerNorm, mapping inputs to a latent code $Z_t\in\mathbb{R}^{32}$.
  • Temporal aggregator: latent codes $\{Z_{t+iP}\}_{i=0}^{p-1}$ are passed to a single-layer GRU (hidden size 32), outputting a temporally aggregated feature $\tilde{Z}\in\mathbb{R}^{32}$.
  • Multi-head predictor: outputs the Type-II codebook components (beamset $B$, beam indices $I$, amplitudes $P$, phases $\Phi$), using softmax activations over oversampled grids and quantized levels.

The full mathematical flow comprises:

  • $Z_t = f_{\text{enc}}(H_t;\theta)$ for each sounding,
  • $\tilde{Z} = f_{\text{gru}}([Z_t,\ldots,Z_{t+(p-1)P}];\phi)$,
  • final precoder $W^{(v)}$ for each spatial layer $v$.

This structure directly leverages spatial correlations (3D convs) and temporal relationships (RNN/GRU) for CSI reconstruction.
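The data flow above can be traced at the shape level with a minimal sketch. The layer internals here (pooling plus a fixed random projection standing in for the conv stack, mean-fusion standing in for the GRU) are placeholders for illustration only; only the tensor shapes and latent size follow the description above:

```python
import numpy as np

LATENT = 32  # latent code size reported for the encoder and GRU

def encode(h_sparse: np.ndarray) -> np.ndarray:
    """Stand-in for f_enc: maps one sparse CSI tensor of shape
    (2, Nx', Ny', Nr, Nsc) -- real/imag stacked -- to a 32-dim
    latent code Z_t. Not the paper's conv architecture."""
    pooled = h_sparse.mean(axis=(1, 2, 3, 4))          # crude global pooling -> (2,)
    W = np.random.default_rng(0).standard_normal((LATENT, pooled.size))
    return W @ pooled                                   # -> (32,)

def aggregate(latents: list) -> np.ndarray:
    """Stand-in for the single-layer GRU f_gru: fuses the p latent
    codes into one temporally aggregated feature Z_tilde."""
    return np.mean(latents, axis=0)

p = 4  # number of port-cycled soundings
codes = [encode(np.random.default_rng(i).standard_normal((2, 4, 4, 4, 12)))
         for i in range(p)]
z_tilde = aggregate(codes)  # (32,) feature fed to the multi-head predictor
```

In the actual network each prediction head then applies a softmax over its discrete grid (beamset, indices, amplitude levels, phase levels), consuming `z_tilde`.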

3. Training, Quantization, and Adaptivity

CsiAdaNet models are trained using a weighted sparse-categorical cross-entropy (SCCE) loss over the four heads: $L(\theta) = \alpha_B \ell_B + \alpha_I \ell_I + \alpha_P \ell_P + \alpha_\Phi \ell_\Phi$, with empirically tuned weights (e.g., $\alpha_I=1.5$, the others in the range 0.75–1.25).
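The weighted combination is a plain linear sum over the per-head SCCE terms; in this sketch the per-head loss values are made up for the example, and only $\alpha_I = 1.5$ comes from the text:

```python
def weighted_scce(losses: dict, weights: dict) -> float:
    """Total loss L(theta) = sum_h alpha_h * ell_h over the four
    heads: B (beamset), I (beam indices), P (amplitudes), Phi (phases)."""
    return sum(weights[h] * losses[h] for h in losses)

# Illustrative per-head SCCE values with the reported weighting style.
ell = {"B": 0.4, "I": 1.2, "P": 0.9, "Phi": 1.6}
alpha = {"B": 1.0, "I": 1.5, "P": 1.0, "Phi": 0.75}
total = weighted_scce(ell, alpha)  # 0.4 + 1.8 + 0.9 + 1.2 = 4.3
```

Up-weighting the beam-index head reflects that index errors are harder to recover from downstream than amplitude or phase quantization errors.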

Datasets typically include 30,000 channel realizations (split 70%/20%/10%) with SNR augmentation via uniform sampling over $\{-5, 0, 5, 10, 15\}$ dB, designed to ensure robustness. Optimization uses Adam (learning rate $10^{-3}$), batch size 256, and up to 250 epochs with early stopping and automated learning-rate decay.

For quantization and adaptive feedback, several CsiAdaNet modes exist:

  • Nested dropout (ND): rank latent dimensions by information content, allocate quantization bits via greedy minimization of post-$k$-means distortion, then transmit the closest centroids.
  • Vector quantization (VQ-VAE): partition the code into sub-vectors and jointly learn the codebooks; encoder output sub-vectors are replaced by their nearest codewords, approaching the optimal rate-distortion trade-off.

Both methods achieve lower normalized MSE than uniform quantization, improving NMSE by 1–1.5 dB (e.g., from $-9.5$ to $-10$ dB at $B=512$ feedback bits for VQ-VAE) (Rizzello et al., 2022). Adaptivity is fundamentally offline: channel statistics are learned from uplink samples, and the resulting bit allocations or codebooks are conveyed to the MTs.
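The VQ step, once the codebooks are learned, reduces to nearest-codeword lookup per sub-vector; only the codeword indices are fed back. This is a generic sketch of that lookup under assumed dimensions (32-dim latent, four 8-dim sub-vectors, 16-entry codebooks, i.e., 4 bits each), not the trained codebooks from (Rizzello et al., 2022):

```python
import numpy as np

def vq_quantize(z: np.ndarray, codebooks: list, d_sub: int):
    """Split latent z into sub-vectors of length d_sub and replace
    each with its nearest (L2) codeword from the matching codebook.
    Returns the quantized latent and the fed-back indices."""
    subs = z.reshape(-1, d_sub)
    out, indices = [], []
    for sub, cb in zip(subs, codebooks):
        k = int(np.argmin(np.linalg.norm(cb - sub, axis=1)))
        indices.append(k)      # what the MT actually transmits
        out.append(cb[k])      # reconstruction on the BS side
    return np.concatenate(out), indices

rng = np.random.default_rng(0)
z = rng.standard_normal(32)
codebooks = [rng.standard_normal((16, 8)) for _ in range(4)]  # 4 x 4 bits
z_q, idx = vq_quantize(z, codebooks, d_sub=8)
```

The feedback cost is `len(codebooks) * log2(codebook_size)` bits per latent, independent of the latent's continuous precision.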

4. Port-Cycling, Environment, and Feedback Overhead

Sub-array partitioning operates on uniform planar arrays (UPAs) of size $N_x \times N_y \times N_{\text{pol}}$: the array is split into $p = P_x \times P_y$ contiguous sub-panels so that only $1/p$ of the ports are sounded per pilot instant. Over $pP$ ms, every port is sampled at least once.
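The contiguous sub-panel split can be sketched on the planar index grid (one polarization shown for simplicity); `upa_subpanels` is an illustrative helper with assumed dimensions, not code from the cited paper:

```python
import numpy as np

def upa_subpanels(nx: int, ny: int, px: int, py: int) -> np.ndarray:
    """Split an nx x ny planar port grid into px x py contiguous
    sub-panels; returns an array of shape (px*py, nx//px, ny//py)
    holding the port indices of each sub-panel."""
    assert nx % px == 0 and ny % py == 0
    grid = np.arange(nx * ny).reshape(nx, ny)
    panels = [grid[i*(nx//px):(i+1)*(nx//px), j*(ny//py):(j+1)*(ny//py)]
              for i in range(px) for j in range(py)]
    return np.stack(panels)

# An 8x8 panel split 2x2 gives p = 4 sub-panels of 16 ports each,
# sounded in turn over 4 pilot instants (overhead 1/4 per instant).
panels = upa_subpanels(nx=8, ny=8, px=2, py=2)
```

Keeping the sub-panels contiguous preserves local spatial correlation within each sounding, which the 3D convolutional encoder relies on.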

Table: overhead/accuracy trade-off (SGCS at 10 dB SNR):

| $p$ | Overhead $1/p$ | SGCS @ 10 dB |
|-----|----------------|-----------------|
| 1   | 1.00           | 0.75 (baseline) |
| 2   | 0.50           | 0.72            |
| 4   | 0.25           | 0.68            |

Increasing pp lowers overhead but slightly increases reconstruction error.

In image transmission via JSCC-MIMO, CsiAdaNet adaptively selects the feedback bit budget $B$ per sample, based on predicted post-reconstruction PSNR. A simple piecewise-linear function allocates $B$ so that samples with low predicted PSNR receive the maximal overhead, while robust samples are transmitted with minimal CSI feedback (Zhang et al., 2023). For the CIFAR-10 example, this adaptive scheme reduces average feedback by ~24% at fixed outage rates.
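A piecewise-linear allocator of this kind can be sketched as follows; the PSNR thresholds and the 64/512-bit budget range are assumed for illustration, not the values used in (Zhang et al., 2023):

```python
def allocate_bits(pred_psnr: float, b_min: int = 64, b_max: int = 512,
                  psnr_lo: float = 20.0, psnr_hi: float = 32.0) -> int:
    """Piecewise-linear feedback budget: fragile samples (low
    predicted PSNR) get the maximum budget b_max, robust samples
    get b_min, with linear interpolation between the thresholds."""
    if pred_psnr <= psnr_lo:
        return b_max
    if pred_psnr >= psnr_hi:
        return b_min
    frac = (pred_psnr - psnr_lo) / (psnr_hi - psnr_lo)
    return round(b_max - frac * (b_max - b_min))

allocate_bits(18.0)  # -> 512 (fragile sample, maximal CSI feedback)
allocate_bits(35.0)  # -> 64  (robust sample, minimal CSI feedback)
```

Concentrating the bit budget on samples predicted to reconstruct poorly is what lowers the average feedback rate without raising the outage rate.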

5. CSI Feedback Generalization and Environment Adaptivity

CsiAdaNet variants such as AdapCsiNet integrate scene graph–aided hypernetworks to achieve context-aware adaptation of CSI decoders. Static/fine-grained environmental knowledge (walls, zone boundaries, scatterers) is encoded as a graph, discretized and embedded before convolutional processing.

  • Hypernetwork $f_\phi: G \rightarrow (W_{\text{adapt}}, b_{\text{adapt}})$ produces additive decoder weight/bias “deltas” for the CSI reconstruction path.
  • The UE compresses CSI with a fixed random projection; the BS uses fused weights in the decoder: base path $h_1 = W_{\text{base}}s + b_{\text{base}}$; adaptation path $h_2 = \tanh(W_{\text{adapt}}s + b_{\text{adapt}})$; combined input $h_{\text{input}} = h_1 + \alpha h_2$, followed by convolutional refinement.

Two-step training (the base network first, then the hypernetwork with the base frozen) yields up to 46.4% NMSE improvement at $R=1/24$ compression compared to mixed-data non-adaptive decoders (Liu et al., 15 Apr 2025). This approach matches online retraining performance while incurring no runtime data collection or retraining.
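The fused decoder input can be sketched directly from the two-path equations; the dimensions, random weights, and the mixing scalar `alpha` below are illustrative assumptions, and in the actual framework $(W_{\text{adapt}}, b_{\text{adapt}})$ come from the scene-graph hypernetwork rather than being sampled:

```python
import numpy as np

def fused_decoder_input(s, W_base, b_base, W_adapt, b_adapt, alpha=0.1):
    """AdapCsiNet-style fusion: base reconstruction path plus a
    scaled tanh adaptation path, before convolutional refinement."""
    h1 = W_base @ s + b_base                  # base path
    h2 = np.tanh(W_adapt @ s + b_adapt)       # hypernetwork-generated deltas
    return h1 + alpha * h2                    # h_input

rng = np.random.default_rng(0)
s = rng.standard_normal(16)                   # compressed CSI from the UE
W_b, b_b = rng.standard_normal((64, 16)), np.zeros(64)
W_a, b_a = rng.standard_normal((64, 16)), np.zeros(64)  # from f_phi(G) in practice
h_in = fused_decoder_input(s, W_b, b_b, W_a, b_a)
```

With `alpha = 0` the decoder reduces exactly to the environment-agnostic base path, which is why the base network can be trained first and then frozen.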

6. Performance Analysis and Computational Efficiency

Typical simulation setups employ 3GPP settings: UMa LoS, $N_t=128$ at the BS, $N_r=4$ at the UE, $f_c=3.5$ GHz, 10 MHz bandwidth, with $p=4$ and CSI-RS periodicity $P=20$ ms.

Evaluation metrics:

  • Squared generalized cosine similarity (SGCS): $|v_{\text{eig}}^H w|^2$, between the eigen/preferred beam and the reconstructed beam.
  • Beamforming gain: $G(w)=\frac{w^H R w}{v_{\text{eig}}^H R v_{\text{eig}}}$.
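Both metrics are straightforward to compute given the channel covariance $R$, its dominant eigenvector $v_{\text{eig}}$, and a reconstructed precoder $w$; this is a generic sketch, not the papers' evaluation code:

```python
import numpy as np

def sgcs(v_eig: np.ndarray, w: np.ndarray) -> float:
    """Squared generalized cosine similarity |v^H w|^2 between the
    (normalized) eigen beam and the reconstructed precoder."""
    v = v_eig / np.linalg.norm(v_eig)
    u = w / np.linalg.norm(w)
    return float(np.abs(np.vdot(v, u)) ** 2)  # vdot conjugates its first arg

def bf_gain(w: np.ndarray, v_eig: np.ndarray, R: np.ndarray) -> float:
    """Beamforming gain ratio G(w) = (w^H R w) / (v_eig^H R v_eig)."""
    num = np.real(np.vdot(w, R @ w))
    den = np.real(np.vdot(v_eig, R @ v_eig))
    return float(num / den)

# A perfect reconstruction scores SGCS = 1 (and BF gain = 1).
v = np.array([1.0, 1j]) / np.sqrt(2)
assert abs(sgcs(v, v) - 1.0) < 1e-12
```

SGCS measures beam-direction alignment only, while the BF gain ratio weights misalignment by the channel covariance, so the two can disagree in strongly non-isotropic channels.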

Reported findings:

  • SGCS saturates near 0.7 at $\geq 10$ dB SNR and closely matches full-port EVD except at the lowest SNRs.
  • Individual head accuracies: beamset $>90\%$ above 0 dB, beam indices ~80% above 5 dB, amplitudes ~70% above 5 dB, phases ~60% above 10 dB.
  • The BF gain gap remains $<0.03$ at medium-to-high SNR.
  • Computationally, full-array EVD scales as $O(N_t^3)$, whereas CsiAdaNet inference is $O(pD^2)$ (with $D$ the latent size), independent of $N_t$ for large arrays.

WideNLNet-CA (for near-field wideband XL-MIMO) further extends CsiAdaNet principles, introducing compression-ratio adaptive modules (CAM) that dynamically select important features/channels for serialized feedback, with a single model serving multiple feedback rates (Liu et al., 1 Aug 2025). It reports NMSE improvements of up to 6 dB over baselines, fast inference (~1.6 ms for 1.5M parameters), and a low memory footprint.

7. Limitations, Open Challenges, and Prospects

CsiAdaNet frameworks require accurate knowledge of environment/topology for full adaptation benefits; scene graph construction is nontrivial in dynamic settings. Robustness to channel statistic drift, compatibility with hierarchical or online codebook learning, and scalability to large outdoor scenarios represent ongoing challenges.

Potential avenues involve hierarchical quantization, multi-task UL/DL training, further sparsification for ultra-dense arrays, and embedding real-time sensor-based environment updates. The paradigm enables resource-efficient CSI acquisition with strong generalization, forming a foundation for CSI feedback in next-generation cellular systems.


Key References:

  • "Deep Learning assisted Port-Cycling based Channel Sounding for Precoder Estimation in Massive MIMO Arrays" (Arun et al., 21 Jan 2026)
  • "AdapCsiNet: Environment-Adaptive CSI Feedback via Scene Graph-Aided Deep Learning" (Liu et al., 15 Apr 2025)
  • "Deep Learning-Based Rate-Adaptive CSI Feedback for Wideband XL-MIMO Systems in the Near-Field Domain" (Liu et al., 1 Aug 2025)
  • "Learning Representations for CSI Adaptive Quantization and Feedback" (Rizzello et al., 2022)
  • "Adaptive CSI Feedback for Deep Learning-Enabled Image Transmission" (Zhang et al., 2023)
