Sustainable LSTM-Based Precoding for RIS-Aided mmWave MIMO Systems with Implicit CSI

Published 16 Sep 2025 in eess.SP, cs.IT, cs.LG, and math.IT | (2509.12658v1)

Abstract: In this paper, we propose a sustainable long short-term memory (LSTM)-based precoding framework for reconfigurable intelligent surface (RIS)-assisted millimeter-wave (mmWave) MIMO systems. Instead of explicit channel state information (CSI) estimation, the framework exploits uplink pilot sequences to implicitly learn channel characteristics, reducing both pilot overhead and inference complexity. Practical hardware constraints are addressed by incorporating the phase-dependent amplitude model of RIS elements, while a multi-label training strategy improves robustness when multiple near-optimal codewords yield comparable performance. Simulations show that the proposed design achieves over 90% of the spectral efficiency of exhaustive search (ES) with only 2.2% of its computation time, cutting energy consumption by nearly two orders of magnitude. The method also demonstrates resilience under distribution mismatch and scalability to larger RIS arrays, making it a practical and energy-efficient solution for sustainable 6G wireless networks.


Summary

The paper presents an LSTM-based precoding method that achieves over 90% of the spectral efficiency of exhaustive search (ES) under practical hardware constraints.
The framework leverages implicit CSI from uplink pilots and a multi-label training strategy to significantly reduce computational latency and energy consumption.
Numerical results show robust performance with only 2.2% of ES computation time and scalability for large RIS arrays, paving the way for sustainable 6G deployments.


Introduction and Motivation

The paper addresses the challenge of efficient precoding in reconfigurable intelligent surface (RIS)-assisted millimeter-wave (mmWave) multiple-input multiple-output (MIMO) systems, focusing on sustainable and scalable solutions for 6G wireless networks. The core problem is the joint optimization of active beamforming at the base station (BS) and passive reflection at the RIS, under practical hardware constraints, to maximize spectral efficiency. Traditional approaches such as exhaustive search (ES) and alternating optimization (AO) are computationally prohibitive, especially when considering the phase-dependent amplitude response of RIS elements. The work proposes a long short-term memory (LSTM)-based deep learning (DL) framework that leverages uplink pilot sequences to implicitly learn channel characteristics, thereby bypassing explicit channel state information (CSI) estimation and significantly reducing both pilot overhead and inference complexity.

System Model and Problem Formulation

The considered system comprises a BS with $N_t$ antennas, a user with $N_r$ antennas, and a RIS with $N = N_h \times N_v$ passive elements arranged in a uniform planar array (UPA). The direct BS-user link is assumed blocked, so all communication is via the RIS. The RIS response is modeled as a diagonal matrix $\boldsymbol{\Psi}$, where each diagonal entry is $\beta_n e^{j\psi_n}$, with $\beta_n$ capturing the phase-dependent amplitude response, a critical hardware constraint.
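As a minimal sketch of the RIS response matrix, the snippet below builds $\boldsymbol{\Psi}$ from a vector of phase shifts. The summary does not state the exact amplitude law, so the function `ris_amplitude` assumes a form commonly used in the RIS literature, $\beta(\psi) = (1-\beta_{\min})\left(\frac{\sin(\psi-\phi)+1}{2}\right)^{\alpha} + \beta_{\min}$, and the values of `beta_min`, `phi`, and `alpha` are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def ris_amplitude(psi, beta_min=0.2, phi=0.43 * np.pi, alpha=1.6):
    """Phase-dependent amplitude in [beta_min, 1] (assumed model and parameters)."""
    return (1.0 - beta_min) * ((np.sin(psi - phi) + 1.0) / 2.0) ** alpha + beta_min

def ris_response(psi):
    """Diagonal RIS matrix with entries beta_n * exp(j * psi_n)."""
    psi = np.asarray(psi, dtype=float)
    return np.diag(ris_amplitude(psi) * np.exp(1j * psi))

# Example: 8 elements with phases spread uniformly over one period.
psi = np.linspace(-np.pi, np.pi, 8, endpoint=False)
Psi = ris_response(psi)
```

Under this assumed model the reflection amplitude is smallest near one particular phase shift and approaches 1 half a period away, which is why a codebook designed for unit-amplitude elements may need adjusting.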

The optimization objective is to maximize the downlink spectral efficiency: $\max_{\boldsymbol{F}, \boldsymbol{\Psi}} \log_2 \left| \boldsymbol{I}_{N_s} + \frac{P}{\sigma^2 N_s} \boldsymbol{H}_{\text{eff}}^H \boldsymbol{F} \boldsymbol{F}^H \boldsymbol{H}_{\text{eff}} \right|$ subject to transmit power and RIS hardware constraints. The effective channel $\boldsymbol{H}_{\text{eff}}$ is the cascaded BS-RIS-user channel. The non-convexity arises from the nonlinear amplitude–phase coupling in $\boldsymbol{\Psi}$ and the discrete nature of codebook-based selection.
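The objective can be evaluated numerically as in the sketch below. It is written under stated assumptions: `H_eff` is stored as an $N_r \times N_t$ matrix and `F` as $N_t \times N_s$, so the $N_s \times N_s$ matrix inside the determinant is formed from $\boldsymbol{G} = \boldsymbol{H}_{\text{eff}} \boldsymbol{F}$ (where the Hermitian transpose sits depends on how $\boldsymbol{H}_{\text{eff}}$ is defined). The i.i.d. Gaussian channels, the unit-amplitude RIS, the normalization $\|\boldsymbol{F}\|_F^2 = N_s$, and all names are hypothetical choices for illustration, not the paper's mmWave channel model.

```python
import numpy as np

def spectral_efficiency(H_eff, F, P, sigma2):
    """Spectral efficiency in bit/s/Hz.

    Assumed shapes: H_eff is (Nr, Nt), F is (Nt, Ns).
    """
    Ns = F.shape[1]
    G = H_eff @ F                                    # (Nr, Ns) precoded channel
    M = np.eye(Ns) + (P / (sigma2 * Ns)) * (G.conj().T @ G)
    _, logdet = np.linalg.slogdet(M)                 # natural-log determinant
    return float(logdet / np.log(2.0))

def crandn(rng, *shape):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

rng = np.random.default_rng(0)
Nt, Nr, N, Ns = 8, 2, 16, 2
H_br = crandn(rng, N, Nt)                            # BS -> RIS (hypothetical channel)
H_ru = crandn(rng, Nr, N)                            # RIS -> user (hypothetical channel)
Psi = np.diag(np.exp(1j * rng.uniform(0, 2 * np.pi, N)))  # unit amplitude for brevity
H_eff = H_ru @ Psi @ H_br                            # cascaded channel, direct link blocked
F = crandn(rng, Nt, Ns)
F *= np.sqrt(Ns) / np.linalg.norm(F)                 # assumed power normalization
se = spectral_efficiency(H_eff, F, P=1.0, sigma2=0.1)
```

Using `slogdet` rather than `det` followed by `log2` avoids overflow when the SNR or the matrix dimension is large.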

To make the problem tractable, the RIS configuration is selected from a Kronecker-structured DFT codebook that quantizes azimuth and elevation angles. The codebook is adjusted to account for the practical amplitude–phase model, ensuring hardware fidelity.
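A Kronecker-structured DFT codebook of this kind can be sketched as follows. Each codeword is the Kronecker product of one DFT beam per array dimension, so the codebook has $N = N_h N_v$ unit-modulus columns. How the paper adjusts the codebook for the practical model is not described in this summary; `practical_codebook` assumes the simplest option (keep each codeword's phases and attach the phase-dependent amplitude), and the amplitude law and its parameters are likewise assumptions.

```python
import numpy as np

def dft_phases(n):
    """Unit-modulus n x n DFT matrix; column k is the k-th beam on a uniform grid."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n)

def kron_dft_codebook(n_h, n_v):
    """Ideal codebook: columns are Kronecker products of the per-dimension beams."""
    return np.kron(dft_phases(n_h), dft_phases(n_v))       # shape (N, N)

def amplitude(psi, beta_min=0.2, phi=0.43 * np.pi, alpha=1.6):
    """Assumed phase-dependent amplitude model with illustrative parameters."""
    return (1.0 - beta_min) * ((np.sin(psi - phi) + 1.0) / 2.0) ** alpha + beta_min

def practical_codebook(n_h, n_v):
    """Assumed adjustment: same phases, amplitudes from the practical model."""
    phases = np.angle(kron_dft_codebook(n_h, n_v))
    return amplitude(phases) * np.exp(1j * phases)

C_ideal = kron_dft_codebook(4, 4)    # 16 codewords for a 4 x 4 RIS
C_prac = practical_codebook(4, 4)
```

Which Kronecker factor corresponds to azimuth and which to elevation depends on how the UPA elements are ordered, so that mapping should be checked against the array geometry in use.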

LSTM-Based Precoding Framework

Input Representation and Preprocessing

Instead of explicit CSI, the framework uses uplink pilot sequences received at the BS under a fixed reference RIS state. This design avoids the circularity problem where the input could implicitly encode the target RIS configuration. The pilot matrix is preprocessed by separating real and imaginary parts and arranging them into a real-valued input vector suitable for neural network ingestion.
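The preprocessing step can be sketched as below. The layout is an assumption: the received pilot matrix `Y` is taken to have one row per pilot symbol and one column per BS antenna, so each row becomes one time step of the LSTM input. The function name `preprocess_pilots` is hypothetical.

```python
import numpy as np

def preprocess_pilots(Y):
    """Map a complex pilot matrix (K x Nt) to a real array (K x 2*Nt).

    Real parts fill the first Nt columns, imaginary parts the last Nt.
    """
    Y = np.asarray(Y)
    return np.concatenate([Y.real, Y.imag], axis=1).astype(np.float32)

rng = np.random.default_rng(1)
K, Nt = 4, 3                                  # hypothetical pilot length and antenna count
Y = rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))
X_seq = preprocess_pilots(Y)                  # sequence form, shape (K, 2*Nt)
x_vec = X_seq.reshape(-1)                     # flat vector form, length 2*K*Nt
```

Keeping real and imaginary parts as separate features preserves the phase information of the pilots while letting a real-valued network consume them.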

Model Architecture

The LSTM-based model consists of:

Input layer: Receives the preprocessed pilot vector.
Two LSTM layers: Each with 140 neurons and tanh activation, capturing temporal dependencies in the pilot sequence.
Two fully connected layers: With 200 and 100 neurons, batch normalization, and Leaky ReLU activation.
Output layer: $N$ neurons (codebook size), with Sigmoid activation for multi-label classification.
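The layer list above could be expressed in PyTorch roughly as follows. This is a sketch under assumptions, not the authors' code: the class name `LSTMPrecoder` is hypothetical, the head is fed the LSTM output at the last time step, and `nn.LeakyReLU` uses its default negative slope because the summary gives none. `nn.LSTM` already applies tanh to the cell state and candidate values, which matches the stated activation.

```python
import torch
from torch import nn

class LSTMPrecoder(nn.Module):
    """Two LSTM layers, two dense layers, and a Sigmoid multi-label output."""

    def __init__(self, input_size, codebook_size, hidden_size=140):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers=2, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_size, 200), nn.BatchNorm1d(200), nn.LeakyReLU(),
            nn.Linear(200, 100), nn.BatchNorm1d(100), nn.LeakyReLU(),
            nn.Linear(100, codebook_size), nn.Sigmoid(),
        )

    def forward(self, x):
        out, _ = self.lstm(x)            # (batch, K, hidden_size)
        return self.head(out[:, -1])     # assumed: summarize with the last time step

torch.manual_seed(0)
K, Nt, N = 10, 8, 64                     # hypothetical pilot length, antennas, codebook size
model = LSTMPrecoder(input_size=2 * Nt, codebook_size=N).eval()
with torch.no_grad():
    probs = model(torch.randn(4, K, 2 * Nt))   # one score per codeword, shape (4, N)
```

Because the output uses an independent Sigmoid per codeword rather than a softmax, several codewords can receive high scores at once, which is what the multi-label formulation requires.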

Multi-Label Training Strategy

Recognizing the non-convexity of RIS optimization, the model is trained using a multi-label approach. Instead of a single optimal codeword, all codewords within 0.5 dB of the ES optimum are treated as positive labels. This mitigates overfitting to a single solution and enhances robustness to distribution mismatch, as multiple near-optimal configurations often exist due to channel randomness.
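The labeling rule can be illustrated with a short sketch. It assumes that an exhaustive search has already produced one positive, linear-scale score per codeword, and that the 0.5 dB tolerance is measured on that score relative to the best codeword; the summary does not say which quantity is compared in dB, so this choice and the function name `multilabel_targets` are assumptions.

```python
import numpy as np

def multilabel_targets(metric, tol_db=0.5):
    """Mark every codeword whose score is within tol_db of the best one.

    metric: positive linear-scale score per codeword (assumed ES output).
    """
    metric = np.asarray(metric, dtype=float)
    gap_db = 10.0 * np.log10(metric.max() / metric)    # 0 dB at the ES optimum
    return (gap_db <= tol_db).astype(np.float32)

scores = np.array([1.00, 0.95, 0.80, 0.90, 0.50])      # hypothetical ES scores
labels = multilabel_targets(scores)                    # -> [1, 1, 0, 1, 0]
```

In this toy example three of the five codewords are positives, so a prediction of any of them is treated as correct during training instead of being penalized for missing the single best index.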

Loss Function and Optimization

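The model is trained with a cross-entropy loss function and optimized using Adam. Targets are encoded as binary indicator vectors over the codebook, with a 1 for every positive codeword (a multi-hot generalization of one-hot encoding), and a threshold on the Sigmoid outputs determines which codewords are selected during inference.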
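A minimal sketch of the loss and the inference rule follows. It assumes the cross-entropy is applied independently to each Sigmoid output (binary cross-entropy), which is the usual pairing for multi-label targets; the threshold of 0.5 and the fallback to the highest-scoring codeword when nothing passes the threshold are illustrative assumptions, as are the function names.

```python
import numpy as np

def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean per-output binary cross-entropy between Sigmoid scores and 0/1 targets."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=float)
    return float(np.mean(-(targets * np.log(probs) + (1 - targets) * np.log(1 - probs))))

def select_codewords(probs, threshold=0.5):
    """Indices of codewords scoring at or above the threshold (assumed rule)."""
    probs = np.asarray(probs, dtype=float)
    idx = np.flatnonzero(probs >= threshold)
    # Assumed fallback: if no score passes, keep the single best codeword.
    return idx if idx.size else np.array([int(np.argmax(probs))])

targets = np.array([1, 1, 0, 1, 0])
good = np.array([0.9, 0.8, 0.1, 0.7, 0.2])     # scores that agree with the targets
bad = np.array([0.2, 0.3, 0.9, 0.4, 0.8])      # scores that contradict them
loss_good = binary_cross_entropy(good, targets)
loss_bad = binary_cross_entropy(bad, targets)
chosen = select_codewords(good)                 # indices 0, 1, and 3
```

When several codewords pass the threshold, a deployment would still need a tie-breaking rule, for example taking the highest score; the summary does not specify one.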

Complexity and Scalability

The LSTM inference requires only a single forward pass with complexity $\mathcal{O}(K \cdot H)$, where $K$ is the pilot length and $H$ is the hidden layer size. This is a substantial reduction compared to ES ($\mathcal{O}(Q \cdot N_t N_r)$) and AO ($\mathcal{O}(I \cdot Q \cdot N_t N_r)$), with $Q$ presumably the codebook size and $I$ the number of AO iterations, and it enables real-time operation even for large RIS arrays.

Numerical Results and Analysis

Spectral Efficiency and Latency

The LSTM-based approach achieves over 90% of the spectral efficiency of ES (specifically, 90.6%–92.2% under practical RIS models), with only 2.2% of ES computation time (0.59 ms vs. 26.75 ms per realization).
Compared to AO, LSTM reduces latency by over 97% while maintaining a performance gap of less than 3%.
LSTM consistently outperforms the CNN baseline, achieving up to 35.8% higher spectral efficiency.
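The reported 2.2% figure follows directly from the two per-realization times, and the same numbers give the implied speedup over ES:

```latex
\frac{0.59\ \text{ms}}{26.75\ \text{ms}} \approx 0.022 = 2.2\%,
\qquad
\frac{26.75\ \text{ms}}{0.59\ \text{ms}} \approx 45.3
```

That is, one LSTM inference takes roughly one forty-fifth of the time of one exhaustive search, a latency reduction of about 97.8% relative to ES.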

Impact of Multi-Label Training

Multi-label LSTM training yields 7.8%–29% improvement in spectral efficiency under ideal RIS models and 3.8%–16.1% improvement under practical models compared to single-label training.
The approach is robust to distribution mismatch, maintaining above 88% of ES performance when propagation parameters are perturbed by ±20%.

Energy Efficiency

The LSTM-based method reduces inference energy consumption by nearly two orders of magnitude compared to ES or AO, aligning with the objectives of Green AI and sustainable 6G deployments.

Scalability

The computational advantage of LSTM becomes more pronounced as the RIS array size increases, because the codebook searched by ES/AO grows linearly with $N_h \times N_v$, whereas the complexity of LSTM inference stays fixed.

Practical and Theoretical Implications

The proposed framework demonstrates that sustainable, real-time precoding for RIS-assisted mmWave MIMO systems is feasible without explicit CSI estimation. By leveraging implicit CSI from pilot sequences and a multi-label LSTM architecture, the approach achieves near-optimal spectral efficiency with drastically reduced computational and energy costs. The explicit modeling of phase-dependent amplitude ensures hardware compatibility, a critical aspect often neglected in prior DL-based RIS optimization works.

Theoretically, the work highlights the importance of multi-label learning in non-convex wireless optimization problems, where multiple near-optimal solutions exist. The empirical results confirm that this strategy improves generalization and robustness, especially under distribution shifts.

Future Directions

Potential extensions include:

Multi-user and high-mobility scenarios, where the scalability and robustness of the LSTM framework can be further validated.
Online adaptation and continual learning to handle rapidly changing environments.
Integration with other sustainable AI techniques, such as model compression and quantization, for further energy savings.
Hardware-in-the-loop experiments to validate real-world deployment feasibility.

Conclusion

This work presents a sustainable LSTM-based precoding framework for RIS-assisted mmWave MIMO systems, achieving over 90% of ES spectral efficiency with only 2.2% of its computation time and nearly two orders of magnitude lower energy consumption. The approach is robust to distribution mismatch, scalable to large RIS arrays, and compatible with practical hardware constraints, making it a viable solution for real-time, energy-efficient 6G wireless networks. The multi-label training strategy is particularly effective in addressing the non-convexity and ambiguity inherent in RIS optimization. Future research should explore multi-user extensions and real-world deployment scenarios.
