
Stacked Intelligent Metasurface (SIM)

Updated 29 January 2026
  • SIM is a multilayer programmable electromagnetic structure that performs analog computations on propagating waves via cascaded, subwavelength-controlled metasurfaces.
  • It enables real-time beamforming, spatial transforms, and wave-based computing for applications such as multi-modal communications and integrated sensing.
  • Advanced wave propagation models and gradient-based optimization techniques are used to achieve high energy efficiency and significant performance gains over single-layer designs.

A stacked intelligent metasurface (SIM) is a multilayer, programmable electromagnetic structure that performs linear—or more generally, analog computational—operations on propagating electromagnetic waves. By engineering the transmission (and possibly reflection) properties of each constituent metasurface layer at subwavelength resolution, SIMs enable ultra-fast, energy-efficient, and highly reconfigurable manipulation of EM fields for advanced tasks in communications, sensing, and computation. SIMs generalize the concept of single-layer reconfigurable intelligent surfaces (RIS), surpassing their functional limitations by leveraging cascaded free-space propagation and programmable meta-atom responses across multiple layers. The result is a device capable of realizing complex transfer matrices, including beamforming, spatial transforms, and direct analog information processing, at the physical layer and in real time.

1. Multilayer Physical Architecture and Wave-Domain Computing

A canonical SIM is realized as a serial stack of $L$ programmable metasurface layers, each consisting of an $N$-element subwavelength array of meta-atoms aligned in the transverse ($x$-$y$) plane. Each meta-atom provides an electronically tunable transmission coefficient $z_n^{(l)} = a_n^{(l)} e^{j \phi_n^{(l)}}$, with amplitude $a_n^{(l)} \in [0, 1)$ and phase $\phi_n^{(l)} \in [0, 2\pi)$, typically adjusted via varactor diodes or MEMS (Huang et al., 14 Jun 2025). The inter-layer separation $d_s$ is on the order of a fraction of the carrier wavelength, typically $d_s = D / (L-1)$ for a total stack thickness $D$ ($D \sim 10\lambda$ at 28 GHz is common).

When an input electromagnetic field illuminates the structure, it is sequentially modulated (via each layer’s diagonal transmission matrix $Z^{(l)}$) and diffracted (via free-space Green’s function matrices $W^{(l+1)}$) (Huang et al., 14 Jun 2025, An et al., 2023). The field vector at the $l$-th layer evolves as

$$v^{(l)} = W^{(l)} Z^{(l-1)} v^{(l-1)},$$

with the overall transfer operator for the $L$-layer SIM given by

$$B = Z^{(L)} W^{(L)} Z^{(L-1)} \cdots W^{(2)} Z^{(1)}.$$

The output field at a receiver point $m$ is $y_m = h_m^T B w^{(1)}$, where $w^{(1)}$ encodes the feed coupling and $h_m$ models the channel from the output layer to receiver $m$. The full architecture thus realizes a large-dimensional, trainable linear mapping that is functionally analogous to a diffractive neural network but is implemented intrinsically by EM physics at the speed of light (An et al., 22 Jan 2026).
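The cascade above can be sketched numerically. The following is a minimal illustration, not a reference implementation: the sizes `L`, `N`, `M` are arbitrary, and the inter-layer matrices, feed vector, and receiver channels are random placeholders (assumptions made only so the script runs) rather than physical diffraction or channel models. It builds $B$ as an explicit matrix product and checks that it agrees with the layer-by-layer recursion started from $v^{(1)} = w^{(1)}$.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N, M = 4, 16, 3  # layers, meta-atoms per layer, receiver points (illustrative)

# Per-layer transmission coefficients z_n^(l) = a_n^(l) * exp(j * phi_n^(l)).
amps = rng.uniform(0.0, 1.0, size=(L, N))
phases = rng.uniform(0.0, 2 * np.pi, size=(L, N))
Z = [np.diag(amps[l] * np.exp(1j * phases[l])) for l in range(L)]

def random_complex(shape):
    """Placeholder complex Gaussian entries (not a physical model)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Placeholders: W[i] stands in for W^(i+2), w1 for the feed vector w^(1),
# and row m of H for h_m^T.
W = [random_complex((N, N)) / np.sqrt(N) for _ in range(L - 1)]
w1 = random_complex(N)
H = random_complex((M, N))

# Overall operator B = Z^(L) W^(L) Z^(L-1) ... W^(2) Z^(1).
B = Z[0]
for l in range(1, L):
    B = Z[l] @ W[l - 1] @ B

# Layer-by-layer recursion v^(l) = W^(l) Z^(l-1) v^(l-1), with v^(1) = w^(1).
v = w1
for l in range(1, L):
    v = W[l - 1] @ (Z[l - 1] @ v)
field_out = Z[L - 1] @ v  # field leaving the last layer

y = H @ (B @ w1)  # y_m = h_m^T B w^(1) for every receiver point m
```

Because every step is a matrix product, the whole structure stays linear in the input field; the trainable part is confined to the diagonal entries of each $Z^{(l)}$.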

2. Mathematical Modeling and Optimization of SIMs

The design and configuration of SIMs are governed by a combination of analytical wave propagation models and gradient-based optimization techniques. Propagation between meta-atoms across layers is rigorously modeled by the Rayleigh–Sommerfeld diffraction kernel:

$$w^l_{n,\check{n}} = \frac{S_a}{j\lambda d^l_{n,\check{n}}} e^{-j2\pi d^l_{n,\check{n}}/\lambda},$$

where $d^l_{n,\check{n}}$ is the Euclidean distance between atoms and $S_a \sim \lambda^2$ is the atom area. Free-space propagation and in-layer modulation are thus cascaded.
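As a rough sketch of how the kernel as written above turns into an inter-layer matrix, the snippet below evaluates it between two parallel square grids. The helper names, the 4 by 4 grid, the half-wavelength pitch, the atom area of pitch squared, and the one-wavelength layer gap are all illustrative assumptions, not values taken from the cited works.

```python
import numpy as np

def grid_positions(n_side, pitch, z):
    """(n_side**2, 3) coordinates of a centred square meta-atom grid at height z."""
    c = (np.arange(n_side) - (n_side - 1) / 2) * pitch
    x, y = np.meshgrid(c, c, indexing="ij")
    return np.column_stack([x.ravel(), y.ravel(), np.full(x.size, z)])

def diffraction_matrix(src, dst, wavelength, atom_area):
    """W[n, n'] = S_a / (j * lambda * d) * exp(-j * 2 * pi * d / lambda),
    where d is the distance from source atom n' to destination atom n."""
    d = np.linalg.norm(dst[:, None, :] - src[None, :, :], axis=-1)
    return atom_area / (1j * wavelength * d) * np.exp(-2j * np.pi * d / wavelength)

wavelength = 299_792_458.0 / 28e9  # carrier wavelength at 28 GHz, in metres
pitch = wavelength / 2             # assumed meta-atom spacing
atom_area = pitch ** 2             # assumed atom area S_a
layer_gap = wavelength             # assumed inter-layer separation d_s

lower = grid_positions(4, pitch, 0.0)
upper = grid_positions(4, pitch, layer_gap)
W = diffraction_matrix(lower, upper, wavelength, atom_area)  # shape (16, 16)
```

With identical, aligned grids the resulting matrix is symmetric, since the distance from atom $\check{n}$ to atom $n$ equals the distance in the opposite direction.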

The SIM is programmed for a target functionality—such as generating a spatial energy distribution at a receiver array—by minimizing a loss function, e.g.,

$$\mathcal{L}(\{ a_n^{(l)}, \phi_n^{(l)}\}) = \frac{1}{M} \sum_{m=1}^M \left( \zeta |y_m|^2 - P_m^{\mathrm{tg}} \right)^2,$$

where $P_m^{\mathrm{tg}}$ is the desired (e.g., binary edge map) pattern, and $\zeta$ is a normalization factor. Gradients of $\mathcal{L}$ with respect to each meta-atom’s parameters are computed via backpropagation through the linear chain of $Z^{(l)}$ and $W^{(l)}$. Training employs projected (mini-batch) gradient descent, with the parameters projected back onto the feasible set $0 \le a < 1$, $0 \le \phi < 2\pi$ after each step (Huang et al., 14 Jun 2025, Huang et al., 2024).
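A compact sketch of this training loop follows, under several stated assumptions: the diffraction matrices, feed vector, and receiver channels are random placeholders; $\zeta$ is fixed to 1; a single full-batch target replaces mini-batches; and a simple backtracking step-size rule (not described in the source) is added so the loss never increases. The gradients are obtained by manually backpropagating through the chain, using $\partial |y_m|^2 / \partial \theta = 2\,\mathrm{Re}(y_m^* \, \partial y_m / \partial \theta)$ for a real parameter $\theta$.

```python
import numpy as np

rng = np.random.default_rng(1)
L, N, M = 3, 16, 8  # layers, atoms per layer, receiver points (illustrative)

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Placeholders: W[l] feeds layer l (W[0] unused), feed vector w1, rows of H are h_m^T.
W = [None] + [crandn(N, N) / np.sqrt(N) for _ in range(L - 1)]
w1, H = crandn(N), crandn(M, N) / np.sqrt(N)
P_tg = (np.arange(M) % 2).astype(float)  # toy binary target pattern
zeta = 1.0                               # normalization factor, fixed here

def forward(a, phi):
    """Return y and the fields u[l] incident on each layer."""
    u, v = [], w1
    for l in range(L):
        if l > 0:
            v = W[l] @ v
        u.append(v)
        v = a[l] * np.exp(1j * phi[l]) * v
    return H @ v, u

def loss(a, phi):
    y, _ = forward(a, phi)
    return np.mean((zeta * np.abs(y) ** 2 - P_tg) ** 2)

def gradients(a, phi):
    """Backpropagate dLoss/da and dLoss/dphi through the Z/W chain."""
    y, u = forward(a, phi)
    err = (2.0 / M) * zeta * (zeta * np.abs(y) ** 2 - P_tg)  # dLoss/d|y_m|^2
    G = H.copy()  # maps the field leaving the current layer to y
    ga, gp = np.zeros_like(a), np.zeros_like(phi)
    for l in reversed(range(L)):
        e = np.exp(1j * phi[l])
        # back[n] = sum_m err_m * conj(y_m) * dy_m/da_n, with dy_m/da_n = G[m,n] e_n u_n
        back = (err * np.conj(y)) @ (G * (e * u[l]))
        ga[l] = 2 * np.real(back)
        gp[l] = 2 * np.real(1j * a[l] * back)  # dy_m/dphi_n = j * a_n * dy_m/da_n
        if l > 0:
            G = (G * (a[l] * e)) @ W[l]
    return ga, gp

def project(a, phi):
    """Project onto 0 <= a < 1 and wrap phi into [0, 2*pi)."""
    return np.clip(a, 0.0, 1.0 - 1e-9), np.mod(phi, 2 * np.pi)

a = rng.uniform(0.2, 0.9, (L, N))
phi = rng.uniform(0.0, 2 * np.pi, (L, N))
history = [loss(a, phi)]
step = 1.0
for _ in range(200):
    ga, gp = gradients(a, phi)
    # Backtracking: halve the step until the projected update does not increase the loss.
    for _ in range(20):
        a_new, phi_new = project(a - step * ga, phi - step * gp)
        if loss(a_new, phi_new) <= history[-1]:
            a, phi = a_new, phi_new
            break
        step *= 0.5
    history.append(loss(a, phi))
```

In practice an automatic-differentiation framework would replace the hand-written `gradients` function; the manual version is shown only to make the backward pass through the $Z^{(l)}$ and $W^{(l)}$ chain explicit.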

For multiuser beamforming, the SIM’s phase profiles are optimized to synthesize user-orthogonal beams, with alternating optimization over power allocation and phase settings yielding locally optimal sum-rate performance:

$$\max_{\{p_k\},\,\{ \phi_n^{(l)} \}} \; \sum_{k=1}^K \log_2( 1 + \gamma_k ),$$

where $\gamma_k$ is the SINR at user $k$ (An et al., 2023, An et al., 2023). Both analytic and deep reinforcement learning–based (e.g., DDPG actor-critic) strategies have been successfully employed for highly nonconvex settings (Liu et al., 2024).
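To make the objective concrete, the sketch below evaluates the sum rate for a given power allocation. It assumes a hypothetical $K \times K$ effective channel `H_eff` whose entry $(k, j)$ is the end-to-end complex gain from stream $j$ (through the SIM and the wireless channel) to user $k$; this matrix, the function name, and the noise model are illustrative assumptions rather than the formulation of the cited papers.

```python
import numpy as np

def sum_rate(H_eff, powers, noise_power):
    """Return (sum_k log2(1 + SINR_k), SINR vector) for stream powers p_k."""
    gains = np.abs(H_eff) ** 2 * powers[None, :]  # power of stream j received at user k
    signal = np.diag(gains)                       # intended stream at each user
    interference = gains.sum(axis=1) - signal     # all other streams
    sinr = signal / (interference + noise_power)
    return np.sum(np.log2(1.0 + sinr)), sinr

# Toy case: perfectly user-orthogonal beams (diagonal effective channel), unit noise.
H_eff = np.diag([1.0, 2.0])
rate, sinr = sum_rate(H_eff, powers=np.array([3.0, 0.75]), noise_power=1.0)
# Both users see SINR 3, so each contributes log2(4) = 2 bit/s/Hz and rate is 4.0.
```

An alternating scheme would call such a function inside two nested updates: one over the powers with the phases fixed, and one over the phases (which change `H_eff`) with the powers fixed.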

3. Communication, Sensing, and In-Wave Computing Applications

SIMs have been adopted for a variety of EM-domain tasks:

  • Multi-Modal Semantic Communications: SIMs enable direct wave-domain imaging of visual semantic maps (such as edge patterns) while simultaneously transmitting textual semantic metadata via amplitude-phase modulations. A generative-adversarial model at the receiver fuses the SIM-imaged pattern and the textual description for scene reconstruction, yielding high-fidelity output with drastically reduced bandwidth compared to bitstream-based schemes (Huang et al., 14 Jun 2025).
  • Multiuser Beamforming and Holographic MIMO: SIMs perform multiuser MISO/MIMO beamforming and HMIMO channel diagonalization entirely in the analog domain, eliminating digital baseband computation and dramatically reducing the RF hardware count. System-level evaluations show sum-rate improvements of up to $2$–$3\times$ over conventional hybrid schemes with similar hardware budgets (An et al., 2023, An et al., 2023, An et al., 2023).
  • Integrated Sensing and Communications (ISAC): Joint optimization realizes both communication (multiuser downlink) and radar (e.g., beampattern gain in a specified direction) tasks, using penalties to enforce sensing constraints while maximizing SE (Niu et al., 2024, Ranasinghe et al., 29 Apr 2025). Design trade-offs between beamforming DoF, computational complexity, and joint objective regularization are observed.
  • Wave-Based Computing (e.g., 2D DFT for DOA Estimation): SIMs physically implement spatial transforms such as the 2D DFT for direction-of-arrival estimation. Programmable inter-layer phases are adjusted to fit the SIM’s end-to-end transfer function to the DFT matrix, enabling real-time, optical-speed spatial spectrum computation at sub-dB mean-square error (An et al., 2023, An et al., 2024).
  • Task-Oriented Semantic Communications: An electromagnetic neural network (EMNN) realized via SIM performs source and semantic encoding jointly, entirely by diffractive propagation, enabling direct physical-layer image recognition with $>90\%$ test accuracy while omitting digital compression and baseband inference (Huang et al., 2024).
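For the wave-based computing item above, fitting the SIM's end-to-end transfer function to a DFT matrix needs a target matrix and an error measure. The sketch below is one plausible choice and should be read as an assumption: it builds the 2D DFT on a small square aperture as a Kronecker product and scores a candidate transfer matrix `B` by the normalized squared error after applying the least-squares optimal complex scaling factor `beta`, which absorbs the overall attenuation and phase of the stack.

```python
import numpy as np

def dft_matrix(n):
    """Unitary n-point DFT matrix."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)

def fit_error(B, F):
    """Normalized squared error between F and beta * B, with the least-squares
    optimal complex scalar beta = <B, F> / <B, B>."""
    beta = np.vdot(B, F) / np.vdot(B, B)  # np.vdot conjugates its first argument
    err = np.linalg.norm(beta * B - F) ** 2 / np.linalg.norm(F) ** 2
    return err, beta

# 2D DFT acting on a vectorized 4 x 4 aperture, written as a 16 x 16 matrix.
F2 = np.kron(dft_matrix(4), dft_matrix(4))

# A transfer matrix equal to the target up to a complex scale fits perfectly.
err_scaled, beta_scaled = fit_error(0.3j * F2, F2)

# An unrelated matrix (here the identity) fits poorly.
err_identity, _ = fit_error(np.eye(16, dtype=complex), F2)
```

A training loop would minimize `fit_error(B, F2)[0]` over the phases of all layers, with `B` produced by the cascaded forward model.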

4. Performance Analysis and Practical Considerations

Quantitative Metrics

  • Pattern fidelity (MSE, SSIM): As the SIM layer count $L$ increases (e.g., from $L=4$ to $10$), pattern generation error drops rapidly (MSE from $\sim 0.15$ to $0.02$) (Huang et al., 14 Jun 2025).
  • Sum-rate/channel capacity: SIMs routinely achieve $30$–$200\%$ higher sum-rate than single-layer metasurface or digital-only precoders with matched hardware, and converge to within $1$ dB of fully-digital massive MIMO for $L \gtrsim 7$ (An et al., 2023, An et al., 2023, An et al., 2023).
  • Convergence: Custom gradient or alternating optimization algorithms converge in $10$–$50$ iterations (phases, powers), with joint DRL-based methods stabilizing reward within $10^4$ steps (Liu et al., 2024, Liu et al., 2024).
  • Hardware reduction: SIM-based transceivers use only $K$ low-resolution RF chains and DACs for $K$ users, versus $M \gg K$ in conventional architectures (An et al., 2023).

System and Implementation Insights

  • Aperture and layer design: There is a trade-off between the number of meta-atoms and layers ($N$ and $L$) and achievable DoF, with saturation observed due to hardware and mutual coupling limits. For robust FDD or OFDM wideband communication, $L=7$, $N=100$ are typical peak values (Li et al., 1 Mar 2025).
  • Calibration and modeling: Accurate electromagnetic models (including inter-atom coupling and back-reflection) are essential for large-aperture, high-fidelity computing; multi-port network approaches outperform cascade approximations in the presence of strong coupling (Abrardo et al., 5 Jan 2025).
  • Robustness: SIMs have inherent resilience to moderate channel estimation errors and quantized phase control when properly normalized and trained (Huang et al., 14 Jun 2025, An et al., 2023).
  • Energy and hardware efficiency: Wave-domain analog processing takes place as the wave propagates, replacing digital processing latency with nanosecond-scale propagation delay; it also lowers total power and thermal footprint and allows for highly scalable architectures (Renzo, 2024, An et al., 22 Jan 2026). Hybrid active/passive partitioning further boosts gain (Iudice et al., 28 Jan 2026).
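The robustness item above mentions quantized phase control. As a small illustration of what that constraint means, the sketch below rounds continuous phases to $2^b$ uniformly spaced levels and measures the worst-case wrapped phase error, which for uniform rounding is bounded by $\pi / 2^b$. The function name and the bit depths are illustrative assumptions; the cited works may quantize differently.

```python
import numpy as np

def quantize_phase(phi, bits):
    """Round phases to the nearest of 2**bits uniform levels in [0, 2*pi)."""
    levels = 2 ** bits
    step = 2 * np.pi / levels
    # The modulo maps the top rounding bin (index == levels) back to phase 0.
    return (np.round(phi / step) % levels) * step

rng = np.random.default_rng(0)
phi = rng.uniform(0.0, 2 * np.pi, 1000)

max_err = {}
for bits in (1, 2, 3):
    diff = quantize_phase(phi, bits) - phi
    wrapped = np.angle(np.exp(1j * diff))  # wrap the error into (-pi, pi]
    max_err[bits] = np.max(np.abs(wrapped))
```

Training with the quantizer in the loop, or re-optimizing after quantization, are the usual ways to recover part of the loss caused by coarse phase resolution.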

5. Comparison with Single-Layer Metasurfaces and Other Analog Devices

SIMs extend RIS and metasurface lens principles in both spatial depth (number of programmable interfaces) and computational capability. Whereas a single-layer metasurface can implement only a single diagonal phase-amplitude mask, an $L$-layer SIM enables cascaded, neural-network-like analog processing. For equal surface area, an $L$-layer SIM outperforms a one-layer device by up to $200$–$300\%$ in communication and sensing benchmarks (Li et al., 1 Mar 2025).

In contrast to multi-layer dielectric or fixed-phase lenses, all layers in an SIM are reconfigurable, offering dynamic and context-aware adaptation for evolving wireless tasks (Renzo, 2024). Modern implementations support active (amplitude-controlled) and passive (phase-only) partitioning for site-adapted gain-vs-noise optimization (Darsena et al., 27 Oct 2025, Iudice et al., 28 Jan 2026).

6. Key Challenges and Research Directions

  • Electromagnetic modeling: Scaling to ultra-large apertures ($N \gtrsim 10^4$ meta-atoms) requires precise calibration and advanced modeling of mutual coupling and nonparaxial propagation (Abrardo et al., 5 Jan 2025).
  • Control and integration: Managing per-atom control signals (via FPGA/ASIC), power supply, and thermal dissipation for large $N$ and $L$ is nontrivial.
  • Fabrication tolerances: Sub-mm layer alignment and phase precision must be maintained for high-resolution tasks; on-line calibration and in situ optimization are essential (An et al., 22 Jan 2026).
  • Learning and optimization: Data-driven, hardware-in-the-loop, and AI-native design approaches (as realized in NVIDIA Sionna/TensorFlow) are now available for end-to-end, differentiable training, enabling practical system deployment in complex, time-varying environments (Iudice et al., 28 Jan 2026).
  • Future applications: Emerging directions include joint wave-based communications and radar (ISAC), semantic-aware physical-layer designs, direct-wavespace AI, and the integration of nonlinear and active meta-atoms for universal analog computation (Renzo, 2024, An et al., 22 Jan 2026).

7. Summary Table: Representative SIM Application Domains

| Application | Modeled/Emulated Function | Key Performance Gains |
| --- | --- | --- |
| Multi-modal SemCom | Edge imaging + text fusion | Bandwidth savings, SSIM↑ |
| MIMO/HMIMO | Analog channel diagonalization | $2$–$3\times$ sum-rate, hardware↓ |
| ISAC | Communication + beampattern | SE↑, sensing MSE↓ |
| DOA Estimation | Physical 2D DFT engine | MSE $10^{-4}$, no RF chains |
| Task-oriented SemCom | Direct analog image recognition | $>90\%$ accuracy, latency↓ |

These summarized outcomes highlight the capacity of SIMs to unify communications, analog computing, and sensing with high speed, energy efficiency, and functional versatility. Systematic advances in EM modeling, control integration, and optimization will be crucial for large-scale deployment in 6G and beyond (Huang et al., 14 Jun 2025, An et al., 2023, An et al., 2023, Niu et al., 2024, Huang et al., 2024, Renzo, 2024, Abrardo et al., 5 Jan 2025).

