- The paper introduces a novel multi-scale state-space framework that significantly reduces computational demands for radar segmentation and object detection.
- It processes ADC samples sequentially through sample-wise and chirp-wise SSMs, capturing both intra- and inter-chirp dynamics efficiently.
- Experimental results on the RADIal and RaDICaL datasets show competitive accuracy with over 60x computational savings, along with strong segmentation scores.
Overview of SSMRadNet: A Sample-wise State-Space Framework for Radar Processing
The paper "SSMRadNet: A Sample-wise State-Space Framework for Efficient and Ultra-Light Radar Segmentation and Object Detection" (2511.08769) introduces a novel approach for processing Frequency Modulated Continuous Wave (FMCW) radar data using a multi-scale State Space Model (SSM) framework. This architecture, designed specifically for radar segmentation and object detection, significantly reduces computational demands while maintaining competitive performance metrics. The authors present SSMRadNet as a highly efficient framework that sequentially processes raw Analog-to-Digital Converter (ADC) samples through two SSMs to generate meaningful representations for segmentation and detection tasks.
Technical Approach
Motivation and Background
Traditional radar perception models rely on dense point clouds derived from 3D Range-Azimuth-Doppler (RAD) tensors, which are themselves built with multiple stages of Fast Fourier Transforms (FFTs). This pipeline, while effective, introduces significant computational overhead. In contrast, recent advances have explored learning-based approaches that operate directly on ADC cubes, including convolutional, recurrent, and attention-based networks. However, these methods often suffer from increased complexity and computational cost.
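To make the conventional baseline concrete, the sketch below builds an RAD tensor from a raw ADC cube with three FFT stages. The cube dimensions and axis ordering here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def adc_cube_to_rad(adc: np.ndarray) -> np.ndarray:
    """adc has shape (n_rx, n_chirps, n_samples); returns the |RAD| magnitude tensor."""
    r = np.fft.fft(adc, axis=2)   # range FFT over the samples within each chirp
    d = np.fft.fft(r, axis=1)     # Doppler FFT across chirps
    a = np.fft.fft(d, axis=0)     # angle FFT across receive antennas
    return np.abs(a)

rng = np.random.default_rng(0)
cube = rng.standard_normal((4, 64, 256)) + 1j * rng.standard_normal((4, 64, 256))
rad = adc_cube_to_rad(cube)
print(rad.shape)  # (4, 64, 256): angle x Doppler x range bins
```

Every element of the cube passes through three dense transforms before any learning happens, which is the overhead SSMRadNet sidesteps by consuming ADC samples directly.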
SSMRadNet addresses these challenges by leveraging the state-space modeling paradigm, which allows for efficient processing of radar data. The proposed framework models radar data as sequences of tokens, enabling the architecture to capture long-range dependencies through SSMs.
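The core primitive can be summarized as a discrete linear state-space recurrence, x_{t+1} = A x_t + B u_t, y_t = C x_t, run as a sequential scan. The toy scan below is a generic SSM sketch, not the paper's exact parameterization; it illustrates why cost grows linearly with sequence length (one fixed-size state update per token).

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """A: (n, n), B: (n, 1), C: (1, n), u: (T,). Returns outputs y: (T,)."""
    n = A.shape[0]
    x = np.zeros(n)
    y = np.empty(len(u))
    for t, u_t in enumerate(u):        # one O(n^2) update per token -> O(T) overall
        x = A @ x + B[:, 0] * u_t
        y[t] = C[0] @ x
    return y

n = 4
A = 0.9 * np.eye(n)                    # stable state transition
B = np.ones((n, 1))
C = np.ones((1, n)) / n
y = ssm_scan(A, B, C, np.ones(8))
print(y.shape)  # (8,)
```

Because the hidden state has fixed size, memory stays constant no matter how long the token sequence grows.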
Architecture Design
The architecture of SSMRadNet consists of several key components:
- Sample-wise SSM: This component captures intra-chirp correlations by sequentially processing samples from radar receiver channels. Each chirp is processed to extract a feature vector representing range information.
- Chirp-wise SSM: Following intra-chirp processing, chirp-wise features are sequentially processed to capture inter-chirp dynamics such as motion and velocity, generating a comprehensive representation of the radar frame.
- Decoder: The latent representations obtained from SSMs are decoded to produce bird's-eye-view (BEV) occupancy maps for segmentation and detection tasks. The decoder incorporates spatial projection layers and convolutional blocks to refine the output maps.
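The three components above can be sketched end to end: a sample-wise SSM summarizes each chirp into a feature vector, a chirp-wise SSM folds the chirp features into a frame embedding, and a decoder projects that embedding onto a BEV grid. All dimensions, the linear SSM form, and the single-matrix decoder are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def ssm_summarize(seq, A, B):
    """Run x_{t+1} = A x_t + B u_t over seq of shape (T, d_in); return the final state."""
    x = np.zeros(A.shape[0])
    for u_t in seq:
        x = A @ x + B @ u_t
    return x

n_rx, n_chirps, n_samples, d_state = 4, 16, 128, 8
# Real-valued stand-in for a raw ADC frame: (chirps, samples per chirp, RX channels).
frame = rng.standard_normal((n_chirps, n_samples, n_rx))

A_s = 0.95 * np.eye(d_state); B_s = 0.1 * rng.standard_normal((d_state, n_rx))
A_c = 0.95 * np.eye(d_state); B_c = 0.1 * rng.standard_normal((d_state, d_state))

# Sample-wise SSM: one feature vector per chirp (intra-chirp / range information).
chirp_feats = np.stack([ssm_summarize(c, A_s, B_s) for c in frame])
# Chirp-wise SSM: one embedding per frame (inter-chirp / motion information).
frame_feat = ssm_summarize(chirp_feats, A_c, B_c)

# Toy spatial projection standing in for the decoder's projection + conv blocks.
W_dec = 0.05 * rng.standard_normal((32 * 32, d_state))
bev = (W_dec @ frame_feat).reshape(32, 32)   # BEV occupancy logits
print(bev.shape)  # (32, 32)
```

The point of the sketch is the data flow: samples collapse into chirp features, chirp features collapse into a frame embedding, and only then is anything mapped into spatial coordinates.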
Figure 1: SSMRadNet Architecture: Raw complex ADC samples from N_RX receiver channels feed into sample-wise SSM blocks.
Experimental Results
SSMRadNet's efficacy is demonstrated on two major radar perception datasets, RADIal and RaDICaL. The results highlight the framework's ability to achieve significant reductions in parameter count and computational demands while maintaining competitive accuracy.
Implications and Future Work
The introduction of SSMs for radar processing opens a new avenue for efficient and scalable multi-task radar perception. The linear computation scaling with sequence length makes this approach particularly suitable for advanced radar systems with increased resolution and complexity.
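The linear-scaling claim can be checked with back-of-envelope arithmetic (assumed dimensions, not figures from the paper): a linear-scan SSM layer costs on the order of T * n^2 multiplies, while self-attention's score matrix alone costs T^2 * d, so doubling the sequence length doubles SSM cost but quadruples attention cost.

```python
def ssm_mults(T: int, n: int) -> int:
    """Multiplies for T steps of an n x n state update (O(T))."""
    return T * n * n

def attn_mults(T: int, d: int) -> int:
    """Multiplies for the T x T attention score matrix alone (O(T^2))."""
    return T * T * d

# Illustrative state size n=16 and head dimension d=64.
for T in (1_000, 2_000, 4_000):
    print(T, ssm_mults(T, 16), attn_mults(T, 64))
```

This gap widens precisely as radar resolution grows, which is why longer sample and chirp sequences favor the SSM formulation.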
Future research could involve integrating SSMRadNet with multi-modal data sources, such as cameras and LiDAR, to enhance perception capabilities. Additionally, exploring robustness improvements under adverse weather conditions and incorporating motion-aware tracking could further extend the framework's applicability in autonomous systems.
Conclusion
SSMRadNet is a compelling advancement in radar processing, offering an efficient alternative to conventional radar perception models. By utilizing a sample-wise state-space approach, the framework achieves substantial computational savings while maintaining high accuracy in segmentation and detection tasks across multiple datasets. This work sets a benchmark for developing lightweight, radar-specific neural architectures for real-time applications in autonomous driving and beyond.