
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation

Published 24 Jan 2024 in cs.CV (arXiv:2401.13560v4)

Abstract: The Transformer architecture has shown a remarkable ability in modeling global relationships. However, it poses a significant computational challenge when processing high-dimensional medical images. This hinders its development and widespread adoption in this task. Mamba, a State Space Model (SSM), recently emerged as a notable approach for modeling long-range dependencies in sequential data, excelling in the natural language processing field with its remarkable memory efficiency and computational speed. Inspired by its success, we introduce SegMamba, a novel 3D medical image Segmentation Mamba model, designed to effectively capture long-range dependencies within whole-volume features at every scale. In contrast to Transformer-based methods, SegMamba excels in whole-volume feature modeling from a state space model standpoint, maintaining superior processing speed even with volume features at a resolution of 64×64×64. Comprehensive experiments on the BraTS2023 dataset demonstrate the effectiveness and efficiency of SegMamba. The code for SegMamba is available at: https://github.com/ge-xing/SegMamba

Citations (117)

Summary

  • The paper introduces SegMamba, a novel approach integrating state space models into a U-shaped network for superior 3D medical segmentation.
  • It replaces transformer self-attention with efficient Mamba blocks to capture multi-scale and global features while reducing computational load.
  • Experimental evaluations on BraTS2023 demonstrate Dice scores of 93.61%, 92.65%, and 87.71%, underscoring significant improvements over existing models.

Introduction to SegMamba

In the quest for advancements in 3D medical image segmentation, attention has shifted towards unveiling new methods that combine both efficiency and accuracy. Amidst such developments, the introduction of state space models (SSMs), specifically the Mamba model, has sparked significant interest. Designed to adeptly capture long-range dependencies within data sequences, Mamba has been chiefly lauded for its efficiency in natural language processing. In this vein, the introduction of SegMamba signals a paradigm shift for 3D medical image segmentation, wherein SSMs are now paving the way towards new heights of performance and computational speed.

Encoder and Decoder Architecture

SegMamba's architecture comprises three parts: a Mamba-based encoder with multiple blocks that extracts multi-scale features efficiently, a 3D convolutional neural network (CNN) decoder that produces the segmentation predictions, and skip connections that enable feature reuse across scales. Crucially, each Mamba block replaces the transformer's self-attention module, retaining multi-scale and global feature modeling while avoiding self-attention's heavy computational cost. The encoder pairs an initial depth-wise convolutional stem with the Mamba blocks: volume features are flattened into a 1D sequence for state-space processing and then reshaped back to 3D, preserving spatial structure.
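The flatten-process-reshape flow described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the real model uses the selective state-space (Mamba) scan as the sequence mixer, which is stood in for here by a toy causal cumulative average, and all function names are hypothetical.

```python
import numpy as np

def flatten_volume(x):
    # (C, D, H, W) feature volume -> (L, C) token sequence, L = D*H*W
    c, d, h, w = x.shape
    return x.reshape(c, d * h * w).T

def unflatten_volume(seq, spatial_shape):
    # Inverse: (L, C) token sequence -> (C, D, H, W) volume
    d, h, w = spatial_shape
    return seq.T.reshape(-1, d, h, w)

def toy_sequence_mixer(seq):
    # Placeholder for the Mamba SSM scan: a causal running average over
    # the token sequence. It only mimics the recurrence's causal, global
    # reach; it is NOT the selective state-space kernel itself.
    csum = np.cumsum(seq, axis=0)
    counts = np.arange(1, seq.shape[0] + 1)[:, None]
    return csum / counts

def mamba_style_block(x):
    # Flatten the 3D features to 1D, mix globally, reshape back to 3D.
    seq = flatten_volume(x)
    mixed = toy_sequence_mixer(seq)
    return unflatten_volume(mixed, x.shape[1:])

feats = np.random.rand(8, 4, 4, 4)   # e.g. an 8-channel 4x4x4 feature volume
out = mamba_style_block(feats)
print(out.shape)  # (8, 4, 4, 4): spatial shape is preserved
```

The key design point this illustrates is that the sequence mixer runs in time linear in the number of voxels, whereas self-attention over the same flattened sequence would be quadratic.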

Experimental Results

SegMamba was evaluated on the BraTS2023 dataset, comprising 1,251 3D brain MRI volumes. Performance was quantitatively assessed using the Dice similarity coefficient and the 95th-percentile Hausdorff distance (HD95). Compared with leading CNN-based and transformer-based methods such as UX-Net and SwinUNETR-V2, SegMamba attained Dice scores of 93.61%, 92.65%, and 87.71% for the Whole Tumor (WT), Tumor Core (TC), and Enhancing Tumor (ET) segmentation targets, respectively. Moreover, its low HD95 values of 3.37, 3.85, and 3.48 indicated more precise boundary delineation.
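For reference, the Dice similarity coefficient reported above measures volumetric overlap between a predicted mask and the ground truth. A minimal sketch (not the paper's evaluation code; the masks here are synthetic):

```python
import numpy as np

def dice_score(pred, gt):
    # Dice = 2|P ∩ G| / (|P| + |G|) on binary 3D masks;
    # 1.0 means perfect overlap, 0.0 means none.
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

# Two toy 4x4x4 masks with partial overlap: each has 8 voxels,
# 4 of which coincide, so Dice = 2*4 / (8+8) = 0.5.
pred = np.zeros((4, 4, 4), dtype=bool); pred[1:3, 1:3, 1:3] = True
gt   = np.zeros((4, 4, 4), dtype=bool); gt[1:3, 1:3, 0:2]   = True
print(dice_score(pred, gt))  # 0.5
```

HD95 complements Dice by measuring the 95th percentile of surface-to-surface distances between the two masks, so lower values (as in the results above) mean tighter boundaries.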

Conclusion and Implications

By embedding Mamba blocks in a U-shaped network, SegMamba not only outperforms its predecessors but also offers an efficient alternative to the computationally intensive transformer methods traditionally employed in 3D medical image segmentation. These findings matter for medical imaging and diagnostics: practitioners can employ models that combine higher accuracy with the fast processing needed in clinical settings. For those interested in further exploration or implementation, the authors have made the SegMamba code publicly available, inviting adaptation and extension of the approach.
