- The paper introduces CA-Net, integrating joint spatial, channel, and scale attention to improve medical image segmentation.
- It achieves significant Dice score improvements over U-Net while reducing model size, as demonstrated on skin lesion and fetal MRI datasets.
- The model enhances explainability by visualizing attention weights, providing transparent and trustworthy decision processes for clinical use.
CA-Net: Comprehensive Attention Convolutional Neural Networks for Explainable Medical Image Segmentation
The paper "CA-Net: Comprehensive Attention Convolutional Neural Networks for Explainable Medical Image Segmentation" addresses the challenge of accurate and explainable medical image segmentation with convolutional neural networks (CNNs). Conventional CNNs, although capable of strong segmentation performance, often struggle with variations in the position, shape, and scale of segmentation targets, and they typically operate as opaque systems with limited explainability, constraining their utility in clinical settings. This paper presents CA-Net, a CNN architecture that incorporates multiple attention mechanisms to improve both the accuracy and the interpretability of medical image segmentation.
Technical Contribution
CA-Net introduces a novel framework that integrates multiple attention modules, each focusing on distinct aspects of the feature learning process:
- Joint Spatial Attention: This module combines dual-pathway attention gates and non-local blocks to emphasize significant spatial regions. Attention is applied at multiple resolution levels, directing the network toward the spatial features most relevant to the target.
- Channel Attention: The method recalibrates channel-wise feature responses using both average and max pooling feature maps. This technique highlights channels with the most useful semantic information, mitigating channel redundancy.
- Scale Attention: CA-Net leverages scale attention to accentuate feature maps from different scales based on their relevance to the segmentation task. This allows the network to adapt to the inherent scale variations of medical targets.
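To make the channel-attention idea above concrete, here is a minimal NumPy sketch of channel recalibration driven by both average- and max-pooled descriptors passed through a shared two-layer MLP. The shapes, the reduction ratio `r`, and the weight initialization are illustrative assumptions for this sketch, not the paper's exact implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Recalibrate channels using average- and max-pooled descriptors.

    feat: (C, H, W) feature map.
    w1: (C//r, C) and w2: (C, C//r) are the weights of a shared
    two-layer MLP (hypothetical shapes chosen for this sketch).
    """
    avg_desc = feat.mean(axis=(1, 2))  # (C,) average-pooled descriptor
    max_desc = feat.max(axis=(1, 2))   # (C,) max-pooled descriptor

    def mlp(d):
        # Shared MLP with a ReLU hidden layer.
        return w2 @ np.maximum(w1 @ d, 0.0)

    # Sum both pathways, squash to (0, 1), and reweight each channel.
    weights = sigmoid(mlp(avg_desc) + mlp(max_desc))  # (C,)
    return feat * weights[:, None, None]

# Toy usage with random weights.
rng = np.random.default_rng(0)
C, r = 8, 2
feat = rng.standard_normal((C, 16, 16))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = channel_attention(feat, w1, w2)
assert out.shape == feat.shape
```

Because the attention weights lie in (0, 1), each channel is attenuated rather than amplified; channels whose descriptors excite the MLP retain more of their activation, which is the recalibration effect described above.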
Experimental Insights
The authors evaluate CA-Net on the ISIC 2018 skin lesion dataset and a multi-class fetal MRI dataset. Relative to a standard U-Net, the Dice score improves from 87.77% to 92.08% for skin lesion segmentation, from 84.79% to 87.08% for the placenta, and from 93.20% to 95.88% for the fetal brain. Additionally, CA-Net's model size is nearly 15 times smaller than that of DeepLabv3+, while achieving comparable or superior accuracy. This demonstrates CA-Net's efficiency and potential for deployment in resource-constrained environments.
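The Dice score used in these comparisons measures the overlap between a predicted mask and the ground truth. A minimal sketch for binary masks (the `eps` smoothing term is a common convention for avoiding division by zero, not a detail specified by the paper):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks: 2|P ∩ T| / (|P| + |T|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
print(round(dice_score(pred, target), 3))  # → 0.667
```

A Dice score of 1.0 means perfect overlap, so gains of a few percentage points, as reported above, correspond to noticeably tighter boundary agreement with the ground truth.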
Importantly, CA-Net enhances transparency by enabling the visualization of attention weights at various stages of processing. This demystifies the model's decision-making process, fostering trust and facilitating clinical adoption.
Implications and Future Directions
CA-Net's attention-driven model presents a significant advancement in ensuring that CNNs used for medical image segmentation are both accurate and explainable. By enhancing explainability, CA-Net supports practitioners in verifying model predictions and understanding the image features driving network outputs. This characteristic is crucial for clinical applications where interpretability could impact treatment strategies.
In future work, researchers could extend comprehensive attention to other medical imaging modalities and consider attention mechanisms that exploit temporal information in dynamic imaging datasets. Exploring lightweight architecture optimizations would also benefit real-time clinical deployment in environments with computational limitations. The conceptual framework of CA-Net may likewise apply beyond medical imagery, in any domain of complex image analysis where both precision and explainability are paramount.
Overall, the paper offers a significant contribution to the medical image segmentation field by addressing pressing needs for accuracy, efficiency, and interpretability in AI-driven clinical tools. CA-Net's approach provides a robust foundation for further innovations in developing practical and transparent AI systems in healthcare.