CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection

Published 23 Apr 2024 in cs.CV (arXiv:2404.15451v2)

Abstract: Feature pyramids have been widely adopted in convolutional neural networks and transformers for medical image segmentation. However, existing models generally focus on transformer encoders for feature extraction; we instead explore the potential of improving the feature decoder with a well-designed architecture. We propose the Cross Feature Pyramid Transformer decoder (CFPFormer), a novel decoder block that integrates feature pyramids and transformers. Although transformer-like architectures achieve outstanding segmentation performance, concerns about redundancy and training cost remain. By leveraging patch embedding and a cross-layer feature concatenation mechanism, CFPFormer enhances feature extraction, while our Gaussian Attention mitigates the complexity issue. Benefiting from the transformer structure and U-shaped connections, our decoder captures long-range dependencies and effectively up-samples feature maps. Experimental results on medical image segmentation datasets demonstrate the efficacy of CFPFormer. With a ResNet50 backbone, our method achieves a 92.02% Dice score. Notably, our VGG-based model outperforms baselines with more complex ViT and Swin Transformer backbones.
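The abstract names the decoder's ingredients (patch embedding, cross-layer feature concatenation, Gaussian Attention, U-shaped skip connections) but gives no code or equations. Below is a minimal, hypothetical PyTorch sketch of how such a decoder block could be assembled. The class names `GaussianAttention` and `CFPDecoderBlock`, the choice to realize Gaussian Attention as a Gaussian distance bias added to the attention logits, and all layer sizes are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical sketch of a CFPFormer-style decoder block (not the authors' code).
# Assumptions: "Gaussian Attention" is modeled as a fixed Gaussian distance bias
# on the attention logits, and "cross-layer feature concatenation" as a
# U-Net-style skip concatenation followed by a 1x1 projection.
import torch
import torch.nn as nn


class GaussianAttention(nn.Module):
    """Multi-head self-attention with a Gaussian positional bias over a 2D token grid."""

    def __init__(self, dim, num_heads=4, sigma=2.0):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.sigma = sigma
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, grid_size):
        # x: (B, N, C) tokens flattened from an (H, W) feature map, N = H * W
        B, N, C = x.shape
        H, W = grid_size
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)           # each: (B, heads, N, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale  # (B, heads, N, N)

        # Gaussian bias: tokens far apart on the spatial grid are down-weighted.
        ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
        coords = torch.stack([ys.flatten(), xs.flatten()], dim=-1).float().to(x.device)
        dist2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)  # (N, N)
        attn = attn - dist2 / (2 * self.sigma ** 2)

        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)


class CFPDecoderBlock(nn.Module):
    """Upsample, concatenate the encoder skip feature, patch-embed, and attend."""

    def __init__(self, in_ch, skip_ch, out_ch, num_heads=4):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.fuse = nn.Conv2d(out_ch + skip_ch, out_ch, kernel_size=1)  # cross-layer concat
        self.embed = nn.Conv2d(out_ch, out_ch, kernel_size=1)           # token embedding
        self.norm = nn.LayerNorm(out_ch)
        self.attn = GaussianAttention(out_ch, num_heads=num_heads)

    def forward(self, x, skip):
        x = self.up(x)                                  # up-sample decoder feature
        x = self.fuse(torch.cat([x, skip], dim=1))      # concatenate encoder (skip) feature
        x = self.embed(x)
        B, C, H, W = x.shape
        tokens = x.flatten(2).transpose(1, 2)           # (B, H*W, C)
        tokens = tokens + self.attn(self.norm(tokens), (H, W))
        return tokens.transpose(1, 2).reshape(B, C, H, W)


if __name__ == "__main__":
    block = CFPDecoderBlock(in_ch=256, skip_ch=128, out_ch=128)
    deep = torch.randn(1, 256, 16, 16)   # deeper decoder feature
    skip = torch.randn(1, 128, 32, 32)   # matching encoder feature from the pyramid
    print(block(deep, skip).shape)       # torch.Size([1, 128, 32, 32])
```

Stacking several such blocks from the deepest encoder stage upward, each consuming the corresponding encoder feature, would yield the U-shaped, feature-pyramid-like decoder the abstract describes; the Gaussian bias concentrates attention on spatially nearby tokens, which is one plausible way the authors' Gaussian Attention could reduce redundancy.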
