CarcassFormer: An End-to-end Transformer-based Framework for Simultaneous Localization, Segmentation and Classification of Poultry Carcass Defect

Published 17 Apr 2024 in cs.CV | (2404.11429v1)

Abstract: In the food industry, assessing the quality of poultry carcasses during processing is a crucial step. This study proposes an approach for automating carcass quality assessment without requiring skilled labor or inspector involvement. The proposed system is based on machine learning (ML) and computer vision (CV) techniques, enabling automated defect detection and carcass quality assessment. To this end, an end-to-end framework called CarcassFormer is introduced. It is built upon a Transformer-based architecture designed to extract visual representations while simultaneously detecting, segmenting, and classifying poultry carcass defects. The framework can analyze imperfections resulting from production and transport welfare issues, as well as from malfunctions of processing plant equipment such as the stunner, scalder, and picker. To benchmark the framework, a dataset of 7,321 images was acquired, containing both single and multiple carcasses per image. The performance of CarcassFormer is compared with other state-of-the-art (SOTA) approaches on classification, detection, and segmentation tasks. In extensive quantitative experiments, the framework consistently outperforms existing methods across evaluation metrics such as AP, AP@50, and AP@75. Furthermore, the qualitative results highlight the strengths of CarcassFormer in capturing fine details, including feathers, and in accurately localizing and segmenting carcasses. To facilitate further research and collaboration, the pre-trained model and source code of CarcassFormer are available for research purposes at: https://github.com/UARK-AICV/CarcassFormer.
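The metrics AP@50 and AP@75 named in the abstract are commonly read, under the COCO-style convention, as average precision when a prediction counts as correct only if its intersection-over-union (IoU) with the ground truth reaches 0.50 or 0.75. The sketch below illustrates that thresholding step on a toy pair of binary masks. It assumes the COCO-style convention applies here; the function names and the toy masks are hypothetical and are not taken from the CarcassFormer code.

```python
def mask_iou(pred, truth):
    """IoU of two binary masks given as equal-sized nested lists of 0/1."""
    pairs = [(p, t) for p_row, t_row in zip(pred, truth)
             for p, t in zip(p_row, t_row)]
    intersection = sum(p & t for p, t in pairs)
    union = sum(p | t for p, t in pairs)
    # Two empty masks have no overlap to measure; report 0.0 by convention.
    return intersection / union if union else 0.0


def is_true_positive(pred, truth, iou_threshold):
    """A prediction matches the ground truth if IoU meets the threshold."""
    return mask_iou(pred, truth) >= iou_threshold


# Hypothetical toy masks: the ground truth covers 6 pixels,
# the prediction covers 4 of them and nothing else.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

iou = mask_iou(pred, truth)                    # 4 / 6
hit_at_50 = is_true_positive(pred, truth, 0.50)
hit_at_75 = is_true_positive(pred, truth, 0.75)
```

In this toy case the same prediction counts as a match at the 0.50 threshold but not at 0.75, which is why AP@75 rewards tighter localization and segmentation than AP@50. A full AP computation would additionally rank predictions by confidence and integrate the precision-recall curve, which this sketch omits.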
