
Learning to Generate Training Datasets for Robust Semantic Segmentation

Published 1 Aug 2023 in cs.CV and cs.LG | (2308.02535v4)

Abstract: Semantic segmentation methods have advanced significantly. Still, their robustness to real-world perturbations and object types not seen during training remains a challenge, particularly in safety-critical applications. We propose a novel approach to improve the robustness of semantic segmentation techniques by leveraging the synergy between label-to-image generators and image-to-label segmentation models. Specifically, we design Robusta, a novel robust conditional generative adversarial network to generate realistic and plausible perturbed images that can be used to train reliable segmentation models. We conduct in-depth studies of the proposed generative model, assess the performance and robustness of the downstream segmentation network, and demonstrate that our approach can significantly enhance the robustness in the face of real-world perturbations, distribution shifts, and out-of-distribution samples. Our results suggest that this approach could be valuable in safety-critical applications, where perception modules such as semantic segmentation must be highly reliable while operating under a limited computational budget at inference time. We release our code at https://github.com/ENSTA-U2IS-AI/robusta.
