Physically Feasible Semantic Segmentation
Abstract: State-of-the-art semantic segmentation models are typically optimized in a data-driven fashion, minimizing solely per-pixel or per-segment classification objectives on their training data. This purely data-driven paradigm often leads to absurd segmentations, especially when the domain of input images is shifted from the one encountered during training. For instance, state-of-the-art models may assign the label "road" to a segment located above a segment labeled "sky", although our knowledge of the physical world dictates that such a configuration is not feasible for images captured by forward-facing upright cameras. Our method, Physically Feasible Semantic Segmentation (PhyFea), first extracts explicit constraints that govern spatial class relations from the semantic segmentation training set at hand in an offline, data-driven fashion, and then enforces a morphological yet differentiable loss that penalizes violations of these constraints during training to promote prediction feasibility. PhyFea is a plug-and-play method and yields consistent and significant performance improvements over diverse state-of-the-art networks on which we implement it across the ADE20K, Cityscapes, and ACDC datasets. Code and models will be made publicly available.
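To make the idea of mining spatial class constraints offline more concrete, the following is a minimal sketch and not the PhyFea algorithm itself. It assumes a simplified notion of constraint: a class pair (a, b) is treated as infeasible if class a is never observed in the pixel directly above class b anywhere in the training label maps. The function names, the class indices, and the toy data are all hypothetical, and the violation count shown here is a plain non-differentiable check, whereas the abstract describes a morphological, differentiable loss.

```python
import numpy as np

def observed_above_pairs(label_maps, num_classes):
    """Return a boolean matrix `seen` where seen[a, b] is True if class a
    occurs in the pixel directly above a pixel of class b in any label map.
    Assumption for illustration: only immediate vertical neighbours count."""
    seen = np.zeros((num_classes, num_classes), dtype=bool)
    for labels in label_maps:
        upper = labels[:-1, :].ravel()  # every pixel except the last row
        lower = labels[1:, :].ravel()   # the pixel directly below each one
        seen[upper, lower] = True
    return seen

def count_violations(prediction, seen):
    """Count vertically adjacent pixel pairs in `prediction` whose
    (upper class, lower class) combination never occurred in training."""
    upper = prediction[:-1, :]
    lower = prediction[1:, :]
    return int((~seen[upper, lower]).sum())

# Hypothetical two-class toy setup.
SKY, ROAD = 0, 1
train = [np.array([[SKY, SKY],
                   [SKY, SKY],
                   [ROAD, ROAD],
                   [ROAD, ROAD]])]
seen = observed_above_pairs(train, num_classes=2)

good_pred = np.array([[SKY, SKY], [ROAD, ROAD]])   # sky above road
bad_pred = np.array([[ROAD, ROAD], [SKY, SKY]])    # road above sky
print(count_violations(good_pred, seen), count_violations(bad_pred, seen))
```

In this toy setup, "road above sky" never appears in the training labels, so the second prediction is flagged while the first is not. A training-time penalty would need a differentiable surrogate operating on class probabilities rather than on hard labels.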