SemiCD-VL: Visual-Language Model Guidance Makes Better Semi-supervised Change Detector
Abstract: Change Detection (CD) aims to identify pixels with semantic changes between images. However, annotating large numbers of images at the pixel level is labor-intensive and costly, especially for multi-temporal images, which require pixel-wise comparisons by human experts. Given the strong performance of visual-language models (VLMs) on zero-shot and open-vocabulary tasks with prompt-based reasoning, it is promising to use VLMs to improve CD under limited labeled data. In this paper, we propose a VLM guidance-based semi-supervised CD method, namely SemiCD-VL. The insight of SemiCD-VL is to synthesize free change labels using VLMs to provide additional supervision signals for unlabeled data. However, almost all current VLMs are designed for single-temporal images and cannot be directly applied to bi- or multi-temporal images. Motivated by this, we first propose a VLM-based mixed change event generation (CEG) strategy to yield pseudo labels for unlabeled CD data. Since the additional supervision signals provided by these VLM-driven pseudo labels may conflict with the pseudo labels from the consistency regularization paradigm (e.g., FixMatch), we propose a dual projection head to disentangle the different signal sources. Further, we explicitly decouple the semantic representations of the bi-temporal images through two auxiliary segmentation decoders, which are also guided by the VLM. Finally, to help the model capture change representations more adequately, we introduce metric-aware supervision via a feature-level contrastive loss in the auxiliary branches. Extensive experiments show the advantage of SemiCD-VL. For instance, SemiCD-VL improves the FixMatch baseline by +5.3 IoU on WHU-CD and by +2.4 IoU on LEVIR-CD with 5% labels. In addition, our CEG strategy, in an unsupervised manner, achieves performance far superior to state-of-the-art unsupervised CD methods.
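To make the dual projection head idea more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that one head is trained on FixMatch-style pseudo labels (confidence-thresholded predictions from a weakly augmented view) and that the other head is trained on VLM-derived pseudo labels, so that the two signal sources never supervise the same output. The function names, the 0.95 threshold, and the use of binary cross-entropy are all illustrative assumptions; the abstract does not specify them.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one pixel (p: predicted change probability, y: 0 or 1)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def fixmatch_pseudo_labels(weak_probs, threshold=0.95):
    """Turn weak-view change probabilities into (label, keep) pairs.

    A pixel is kept only if the model is confident about either class.
    The threshold value is an assumption, not taken from the paper.
    """
    pairs = []
    for p in weak_probs:
        label = 1 if p >= 0.5 else 0
        confidence = max(p, 1.0 - p)
        pairs.append((label, confidence >= threshold))
    return pairs


def dual_head_unlabeled_loss(weak_probs, strong_probs_head_a,
                             strong_probs_head_b, vlm_labels, threshold=0.95):
    """Return (consistency_loss, vlm_loss) for one unlabeled sample.

    Head A (hypothetical name) is supervised only by the model's own
    confident pseudo labels; head B only by the VLM pseudo labels.
    """
    pseudo = fixmatch_pseudo_labels(weak_probs, threshold)
    kept = [(p, y) for p, (y, keep) in zip(strong_probs_head_a, pseudo) if keep]
    loss_a = sum(bce(p, y) for p, y in kept) / len(kept) if kept else 0.0
    loss_b = sum(bce(p, y) for p, y in zip(strong_probs_head_b, vlm_labels))
    loss_b /= len(vlm_labels)
    return loss_a, loss_b


if __name__ == "__main__":
    # Three pixels: confident "change", uncertain, confident "no change".
    weak = [0.99, 0.60, 0.01]
    head_a = [0.90, 0.50, 0.10]
    head_b = [0.50, 0.50, 0.50]
    vlm = [1, 0, 0]
    print(dual_head_unlabeled_loss(weak, head_a, head_b, vlm))
```

Keeping the two losses on separate heads means a disagreement between the VLM label and the model's own pseudo label at a given pixel does not push a single output in two directions at once, which is the conflict the abstract describes.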