SemiCD-VL: Visual-Language Model Guidance Makes Better Semi-supervised Change Detector

Published 8 May 2024 in cs.CV | (2405.04788v5)

Abstract: Change Detection (CD) aims to identify pixels with semantic changes between images. However, annotating large numbers of images at the pixel level is labor-intensive and costly, especially for multi-temporal images, which require pixel-wise comparisons by human experts. Given the strong performance of visual-language models (VLMs) on zero-shot and open-vocabulary tasks through prompt-based reasoning, it is promising to use VLMs to improve CD when labeled data are limited. In this paper, we propose a VLM guidance-based semi-supervised CD method, namely SemiCD-VL. The insight of SemiCD-VL is to synthesize free change labels using VLMs to provide additional supervision signals for unlabeled data. However, almost all current VLMs are designed for single-temporal images and cannot be directly applied to bi- or multi-temporal images. Motivated by this, we first propose a VLM-based mixed change event generation (CEG) strategy to yield pseudo labels for unlabeled CD data. Since the additional supervision signals provided by these VLM-driven pseudo labels may conflict with the pseudo labels from the consistency regularization paradigm (e.g., FixMatch), we propose a dual projection head to disentangle the different signal sources. Further, we explicitly decouple the semantic representations of the bi-temporal images through two auxiliary segmentation decoders, which are also guided by the VLM. Finally, to help the model capture change representations more adequately, we introduce metric-aware supervision through a feature-level contrastive loss in the auxiliary branches. Extensive experiments show the advantage of SemiCD-VL. For instance, SemiCD-VL improves the FixMatch baseline by +5.3 IoU on WHU-CD and by +2.4 IoU on LEVIR-CD with 5% labels. In addition, our CEG strategy, used in an unsupervised manner, can achieve performance far superior to state-of-the-art unsupervised CD methods.
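
As a rough illustration of the idea of turning single-temporal predictions into a change label, the sketch below compares two per-pixel class maps and marks a pixel as changed when its class differs between the two dates, then scores the result with IoU, the metric quoted in the abstract. This is a deliberately simplified assumption and not the paper's mixed CEG strategy: the function names, the tiny masks, and the class ids (0 for background, 1 for building) are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # 2D grid of integer labels, one per pixel


def change_pseudo_label(seg_t1: Mask, seg_t2: Mask) -> Mask:
    """Mark a pixel as changed (1) when its predicted class differs between dates.

    Simplified assumption: the inputs are per-pixel class maps that some
    single-temporal segmenter (for example, a VLM-based one) produced
    independently for the two images.
    """
    return [
        [int(a != b) for a, b in zip(row_t1, row_t2)]
        for row_t1, row_t2 in zip(seg_t1, seg_t2)
    ]


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of the 'changed' class for two binary masks."""
    inter = sum(p & t for rp, rt in zip(pred, target) for p, t in zip(rp, rt))
    union = sum(p | t for rp, rt in zip(pred, target) for p, t in zip(rp, rt))
    # Two empty masks agree perfectly, so define IoU as 1.0 in that case.
    return inter / union if union else 1.0


# Hypothetical 2x3 class maps: 0 = background, 1 = building.
seg_before = [[0, 0, 1],
              [0, 1, 1]]
seg_after = [[0, 1, 1],
             [0, 1, 0]]

pseudo = change_pseudo_label(seg_before, seg_after)
reference = [[0, 1, 0],
             [0, 1, 1]]  # hypothetical ground-truth change mask

print(pseudo)                  # [[0, 1, 0], [0, 0, 1]]
print(iou(pseudo, reference))  # 2 changed pixels agree out of 3 in the union
```

Comparing class maps this way inherits every segmentation error from both dates, which is one reason a pseudo label of this kind is best treated as a noisy extra supervision signal rather than as ground truth.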
