SeMAnD: Self-Supervised Anomaly Detection in Multimodal Geospatial Datasets

Published 26 Sep 2023 in cs.AI, cs.CV, and cs.LG | (2309.15245v1)

Abstract: We propose a Self-supervised Anomaly Detection technique, called SeMAnD, to detect geometric anomalies in Multimodal geospatial datasets. Geospatial data comprises acquired and derived heterogeneous data modalities that we transform to semantically meaningful, image-like tensors to address the challenges of representation, alignment, and fusion of multimodal data. SeMAnD consists of (i) a simple data augmentation strategy, called RandPolyAugment, capable of generating diverse augmentations of vector geometries, and (ii) a self-supervised training objective with three components that incentivize learning representations of multimodal data that are discriminative to local changes in one modality that are not corroborated by the other modalities. Detecting local defects is crucial for geospatial anomaly detection, where even small anomalies (e.g., shifted, incorrectly connected, malformed, or missing polygonal vector geometries like roads, buildings, landcover, etc.) are detrimental to the experience and safety of users of geospatial applications like mapping, routing, search, and recommendation systems. Our empirical study on test sets of different types of real-world geometric geospatial anomalies across 3 diverse geographical regions demonstrates that SeMAnD is able to detect real-world defects and outperforms domain-agnostic anomaly detection strategies by 4.8-19.7% as measured using anomaly classification AUC. We also show that model performance increases (i) up to 20.4% as the number of input modalities increases and (ii) up to 22.9% as the diversity and strength of training data augmentations increase.
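
The results above are reported as anomaly classification AUC. As a minimal sketch of what that metric measures, the snippet below computes AUC as the probability that a randomly chosen anomalous sample receives a higher anomaly score than a randomly chosen normal one, with ties counted as one half. The function name `anomaly_auc` and the scores and labels are hypothetical and purely illustrative; they are not taken from the paper, and this is not the authors' evaluation code.

```python
from typing import Sequence


def anomaly_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise (Mann-Whitney) estimate of ROC AUC.

    scores: anomaly scores, higher meaning "more anomalous".
    labels: 1 for anomalous samples, 0 for normal samples.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomaly correctly ranked above normal sample
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six map tiles: three anomalous, three normal.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = anomaly_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

In this toy example one anomalous tile (score 0.35) is ranked below a normal tile (score 0.6), so 8 of the 9 anomalous-normal pairs are ordered correctly. The pairwise loop is quadratic in the number of samples and is meant only to make the definition concrete; a rank-based implementation would be preferable for large test sets.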
