On Model-Free Re-ranking for Visual Place Recognition with Deep Learned Local Features
Abstract: Re-ranking is the second stage of a visual place recognition pipeline, in which the system selects the best-matching images from a pre-selected subset of candidates. Model-free approaches compute image-pair similarity from a spatial comparison of corresponding local visual features, eliminating the need for computationally expensive estimation of a model describing the transformation between images. This article focuses on model-free re-ranking based on standard local visual features and its applicability in long-term autonomy systems. It introduces three new model-free re-ranking methods designed primarily for deep-learned local visual features. Such features exhibit high robustness to various appearance changes, a crucial property for long-term autonomy systems. All the introduced methods were integrated into a new visual place recognition system together with the D2-Net feature detector (Dusmanu, 2019) and experimentally evaluated on diverse, challenging public datasets. The obtained results are on par with current state-of-the-art methods, affirming that model-free approaches are a viable and worthwhile path for long-term visual place recognition.
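The core idea the abstract describes, scoring an image pair by the spatial agreement of corresponding local features rather than by fitting a transformation model, can be sketched as a toy example. This is an illustrative assumption of ours, not the paper's actual methods: `mutual_nn_matches`, `pair_similarity`, and the median-shift consistency check are hypothetical simplifications of what a model-free re-ranking score might look like.

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    """Mutual nearest-neighbour correspondences between two sets of
    L2-normalised local descriptors (one descriptor per row)."""
    sim = desc_a @ desc_b.T                  # cosine similarity matrix
    nn_ab = sim.argmax(axis=1)               # best B-match for each A-feature
    nn_ba = sim.argmax(axis=0)               # best A-match for each B-feature
    idx_a = np.arange(desc_a.shape[0])
    mutual = nn_ba[nn_ab] == idx_a           # keep only mutual matches
    return idx_a[mutual], nn_ab[mutual]

def pair_similarity(kpts_a, kpts_b, desc_a, desc_b, tol=0.1):
    """Toy model-free re-ranking score: the number of mutual matches whose
    displacement agrees with the median shift between the two images.
    No homography or fundamental-matrix model is ever estimated."""
    ia, ib = mutual_nn_matches(desc_a, desc_b)
    if ia.size == 0:
        return 0
    disp = kpts_b[ib] - kpts_a[ia]           # per-match displacement vectors
    median_shift = np.median(disp, axis=0)   # dominant image-to-image shift
    consistent = np.linalg.norm(disp - median_shift, axis=1) < tol
    return int(consistent.sum())             # spatially consistent matches
```

In a full system, the retrieval stage would produce a candidate list for a query, and the candidates would then be re-ranked by this score in descending order; the spatial check discards descriptor matches that are visually similar but geometrically inconsistent.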
- C. Masone and B. Caputo, “A survey on deep visual place recognition,” IEEE Access, vol. 9, pp. 19516–19547, 2021.
- L. G. Camara, T. Pivoňka, M. Jílek, C. Gäbert, K. Košnar, and L. Přeučil, “Accurate and robust teach and repeat navigation by visual place recognition: A CNN approach,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020, pp. 6018–6024.
- X. Li, M. Larson, and A. Hanjalic, “Pairwise geometric matching for large-scale object retrieval,” in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 5153–5161.
- S. Hausler, S. Garg, M. Xu, M. Milford, and T. Fischer, “Patch-NetVLAD: Multi-scale fusion of locally-global descriptors for place recognition,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 14136–14147.
- Y. Zhang, Z. Jia, and T. Chen, “Image retrieval with geometry-preserving visual phrases,” in 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011, pp. 809–816.
- L. G. Camara and L. Přeučil, “Visual place recognition by spatial matching of high-level CNN features,” Robotics and Autonomous Systems, vol. 133, p. 103625, 2020.
- M. Dusmanu, I. Rocco, T. Pajdla, M. Pollefeys, J. Sivic, A. Torii, and T. Sattler, “D2-Net: A trainable CNN for joint description and detection of local features,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 8084–8093.
- G. Barbarani, M. Mostafa, H. Bayramov, G. Trivigno, G. Berton, C. Masone, and B. Caputo, “Are local features all you need for cross-domain visual place recognition?” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2023, pp. 6155–6165.
- A. Ali-Bey, B. Chaib-Draa, and P. Giguère, “MixVPR: Feature mixing for visual place recognition,” in 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 2997–3006.
- H. Jégou, M. Douze, C. Schmid, and P. Pérez, “Aggregating local descriptors into a compact image representation,” in 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010, pp. 3304–3311.
- R. Arandjelovic, P. Gronat, A. Torii, T. Pajdla, and J. Sivic, “NetVLAD: CNN architecture for weakly supervised place recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 5297–5307.
- N. V. Keetha, M. Milford, and S. Garg, “A hierarchical dual model of environment- and place-specific utility for visual place recognition,” IEEE Robotics and Automation Letters, vol. 6, no. 4, pp. 6969–6976, 2021.
- D. DeTone, T. Malisiewicz, and A. Rabinovich, “SuperPoint: Self-supervised interest point detection and description,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, pp. 337–33712.
- P.-E. Sarlin, D. DeTone, T. Malisiewicz, and A. Rabinovich, “SuperGlue: Learning feature matching with graph neural networks,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4937–4946.
- F. Radenović, G. Tolias, and O. Chum, “Fine-tuning CNN image retrieval with no human annotation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 7, pp. 1655–1668, 2019.
- G. Berton, C. Masone, and B. Caputo, “Rethinking visual geo-localization for large-scale applications,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 4868–4878.
- F. Lu, L. Zhang, X. Lan, S. Dong, Y. Wang, and C. Yuan, “Towards seamless adaptation of pre-trained models for visual place recognition,” in The Twelfth International Conference on Learning Representations, 2024.
- D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
- Y. Avrithis and G. Tolias, “Hough pyramid matching,” International Journal of Computer Vision, vol. 107, no. 1, pp. 1–19, 2014. [Online]. Available: http://link.springer.com/10.1007/s11263-013-0659-3
- X. Shen, Z. Lin, J. Brandt, S. Avidan, and Y. Wu, “Object retrieval and localization with spatially-constrained similarity measure and k-NN re-ranking,” in 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012, pp. 3013–3020.
- S. Garg, N. Suenderhauf, and M. Milford, “Lost? appearance-invariant place recognition for opposite viewpoints using visual semantics,” Proceedings of Robotics: Science and Systems XIV, 2018.
- Z. Li and N. Snavely, “Megadepth: Learning single-view depth prediction from internet photos,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 2041–2050.
- T. Sattler, W. Maddern, C. Toft, A. Torii, L. Hammarstrand, E. Stenborg, D. Safari, M. Okutomi, M. Pollefeys, J. Sivic, F. Kahl, and T. Pajdla, “Benchmarking 6DOF outdoor visual localization in changing conditions,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 8601–8610.
- W. Maddern, G. Pascoe, C. Linegar, and P. Newman, “1 Year, 1000km: The Oxford RobotCar Dataset,” The International Journal of Robotics Research (IJRR), vol. 36, no. 1, pp. 3–15, 2017.
- (2024) Mapillary. [Online]. Available: https://www.mapillary.com
- Z. Chen, F. Maffra, I. Sa, and M. Chli, “Only look once, mining distinctive landmarks from ConvNet for visual place recognition,” in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, pp. 9–16.
- A. Glover, “Day and night, left and right,” Mar. 2014. [Online]. Available: https://doi.org/10.5281/zenodo.4590133
- S. Skrede. (2013) Nordlandsbanen: minute by minute, season by season. [Online]. Available: https://nrkbeta.no/2013/01/15/nordlandsbanen-minute-by-minute-season-by-season
- P. Neubert, N. Sünderhauf, and P. Protzel, “Superpixel-based appearance change prediction for long-term navigation across seasons,” Robotics and Autonomous Systems, vol. 69, pp. 15–27, 2015, selected papers from 6th European Conference on Mobile Robots.
- X. Zhao, X. Wu, W. Chen, P. C. Y. Chen, Q. Xu, and Z. Li, “ALIKED: A lighter keypoint and descriptor extraction network via deformable transformation,” IEEE Transactions on Instrumentation and Measurement, vol. 72, pp. 1–16, 2023. [Online]. Available: https://arxiv.org/pdf/2304.03608.pdf
- A. Torii, J. Sivic, T. Pajdla, and M. Okutomi, “Visual place recognition with repetitive structures,” in 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 883–890.