Topological RANSAC for instance verification and retrieval without fine-tuning
Abstract: This paper presents an approach to improving explainable image retrieval when no fine-tuning set is available. The widely used SPatial verification (SP) method, although effective, relies on a spatial model and a hypothesis-testing strategy for instance recognition. This brings inherent limitations, including the assumption of planar structures and the neglect of topological relations among features. To address these shortcomings, we introduce a technique that replaces the spatial model with a topological one within the RANSAC process. We propose bio-inspired saccade and fovea functions to verify the topological consistency among features, which avoids the issues associated with SP's spatial model. Our experimental results show that our method significantly outperforms SP and achieves state-of-the-art performance in non-fine-tuning retrieval. Furthermore, our approach can improve performance when used together with fine-tuned features. Our method also retains high explainability and is lightweight, offering a practical and adaptable solution for a variety of real-world applications.
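For readers unfamiliar with the baseline, the hypothesis-testing loop that SP builds on can be sketched as a plain RANSAC over putative feature matches. The sketch below is only an illustration of that generic idea, assuming a 2D affine spatial model fitted from three matches and a pixel tolerance for inliers. It is not the topological model proposed in the paper, and the names `affine_from_three`, `ransac_inliers`, `n_iters`, and `tol` are hypothetical.

```python
import random


def affine_from_three(src, dst):
    """Fit the 2D affine map sending three src points exactly onto three dst points.

    Returns (a, b, c, d, e, f) with u = a*x + b*y + c and v = d*x + e*y + f,
    or None when the three src points are (nearly) collinear.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-9:
        return None

    def solve(v1, v2, v3):
        # Cramer's rule for the 3x3 system [x y 1] [p q r]^T = v.
        p = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        q = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        r = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return p, q, r

    a, b, c = solve(dst[0][0], dst[1][0], dst[2][0])
    d, e, f = solve(dst[0][1], dst[1][1], dst[2][1])
    return a, b, c, d, e, f


def ransac_inliers(matches, n_iters=500, tol=3.0, seed=0):
    """Return indices of the largest inlier set found by hypothesis testing.

    matches: list of ((x, y), (u, v)) putative correspondences (assumed format).
    """
    rng = random.Random(seed)
    best = []
    for _ in range(n_iters):
        # Hypothesis: fit a spatial model to a random minimal sample.
        sample = rng.sample(matches, 3)
        model = affine_from_three([m[0] for m in sample], [m[1] for m in sample])
        if model is None:
            continue
        a, b, c, d, e, f = model
        # Test: keep the matches whose reprojection error is within tol.
        inliers = [
            i for i, ((x, y), (u, v)) in enumerate(matches)
            if ((a * x + b * y + c - u) ** 2 + (d * x + e * y + f - v) ** 2) ** 0.5 < tol
        ]
        if len(inliers) > len(best):
            best = inliers
    return best
```

The size of the returned inlier set is what a spatial-verification re-ranker would typically use as the match score. A single global affine map is exactly the kind of planar assumption the abstract identifies as a limitation, which is the part the paper replaces with a topological consistency check.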