Structure-Aware Residual-Center Representation for Self-Supervised Open-Set 3D Cross-Modal Retrieval
Abstract: Existing 3D cross-modal retrieval methods rely heavily on category distribution priors from the training set, which reduces their effectiveness on unseen categories in open-set environments. To tackle this problem, we propose the Structure-Aware Residual-Center Representation (SRCR) framework for self-supervised open-set 3D cross-modal retrieval. To address the center deviation caused by differences in category distribution, we learn a Residual-Center Embedding (RCE) for each object with nested auto-encoders, rather than mapping objects directly to modality or category centers. In addition, we apply Hierarchical Structure Learning (HSL) to exploit high-order correlations among objects for better generalization, constructing a heterogeneous hypergraph from hierarchical inter-modality, intra-object, and implicit-category correlations. Extensive experiments and ablation studies on four benchmarks demonstrate the superiority of the proposed framework over state-of-the-art methods.
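To make the hypergraph idea in the abstract more concrete, the following is a minimal sketch of how a heterogeneous incidence matrix could be assembled. It is not the paper's implementation. The function name `build_incidence`, the parameter `k`, the use of k-nearest-neighbour groups as a stand-in for implicit-category correlations, and the toy data are all assumptions made for illustration.

```python
from math import dist


def build_incidence(features, object_ids, k=2):
    """Return (incidence, edge_types) for a toy hypergraph.

    features   : list of feature vectors, one per (object, modality) node
    object_ids : object index for each node
    k          : neighbours per k-NN hyperedge (hypothetical parameter)
    """
    n = len(features)
    edges, edge_types = [], []

    # Intra-object hyperedges: all modality nodes of one object.
    for obj in sorted(set(object_ids)):
        edges.append({i for i in range(n) if object_ids[i] == obj})
        edge_types.append("intra-object")

    # k-NN hyperedges, assumed here as a proxy for implicit-category
    # correlations: each node is grouped with its k nearest nodes.
    for i in range(n):
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(features[i], features[j]),
        )[:k]
        edges.append({i, *nearest})
        edge_types.append("knn")

    # Incidence matrix H: H[v][e] = 1 if node v belongs to hyperedge e.
    incidence = [[1 if v in e else 0 for e in edges] for v in range(n)]
    return incidence, edge_types


# Toy data (made up): 2 objects x 2 modalities, 2-D features.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2)]
object_ids = [0, 0, 1, 1]
H, types = build_incidence(features, object_ids, k=1)
```

Here each row of `H` is a node and each column a hyperedge, so one hyperedge can link more than two nodes at once. That is what lets a hypergraph encode high-order correlations rather than only pairwise ones.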