SEER-ZSL: Semantic Encoder-Enhanced Representations for Generalized Zero-Shot Learning
Abstract: Zero-Shot Learning (ZSL) presents the challenge of identifying categories not seen during training. This task is crucial in domains where collecting training data is costly, prohibited, or simply not feasible. ZSL depends on a mapping between the visual space and available semantic information. Prior works learn a mapping between these spaces that can be exploited during inference. We contend, however, that the disparity between meticulously curated semantic spaces and the inherently noisy nature of real-world data remains a substantial and unresolved challenge. In this paper, we address this by introducing Semantic Encoder-Enhanced Representations for Zero-Shot Learning (SEER-ZSL), a hybrid strategy for closing the generalization gap. First, we distill meaningful semantic information using a probabilistic encoder, enhancing semantic consistency and robustness. Second, we distill the visual space by exploiting the learned data distribution through an adversarially trained generator. Finally, we align the distilled information, enabling a mapping of unseen categories onto the true data manifold. We demonstrate empirically that this approach yields a model that outperforms state-of-the-art methods in both generalization and benchmark performance across diverse settings with small, medium, and large datasets. The complete code is available on GitHub.
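As a rough illustration of the first step, the sketch below shows what a probabilistic encoder over semantic attributes generally looks like: it maps an attribute vector to the mean and log-variance of a diagonal Gaussian, draws a latent sample with the reparameterization trick, and computes the KL divergence to a standard normal prior. This is not the SEER-ZSL implementation. The abstract does not specify the architecture or loss, so the linear layers, the dimensions, the attribute vector, and every function name here are hypothetical choices made only for illustration.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def encode(attributes, params):
    """Map an attribute vector to (mean, log-variance) of a diagonal Gaussian."""
    mu = linear(attributes, params["w_mu"], params["b_mu"])
    log_var = linear(attributes, params["w_logvar"], params["b_logvar"])
    return mu, log_var


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def init_params(in_dim, latent_dim, rng):
    """Small random weights; assumed sizes, not taken from the paper."""
    def matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                for _ in range(latent_dim)]
    return {
        "w_mu": matrix(), "b_mu": [0.0] * latent_dim,
        "w_logvar": matrix(), "b_logvar": [0.0] * latent_dim,
    }


rng = random.Random(0)
params = init_params(in_dim=4, latent_dim=2, rng=rng)
attributes = [1.0, 0.0, 0.5, 0.0]  # hypothetical class attribute vector

mu, log_var = encode(attributes, params)
z = sample_latent(mu, log_var, rng)   # stochastic semantic code
kl = kl_to_standard_normal(mu, log_var)  # regularizer toward the prior
```

Encoding attributes as a distribution rather than a single point is one common way to make a semantic representation tolerant of noisy inputs; whether SEER-ZSL regularizes its encoder in exactly this way is an assumption here.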