
PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category Discovery

Published 11 Dec 2022 in cs.CV (arXiv:2212.05590v2)

Abstract: Although existing semi-supervised learning models achieve remarkable success in learning with unannotated in-distribution data, they mostly fail to learn on unlabeled data sampled from novel semantic classes due to their closed-set assumption. In this work, we target a pragmatic but under-explored Generalized Novel Category Discovery (GNCD) setting. The GNCD setting aims to categorize unlabeled training data coming from known and novel classes by leveraging the information of partially labeled known classes. We propose a two-stage Contrastive Affinity Learning method with auxiliary visual Prompts, dubbed PromptCAL, to address this challenging problem. Our approach discovers reliable pairwise sample affinities to learn better semantic clustering of both known and novel classes for the class token and visual prompts. First, we propose a discriminative prompt regularization loss to reinforce semantic discriminativeness of prompt-adapted pre-trained vision transformer for refined affinity relationships. Besides, we propose contrastive affinity learning to calibrate semantic representations based on our iterative semi-supervised affinity graph generation method for semantically-enhanced supervision. Extensive experimental evaluation demonstrates that our PromptCAL method is more effective in discovering novel classes even with limited annotations and surpasses the current state-of-the-art on generic and fine-grained benchmarks (e.g., with nearly 11% gain on CUB-200, and 9% on ImageNet-100) on overall accuracy. Our code is available at https://github.com/sheng-eatamath/PromptCAL.


Summary

  • The paper introduces PromptCAL, combining prompt tuning with contrastive learning to accurately discover novel categories.
  • It employs a Discriminative Prompt Regularization module within a vision transformer to enhance semantic representations.
  • Extensive experiments demonstrate significant gains, including a nearly 11% improvement in overall accuracy on the fine-grained CUB-200 benchmark.

An Overview of "PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category Discovery"

The paper "PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category Discovery" addresses the Generalized Novel Category Discovery (GNCD) setting: categorizing unlabeled training data that mixes known and novel classes, guided only by a partially labeled subset of the known classes. This scenario is often encountered in real-world applications, where annotation is limited and new classes appear over time.

State-of-the-art semi-supervised learning models typically fail in this setting because of their closed-set assumption. PromptCAL relaxes that assumption with a two-stage framework that combines contrastive affinity learning with auxiliary visual prompts in a vision transformer architecture.
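The role of the auxiliary prompts can be pictured at the level of the transformer's token sequence: learnable prompt tokens sit alongside the [CLS] token and the patch tokens, so they can be supervised directly. The sketch below is a shape-level illustration under our own naming, not the authors' implementation:

```python
import numpy as np

def prepend_visual_prompts(patch_tokens, cls_token, prompts):
    """Shape-level sketch of a prompt-adapted ViT input sequence.
    Learnable prompt tokens are inserted between the [CLS] token and
    the patch tokens, so both the class token and the prompt tokens
    can receive supervision (as in PromptCAL's auxiliary prompt loss).
    Shapes: cls_token (1, d), prompts (p, d), patch_tokens (n, d);
    returns the concatenated sequence of shape (1 + p + n, d)."""
    return np.concatenate([cls_token, prompts, patch_tokens], axis=0)
```

In the actual method each transformer block processes this extended sequence; here only the token layout is shown.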

Methodology

PromptCAL leverages two synergistic learning objectives:

  1. Discriminative Prompt Regularization (DPR): Learnable visual prompts are tuned alongside the vision transformer backbone, and an auxiliary supervision signal on the prompt tokens reinforces discriminative semantic representations, improving adaptation to novel categories without overfitting.
  2. Contrastive Affinity Learning (CAL): This stage discovers reliable pairwise sample affinities through semi-supervised affinity graph generation: it builds a consensus KNN graph, propagates affinities over it, and injects label priors from the annotated samples to produce robust pseudo positives and negatives. Dynamic sub-graph sampling and contrastive learning then iteratively refine the semantic representations.
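The core of the CAL stage, the semi-supervised affinity graph, can be sketched as follows. This is a minimal illustration under simplifying assumptions (cosine similarity on L2-normalized embeddings, mutual-KNN as the consensus rule, labels < 0 marking unlabeled samples); the function name and parameters are ours, and the paper's affinity propagation and dynamic sub-graph sampling steps are omitted:

```python
import numpy as np

def consensus_knn_pseudo_positives(embeddings, labels, k=10):
    """Sketch of semi-supervised affinity-graph generation: build a KNN
    graph on L2-normalized embeddings, keep only mutual (consensus)
    neighbors, then overwrite with priors from labeled samples.
    labels < 0 denote unlabeled samples. Returns a boolean matrix of
    pseudo-positive pairs usable as a contrastive-learning mask."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T                        # cosine similarity matrix
    np.fill_diagonal(sim, -np.inf)       # exclude self-matches
    knn = np.argsort(-sim, axis=1)[:, :k]  # top-k neighbors per sample
    n = len(z)
    adj = np.zeros((n, n), dtype=bool)
    rows = np.repeat(np.arange(n), k)
    adj[rows, knn.ravel()] = True
    consensus = adj & adj.T              # keep mutual neighbors only
    # Label prior: labeled same-class pairs are always positives;
    # labeled different-class pairs are never positives.
    labeled = labels >= 0
    same = labels[:, None] == labels[None, :]
    both = labeled[:, None] & labeled[None, :]
    consensus[both & same] = True
    consensus[both & ~same] = False
    np.fill_diagonal(consensus, False)
    return consensus
```

The resulting mask plays the role of the pseudo positives/negatives described above: pairs marked True are pulled together by the contrastive loss, while unmarked pairs act as negatives.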

Results

The efficacy of PromptCAL is demonstrated on six benchmarks spanning generic and fine-grained datasets. The empirical results show that PromptCAL consistently surpasses existing methods, including GCD and ORCA, with the largest gains on fine-grained datasets such as CUB-200 and StanfordCars: on CUB-200 it improves overall accuracy by roughly 11% over previous methods, and on ImageNet-100 by about 9%.

Implications and Future Directions

PromptCAL integrates prompt tuning with contrastive learning in a mutually reinforcing way. Its demonstrated ability to discover novel categories even under limited annotation has practical implications for real-world AI deployment, such as automated labeling systems where new semantic classes emerge routinely.

The methodological advances made in PromptCAL also provide a foundational basis for future research in adaptive and self-supervised learning models. Exploring further integration of prompt learning with other advanced neural architectures, utilizing larger and more diverse datasets, and enhancing model scalability could be promising directions.

In conclusion, the proposed PromptCAL framework enriches the GNCD landscape with a robust, scalable solution that moves beyond traditional closed-set assumptions and sets a new state of the art on standard benchmarks.
