- The paper presents an Attention Attractor Network that integrates novel class data with few examples while preserving base class knowledge.
- It employs meta-learning and recurrent back-propagation to minimize catastrophic forgetting, outperforming baselines on mini-ImageNet and tiered-ImageNet.
- The method enables efficient incremental adaptation in image classification tasks, demonstrating robust performance in limited data scenarios.
Incremental Few-Shot Learning with Attention Attractor Networks: A Comprehensive Evaluation
Introduction
The paper presents a novel approach to incremental few-shot learning: the Attention Attractor Network. The method learns novel classes from a few labeled examples while retaining knowledge of the base classes, without revisiting the original training dataset. The work focuses on mitigating catastrophic forgetting, a persistent problem in incremental learning. By combining a meta-learning framework with recurrent back-propagation (RBP), the approach achieves state-of-the-art performance on image classification benchmarks.
Research Motivation and Problem Definition
Traditional machine learning models adapt poorly to new classes without comprehensive retraining, a limitation that is most acute when examples of the new classes are sparse; this motivates incremental few-shot learning. The paper considers a classification network initially trained on a set of base classes, with the objective of incorporating novel classes presented with only a few examples each, while ensuring that joint classification accuracy across both base and novel classes does not deteriorate.
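To make the episodic setup concrete, here is a minimal sketch of how one incremental few-shot episode could be sampled. The function name `sample_episode` and its parameters are illustrative assumptions, not the paper's code; the key point it shows is that the support set contains only novel-class examples, while evaluation queries span both base and novel classes.

```python
import random

def sample_episode(examples, base_classes, novel_classes,
                   n_way=5, k_shot=1, n_query=2):
    """Hypothetical sketch of one incremental few-shot episode.

    examples: dict mapping class label -> list of data items.
    The support set holds k_shot items from each of n_way novel classes;
    the query set draws from BOTH base and novel classes, since the model
    is evaluated on joint classification over the combined label space.
    """
    novel = random.sample(novel_classes, n_way)
    # few-shot support: only the newly introduced classes
    support = [(x, c) for c in novel
               for x in random.sample(examples[c], k_shot)]
    # joint query: base classes plus the novel classes from this episode
    query = [(x, c) for c in novel + base_classes
             for x in random.sample(examples[c], n_query)]
    return support, query
```

The design choice worth noting is the query set: if it covered only novel classes, a model could forget the base classes without penalty, so joint queries are what make forgetting measurable.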
Proposed Methodology
The central innovation of this work is the Attention Attractor Network, a meta-learned model that regularizes the learning of novel classes. The network dynamically computes an attractor-based regularization term that preserves the learned knowledge of base classes while novel class information is integrated. The method involves:
- Meta-learning Stage: Learning meta-parameters governing the attractor network to facilitate joint classification.
- Episodic Training: Sampling a sequence of training episodes, each of which poses a few-shot learning task.
- Recurrent Back-Propagation (RBP): Used to back-propagate through the few-shot optimization process, so the meta-parameters can be learned end-to-end without unrolling and storing every optimization step.
The attractor network computes a regularizer by attending to base classes, thereby reducing catastrophic forgetting when novel information is introduced.
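The attention mechanism described above can be sketched numerically. This is a simplified illustration, not the paper's exact parameterization: the variable names (`W_novel`, `W_base`, `U`, `gamma`) and the diagonal quadratic penalty are assumptions chosen to show the general shape of an attention-computed attractor regularizer, where each novel class's fast weights are pulled toward a mixture of learned base-class attractors.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attractor_regularizer(W_novel, W_base, U, gamma, h_novel):
    """Hypothetical sketch of an attention-based attractor regularizer.

    W_novel : (K, D) fast weights being learned for K novel classes
    W_base  : (B, D) frozen base-class weights (attention keys)
    U       : (B, D) learned per-base-class attractor vectors
    gamma   : (D,)   log-scale regularization strengths
    h_novel : (K, D) average support features per novel class (queries)
    """
    reg = 0.0
    for k in range(W_novel.shape[0]):
        # attend over base classes using the novel class's support features
        attn = softmax(W_base @ h_novel[k])      # (B,) attention weights
        u_k = attn @ U                           # attractor for class k, (D,)
        diff = W_novel[k] - u_k
        # quadratic penalty pulling the fast weights toward the attractor
        reg += diff @ (np.exp(gamma) * diff)
    return reg
```

Because the attractors depend on the support features through attention, the regularizer adapts per episode rather than pulling all novel weights toward a single fixed point.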
Key Findings and Results
Through experiments on the mini-ImageNet and tiered-ImageNet benchmarks, the paper demonstrates that the proposed model significantly outperforms existing baselines, including prototypical networks and Learning without Forgetting (LwF). Key results indicate:
- Enhanced accuracy in incremental few-shot learning settings across both datasets.
- Reduced degradation of classification accuracy when jointly predicting over base and novel classes.
The use of RBP further contributes to the efficiency and effectiveness of the model, yielding gradients that converge well compared to truncated back-propagation through time (T-BPTT).
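The contrast with T-BPTT can be made concrete with a small sketch of the RBP gradient computation. Assuming the inner few-shot optimization reaches a fixed point w* = Phi(w*, theta) with Jacobian J = dPhi/dw at w*, the gradient involves the term (I - J)^(-1) applied to the incoming gradient, which a Neumann series approximates using only Jacobian-vector products; the function below is an illustrative sketch under that assumption, not the paper's implementation.

```python
import numpy as np

def rbp_gradient(J, grad_w, n_steps=50):
    """Neumann-series approximation of the RBP fixed-point gradient.

    For a fixed point w* = Phi(w*, theta) with Jacobian J = dPhi/dw at w*,
    the backward pass needs (I - J)^{-1} @ grad_w. When the spectral
    radius of J is below 1, the truncated series sum_i J^i grad_w
    converges to it without ever storing intermediate forward steps.
    """
    v = grad_w.copy()
    acc = grad_w.copy()
    for _ in range(n_steps):
        v = J @ v      # one Jacobian-vector product per series term
        acc += v
    return acc         # approximates (I - J)^{-1} @ grad_w
```

This is why RBP needs memory only for the converged state, whereas T-BPTT must store every unrolled inner-loop step it back-propagates through.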
Implications and Future Work
This research contributes to the advancement of machine learning methods capable of dynamic class expansion in limited data regimes. The methodology has practical implications in areas like personalized visual recognition systems, where flexibility in learning novel objects is crucial. From a theoretical perspective, the study reinforces the utility of meta-learning coupled with dynamic attractor networks to mitigate forgetting.
Future investigations could focus on extending this approach to more complex hierarchical memory systems and exploring its applicability in sequential learning tasks. Additionally, integrating continual learning mechanisms with memory consolidation might enhance the robustness of the proposed model in dynamic environments.
In summary, this paper presents a well-founded approach to tackling incremental few-shot learning, offering a promising direction for future research and application in the field of adaptive learning systems.