Fine-Grained Visual Classification of Aircraft

Published 21 Jun 2013 in cs.CV | (1306.5151v1)

Abstract: This paper introduces FGVC-Aircraft, a new dataset containing 10,000 images of aircraft spanning 100 aircraft models, organised in a three-level hierarchy. At the finer level, differences between models are often subtle but always visually measurable, making visual recognition challenging but possible. A benchmark is obtained by defining corresponding classification tasks and evaluation protocols, and baseline results are presented. The construction of this dataset was made possible by the work of aircraft enthusiasts, a strategy that can extend to the study of a number of other object classes. Compared to the domains usually considered in fine-grained visual classification (FGVC), for example animals, aircraft are rigid and hence less deformable. They, however, present other interesting modes of variation, including purpose, size, designation, structure, historical style, and branding.

Summary

The paper presents a novel FGVC-Aircraft dataset containing 10,000 images of 100 aircraft variants, highlighting a unique hierarchical labeling system.
It uses non-linear SVMs with a χ² kernel on dense SIFT features as a baseline, which reaches 48.69% variant classification accuracy, with much of the confusion occurring among visually similar models.
The study emphasizes dataset diversity and invites future work to extend its hierarchical approach to broader fine-grained visual classification applications.

Introduction

The paper "Fine-Grained Visual Classification of Aircraft" introduces the FGVC-Aircraft dataset, which targets fine-grained recognition of aircraft models. The dataset comprises 10,000 images across 100 aircraft variants, organized hierarchically. Compared with other FGVC domains such as animals, aircraft are rigid and therefore less deformable, but they vary in other ways, including purpose, size, designation, structure, historical style, and branding.

Dataset Composition

The FGVC-Aircraft dataset covers a wide spectrum of aircraft, labeled in a three-level hierarchy: manufacturer, family, and variant. Images are also annotated with bounding boxes for the dominant aircraft, so that classification can focus on the object of interest. The 100 variants are grouped under 70 families and 30 manufacturers (Figure 1).

Figure 1: Our dataset contains the 100 aircraft variants shown above. These are also annotated with their family and manufacturer, as well as bounding boxes.
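
The three-level labeling can be pictured as a small data structure. The sketch below is illustrative only: the class name `AircraftLabel`, the helper `parent_maps`, and the four example entries are assumptions made for this example and are not read from the dataset's annotation files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AircraftLabel:
    """One leaf of the manufacturer > family > variant hierarchy."""
    manufacturer: str
    family: str
    variant: str


# Illustrative entries only (hypothetical, not taken from the annotation files).
LABELS = [
    AircraftLabel("Boeing", "Boeing 737", "737-300"),
    AircraftLabel("Boeing", "Boeing 737", "737-400"),
    AircraftLabel("Boeing", "Boeing 747", "747-400"),
    AircraftLabel("Airbus", "A320", "A320"),
]


def parent_maps(labels):
    """Return (variant -> family, family -> manufacturer) lookup tables."""
    variant_to_family = {}
    family_to_manufacturer = {}
    for label in labels:
        variant_to_family[label.variant] = label.family
        family_to_manufacturer[label.family] = label.manufacturer
    return variant_to_family, family_to_manufacturer


VARIANT_TO_FAMILY, FAMILY_TO_MANUFACTURER = parent_maps(LABELS)
```

Because every variant has exactly one family and every family exactly one manufacturer, a variant label alone is enough to recover the two coarser labels.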

Data Collection and Annotation

The dataset was curated primarily through contributions from aircraft enthusiasts and spotters, leveraging extensive archives such as Airliners.net. Initial data collection yielded over 70,000 images, which were refined to remove biases and maximize diversity. Bounding boxes were annotated via crowdsourcing to ensure accuracy and efficiency, while hierarchical labels were manually verified against external sources for visual distinctions (Figure 2).

Figure 2: Label hierarchy shown as the manufacturer, family, and variant. Our dataset contains aircraft of 100 different variants grouped under 70 families and 30 manufacturers.

Baseline Classification and Challenges

The paper evaluates the dataset with a baseline classifier: non-linear SVMs with a $\chi^2$ kernel, applied to bag-of-visual-words representations built from dense SIFT features. The confusion matrix shows that many errors occur within families, notably among similar aircraft in the Boeing and Airbus families (Figure 3).
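
To make the kernel concrete, here is a minimal sketch of a $\chi^2$ kernel between bag-of-visual-words histograms. It assumes the additive form $k(x, y) = \sum_i 2 x_i y_i / (x_i + y_i)$ on L1-normalized histograms; the summary does not say which $\chi^2$ variant the paper used, and the function names and toy histograms are hypothetical.

```python
def l1_normalize(hist):
    """Scale a non-negative histogram so its entries sum to 1."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram has no mass")
    return [h / total for h in hist]


def chi2_kernel(x, y):
    """Additive chi-squared kernel: sum_i 2 * x_i * y_i / (x_i + y_i)."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same number of bins")
    value = 0.0
    for xi, yi in zip(x, y):
        denom = xi + yi
        if denom > 0:  # bins that are empty in both histograms contribute 0
            value += 2.0 * xi * yi / denom
    return value


def gram_matrix(histograms):
    """Pairwise kernel values for a list of histograms."""
    return [[chi2_kernel(a, b) for b in histograms] for a in histograms]


# Toy visual-word counts for three images (assumed values, 3-word vocabulary).
HISTS = [l1_normalize(h) for h in ([4, 0, 4], [3, 1, 4], [0, 8, 0])]
GRAM = gram_matrix(HISTS)
```

A Gram matrix like `GRAM` could then be handed to any SVM solver that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.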

Figure 3: Confusion matrix for the 100-variant classification challenge. Blocks of high confusion, caused by the similarity of the models, are also shown. These correspond to the Boeing 737 family, the Boeing 747 family, the Airbus family, McDonnell Douglas (MD), and the Embraer family. The average diagonal accuracy is 48.69%.
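
The "average diagonal accuracy" in the caption can be read as the mean of the per-class accuracies. The sketch below assumes that rows of the confusion matrix index the true class and columns the predicted class, and that each row is normalized by its own total; the function name and the toy matrix are hypothetical.

```python
def mean_diagonal_accuracy(confusion):
    """Average of the diagonal of the row-normalized confusion matrix.

    confusion[i][j] counts images of true class i predicted as class j.
    Classes with no test images are skipped.
    """
    per_class = []
    for i, row in enumerate(confusion):
        total = sum(row)
        if total > 0:
            per_class.append(row[i] / total)
    if not per_class:
        raise ValueError("confusion matrix contains no samples")
    return sum(per_class) / len(per_class)


# Toy two-class example: class 0 is right 8 times out of 10, class 1 5 out of 10.
TOY_CONFUSION = [
    [8, 2],
    [5, 5],
]
TOY_SCORE = mean_diagonal_accuracy(TOY_CONFUSION)  # (0.8 + 0.5) / 2 = 0.65
```

Averaging per class rather than per image gives every variant equal weight in the score, whatever the number of test images it has.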

Accuracy rises as the labels become coarser: 48.69% for variant, 58.48% for family, and 71.30% for manufacturer classification (Figure 4). Alternative strategies, such as training classifiers specifically at each hierarchy level, yielded inferior results.

Figure 4: Confusion matrix for the family (left) and manufacturer (right) classification tasks.
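
One way to see why coarser levels score higher is to map variant-level predictions to their parent labels and score again. This is a sketch under assumptions: the mapping, the toy predictions, and the helper names are hypothetical, plain accuracy is used instead of the class-normalized measure, and it is not confirmed here that the paper derived its family and manufacturer results exactly this way.

```python
# Hypothetical variant -> family mapping for three variants.
VARIANT_TO_FAMILY = {
    "737-300": "Boeing 737",
    "737-400": "Boeing 737",
    "A320": "A320",
}


def lift(labels, parent_of):
    """Replace each label by its parent in the hierarchy."""
    return [parent_of[label] for label in labels]


def accuracy(predicted, truth):
    """Fraction of positions where the prediction equals the ground truth."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


TRUE_VARIANTS = ["737-300", "737-400", "A320"]
PRED_VARIANTS = ["737-400", "737-300", "A320"]  # the two 737 variants are swapped

VARIANT_ACC = accuracy(PRED_VARIANTS, TRUE_VARIANTS)
FAMILY_ACC = accuracy(
    lift(PRED_VARIANTS, VARIANT_TO_FAMILY),
    lift(TRUE_VARIANTS, VARIANT_TO_FAMILY),
)
```

In this toy case the two swapped 737 variants count as errors at the variant level but as correct at the family level, so `FAMILY_ACC` is higher than `VARIANT_ACC`.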

Implications and Future Work

By introducing aircraft as a new domain with its own modes of variation, the paper broadens FGVC research. It underscores the need for both dataset diversity and robust classification techniques. Future work aims to expand the dataset with additional models and potentially to extend the construction methodology to other object categories. The dataset may inform FGVC methods and help improve model robustness across domains.

Conclusion

The FGVC-Aircraft dataset is a substantial contribution to the FGVC field, providing a challenging and diverse set of images for advancing visual classification methods. The paper highlights the potential to extend FGVC to further object categories and outlines a roadmap for future dataset enhancements and extensions.
