- The paper presents the FGVC-Aircraft dataset, containing 10,000 images of 100 aircraft variants labeled with a three-level hierarchy (manufacturer, family, variant).
- It employs non-linear SVMs with a χ² kernel on dense SIFT features and reaches only 48.69% accuracy on variant classification, with high confusion among visually similar models.
- The study emphasizes dataset diversity and invites future work to extend its hierarchical approach to broader fine-grained visual classification applications.
Fine-Grained Visual Classification of Aircraft
Introduction
The paper "Fine-Grained Visual Classification of Aircraft" introduces the FGVC-Aircraft dataset, which targets fine-grained recognition of aircraft models. The dataset comprises 10,000 images across 100 aircraft variants, organized hierarchically. Aircraft pose different classification challenges from other FGVC domains, such as animals, because of factors like rigid structures, historical variations, and branding.
Dataset Composition
The FGVC-Aircraft dataset covers a wide spectrum of aircraft variants, structured into a three-level hierarchy: manufacturer, family, and variant. Each image is annotated with a bounding box for the dominant aircraft, so classifiers can focus on the object rather than the background. With 100 variants drawn from 70 families and 30 manufacturers, the dataset supports classification at every level of the hierarchy (Figure 1).
Figure 1: Our dataset contains 100 variants of aircraft shown above. These are also annotated with their family and manufacturer, as well as bounding boxes.
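The three-level hierarchy described above can be sketched as a small data structure. This is a minimal illustration only: the `AircraftLabel` type, the `group_by` helper, and the three example entries are assumptions chosen for readability, not the dataset's actual annotation format.

```python
from typing import NamedTuple


class AircraftLabel(NamedTuple):
    """One leaf of the hierarchy (hypothetical representation)."""
    manufacturer: str
    family: str
    variant: str


# Illustrative entries only; the real dataset has 100 variants.
LABELS = [
    AircraftLabel("Boeing", "Boeing 737", "737-300"),
    AircraftLabel("Boeing", "Boeing 737", "737-400"),
    AircraftLabel("Airbus", "A320", "A319"),
]


def group_by(labels, level):
    """Group variant names under the chosen level ('family' or 'manufacturer')."""
    groups = {}
    for label in labels:
        groups.setdefault(getattr(label, level), []).append(label.variant)
    return groups


families = group_by(LABELS, "family")
manufacturers = group_by(LABELS, "manufacturer")
```

Because every variant carries its family and manufacturer, a variant-level prediction can always be mapped to the two coarser levels without extra annotation.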
Data Collection and Annotation
The dataset was curated primarily through contributions from aircraft enthusiasts and spotters, leveraging extensive archives such as Airliners.net. Initial data collection yielded over 70,000 images, which were refined to remove biases and maximize diversity. Bounding boxes were annotated via crowdsourcing to ensure accuracy and efficiency, while hierarchical labels were manually verified against external sources for visual distinctions (Figure 2).

Figure 2: Label hierarchy shown as the manufacturer, family, and the variant. Our dataset contains aircraft of 100 different variants grouped under 70 families and 30 manufacturers.
Baseline Classification and Challenges
The paper evaluates the dataset with a baseline classifier. It applies non-linear SVMs with a χ² kernel to bag-of-visual-words representations built from dense SIFT features. The confusion matrix shows substantial confusion within families, notably among similar aircraft in the Boeing and Airbus families (Figure 3).
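To make the kernel concrete, the sketch below computes one common form of the χ² kernel, the additive form k(x, y) = Σ 2·x_i·y_i / (x_i + y_i), on L1-normalized histograms. The summary does not say which variant of the χ² kernel the paper used, so the formula and the helper names (`l1_normalize`, `chi2_kernel`, `gram_matrix`) are assumptions, and the toy histograms stand in for real bag-of-visual-words vectors.

```python
def l1_normalize(hist):
    """Scale a non-negative histogram so its entries sum to 1."""
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else list(hist)


def chi2_kernel(x, y):
    """Additive chi-squared kernel (assumed form): sum_i 2*x_i*y_i / (x_i + y_i).

    Bins where both histograms are zero contribute nothing and are skipped
    to avoid dividing by zero.
    """
    return sum(2.0 * a * b / (a + b) for a, b in zip(x, y) if a + b > 0)


def gram_matrix(histograms):
    """Pairwise kernel values, the input a precomputed-kernel SVM expects."""
    return [[chi2_kernel(a, b) for b in histograms] for a in histograms]


# Toy visual-word counts for three images (made-up numbers).
hists = [l1_normalize(h) for h in ([2, 1, 1], [1, 1, 2], [0, 0, 4])]
K = gram_matrix(hists)
```

A Gram matrix like `K` could then be handed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.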

Figure 3: Confusion matrix for the 100 variant classification challenge. Some regions of high confusion, due to the similarity of the models, are also shown. These correspond to the Boeing 737 family, Boeing 747 family, Airbus family, McDonnell Douglas (MD) and the Embraer family. The average diagonal accuracy is 48.69%.
Accuracy rises as the labels become coarser: 48.69% for variant, 58.48% for family, and 71.30% for manufacturer (Figure 4). Alternative strategies, such as training a separate classifier at each hierarchy level, yielded inferior results, which indicates how hard the fine-grained distinctions are to capture.


Figure 4: Confusion matrix for the family (left) and manufacturer (right) classification tasks.
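The accuracy figures above are averages over the diagonal of a confusion matrix. The sketch below shows one way to compute such a number and to collapse a variant-level matrix to a coarser level. It assumes that "average diagonal accuracy" means the mean per-class recall of a row-normalized matrix and that coarse predictions are obtained by mapping each variant to its parent; the function names and the toy counts are illustrative, not taken from the paper.

```python
def average_diagonal_accuracy(confusion):
    """Mean per-class recall: normalize each row, then average the diagonal."""
    recalls = []
    for i, row in enumerate(confusion):
        total = sum(row)
        if total > 0:  # skip classes with no test images
            recalls.append(row[i] / total)
    return sum(recalls) / len(recalls)


def collapse_confusion(confusion, parent_of):
    """Merge fine classes into parents; parent_of[i] is the parent index of class i."""
    n = max(parent_of) + 1
    coarse = [[0] * n for _ in range(n)]
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            coarse[parent_of[i]][parent_of[j]] += count
    return coarse


# Toy example: variants 0 and 1 share a family, variant 2 is alone in its family.
variant_confusion = [
    [6, 4, 0],
    [3, 7, 0],
    [0, 0, 10],
]
family_confusion = collapse_confusion(variant_confusion, parent_of=[0, 0, 1])

variant_acc = average_diagonal_accuracy(variant_confusion)
family_acc = average_diagonal_accuracy(family_confusion)
```

In this toy case every error falls inside one family, so the errors disappear once the matrix is collapsed. This is the same effect, in miniature, that makes the family and manufacturer scores higher than the variant score.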
Implications and Future Work
The paper contributes to FGVC research by introducing aircraft as a new domain whose sources of variation differ from those of existing fine-grained datasets. It underscores the need for both dataset diversity and strong classification techniques. Future work aims to expand the dataset with additional models and potentially to apply the same construction methodology to other object categories.
Conclusion
The FGVC-Aircraft dataset represents a substantial contribution to the FGVC field, providing a challenging and diverse set of images for advancing visual classification methodologies. The paper highlights the potential for broadening FGVC applications through diverse object categorization and outlines the roadmap for future dataset enhancements and extensions.