
AODTA: Airborne Object & Threat Dataset

Updated 24 January 2026
  • The AODTA Dataset is a curated corpus of airborne imagery that enables dual-task deep learning for multi-class object classification and threat-level assessment.
  • It aggregates data from diverse public sources and employs rigorous preprocessing, including normalization and augmentation, to ensure robust model training.
  • Benchmark protocols report high accuracies of 96% for object classification and 90% for threat inference, highlighting the dataset's strong potential in security analytics.

The AODTA Dataset is a curated corpus of airborne object imagery designed for advancing automated object classification and threat-level inference in surveillance and airspace management scenarios. Originating from the need for scalable, robust, and real-time threat assessment of proliferating airborne platforms, the dataset was constructed through rigorous aggregation and normalization of multiple public sources. Its composition, annotation schema, preprocessing pipeline, class balances, and benchmark protocols provide a standardized foundation for the development and evaluation of deep learning models tailored for dual-objective tasks: multi-class categorization and ordinal threat determination (Chatterjee et al., 17 Jan 2026).

1. Data Aggregation and Curation Workflow

The AODTA Dataset derives from the synthesis and refinement of four major open-access image collections:

  • Commercial Aircraft Dataset (nelyg8002000_commercial_aircraft_2025, Kaggle CC0): 6,538 images of commercial and military jets.
  • Drone Dataset (M.M. Hasan 2022, Kaggle CC0): 2,194 images of unmanned aerial vehicles (UAVs), both hobbyist and professional.
  • Helicopter Dataset (nelyg8002000_helicopter_dual_rotor_2025, Kaggle CC0): 1,119 images featuring a diversity of helicopter configurations.
  • Birds vs. Drone Dataset (Harshwalia 2025, Kaggle CC0): 428 images representing live avian instances and ornithopter UAVs.

Selection criteria enforced exclusion of non-CC0 content, images smaller than $32 \times 32$ pixels, and specimens with severe compression artifacts. Redundant views were purged via exact-match perceptual hashing. Further, class ambiguity was controlled through manual spot-checking and bounding-box-based cropping, followed by global histogram normalization of contrast and brightness. The end result comprises 10,279 pre-augmentation, label-verified images uniformly cropped for central object focus.
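The exact-match deduplication step can be sketched with a simple average hash. This is a minimal stand-in, since the paper does not specify which perceptual hash was used; the `hash_size` of 8 and the block-averaging scheme are assumptions for illustration:

```python
import numpy as np

def average_hash(gray: np.ndarray, hash_size: int = 8) -> int:
    """Downsample a grayscale image to hash_size x hash_size by block
    averaging, then threshold at the mean to form a 64-bit hash."""
    h, w = gray.shape
    # crop so the image tiles evenly into hash_size x hash_size blocks
    gray = gray[: h - h % hash_size, : w - w % hash_size]
    blocks = gray.reshape(hash_size, gray.shape[0] // hash_size,
                          hash_size, gray.shape[1] // hash_size).mean(axis=(1, 3))
    bits = (blocks > blocks.mean()).flatten()
    return int("".join("1" if b else "0" for b in bits), 2)

def deduplicate(images):
    """Keep only the first image for each exact hash match."""
    seen, kept = set(), []
    for img in images:
        h = average_hash(img)
        if h not in seen:
            seen.add(h)
            kept.append(img)
    return kept
```

Exact-match hashing, as used here, removes only byte-identical or near-identical views; near-duplicate detection via Hamming-distance thresholds would be a stricter variant.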

2. Dataset Structure and Labeling Schema

AODTA supports four object classes and three threat levels, explicitly:

  • Object classes: Airplane, Drone (UAV), Helicopter, Bird.
  • Threat levels: Low (unarmed civilian), Medium (ambiguous/limited-capability), High (military platforms).

Labels are encoded as follows:

  • Object-class label $c \in \{0, 1, 2, 3\}$ for {Airplane, Drone, Helicopter, Bird}.
  • Threat-level label $y \in \{0, 1, 2\}$ for {Low, Medium, High}.
  • Multi-task annotation utilizes one-hot encodings, feeding two separate softmax classification heads.
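The dual-label encoding above can be sketched as follows; the helper name `encode_labels` is illustrative, not from the paper:

```python
import numpy as np

OBJECT_CLASSES = ["Airplane", "Drone", "Helicopter", "Bird"]  # c in {0, 1, 2, 3}
THREAT_LEVELS = ["Low", "Medium", "High"]                     # y in {0, 1, 2}

def encode_labels(c: int, y: int):
    """Return the pair of one-hot vectors consumed by the two softmax heads."""
    c_onehot = np.eye(len(OBJECT_CLASSES))[c]
    y_onehot = np.eye(len(THREAT_LEVELS))[y]
    return c_onehot, y_onehot
```

For example, a high-threat helicopter would be encoded as `encode_labels(2, 2)`, yielding one four-dimensional and one three-dimensional one-hot target.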

The following table displays per-class sample distribution (raw):

| Class      | Raw Count | Proportion (%) |
|------------|-----------|----------------|
| Airplane   | 6,538     | 63.6           |
| Drone      | 2,194     | 21.3           |
| Helicopter | 1,119     | 10.9           |
| Bird       | 428       | 4.2            |
| Total      | 10,279    | 100.0          |

Threat-level assignment follows domain-informed mapping, e.g., "Civilian Jet" as Low, "Fighter Jet" as High. While precise class-threat cross tabulations are absent, the raw dataset exhibits an imbalance, with high-threat instances prevailing.
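A minimal sketch of such a domain-informed mapping, assuming a subtype string is available per image. Only the "Civilian Jet" → Low and "Fighter Jet" → High pairings come from the source; the other entries and the Medium default are hypothetical placeholders:

```python
# Illustrative subtype-to-threat mapping; only "Civilian Jet" -> Low and
# "Fighter Jet" -> High are stated in the source, the rest are hypothetical.
THREAT_MAP = {
    "Civilian Jet": 0,    # Low
    "Hobbyist Drone": 0,  # Low   (hypothetical)
    "Unknown UAV": 1,     # Medium (hypothetical)
    "Fighter Jet": 2,     # High
}

def threat_level(subtype: str) -> int:
    # default to Medium (ambiguous/limited-capability) for unknown subtypes
    return THREAT_MAP.get(subtype, 1)
```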

3. Preprocessing Pipeline and Augmentation

All images are resized to $32 \times 32 \times 3$ and pixel values normalized to $[0, 1]$ as $x_{\text{norm}} = x_{\text{raw}} / 255.0$. Augmentation is online and performed via the Keras ImageDataGenerator, comprising:

  • Random rotations: $\pm 20^\circ$
  • Width/height shifts: $\pm 20\%$
  • Shear intensity: up to 0.2
  • Zoom: $[0.8, 1.2]$
  • Horizontal flips enabled

Augmentation targets per-class equilibrium across the expanded dataset (26,152 instances total), with each category upsampled to match the post-augmentation count of the largest class (Airplane). Model training operates with a batch size of 8.
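The augmentation configuration and per-class upsampling targets can be summarized as below. The parameters mirror the bullets above and are collected in a plain dict so the sketch runs without TensorFlow installed; passing it to `tf.keras.preprocessing.image.ImageDataGenerator(**AUG_PARAMS)` would construct the generator described:

```python
# Keras ImageDataGenerator settings matching the augmentation list above.
AUG_PARAMS = dict(
    rescale=1.0 / 255,       # x_norm = x_raw / 255.0
    rotation_range=20,       # random rotations of +/- 20 degrees
    width_shift_range=0.2,   # +/- 20% horizontal shift
    height_shift_range=0.2,  # +/- 20% vertical shift
    shear_range=0.2,         # shear intensity up to 0.2
    zoom_range=0.2,          # zoom sampled from [0.8, 1.2]
    horizontal_flip=True,
)

# Per-class upsampling targets so each class matches the largest (Airplane):
raw_counts = {"Airplane": 6538, "Drone": 2194, "Helicopter": 1119, "Bird": 428}
target = max(raw_counts.values())                          # 6,538 per class
extra_needed = {cls: target - n for cls, n in raw_counts.items()}
# target * 4 classes == 26,152, the stated expanded dataset size
```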

4. Partitioning and Dataset Statistics

The data is partitioned as follows:

  • Train/Test split: 70%/30% (7,195/3,084 images)
  • Validation split: 20% of training data (train: 5,756; val: 1,439; test: 3,084)
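The split sizes follow directly from the stated fractions; a quick sanity check, assuming rounding to the nearest integer:

```python
TOTAL = 10_279  # raw, label-verified image count

n_test = round(TOTAL * 0.30)        # 3,084 (30% test split)
n_train_full = TOTAL - n_test       # 7,195 (70% train split)
n_val = round(n_train_full * 0.20)  # 1,439 (20% of training data)
n_train = n_train_full - n_val      # 5,756 (final training set)
```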

Balanced augmentation ensures representation equality across all four classes in the expanded dataset. The class balance in the raw dataset is:

$$p_{\text{class}=k} = \frac{n_k}{\sum_{i=1}^{4} n_i}, \quad \{p_{\text{Airplane}}, p_{\text{Drone}}, p_{\text{Heli}}, p_{\text{Bird}}\} = \{0.636, 0.213, 0.109, 0.042\}$$

The mean image resolution is $32 \times 32$ pixels post-resizing.
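The class-balance formula above can be checked numerically from the raw counts:

```python
counts = {"Airplane": 6538, "Drone": 2194, "Helicopter": 1119, "Bird": 428}
total = sum(counts.values())  # 10,279
p = {cls: round(n / total, 3) for cls, n in counts.items()}
# p == {"Airplane": 0.636, "Drone": 0.213, "Helicopter": 0.109, "Bird": 0.042}
```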

5. Annotation Encoding, Loss Functions, and Training Protocol

Each image in the AODTA corpus is annotated with both object class $c$ and threat level $y$. Labels are encoded using one-hot vectors applied to two softmax output heads in a dual-task deep neural architecture. The overall loss is additive:

$$L_{\text{total}} = L_{\text{class}} + L_{\text{threat}}$$

with categorical cross-entropy definitions:

$$L_{\text{class}} = -\sum_i c_i \log(\text{class\_output}_i)$$

$$L_{\text{threat}} = -\sum_j y_j \log(\text{threat\_output}_j)$$

These heads receive the feature vector $f$ output by the shared backbone network, with independent projection matrices and biases ($W_1, b_1$ for class; $W_2, b_2$ for threat).
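A minimal NumPy sketch of the additive dual-head loss, assuming weight matrices of shape (num_outputs, feature_dim); the function names are illustrative, not from the paper:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def dual_task_loss(f, W1, b1, W2, b2, c_onehot, y_onehot):
    """L_total = L_class + L_threat: categorical cross-entropy over two
    softmax heads that share the backbone feature vector f."""
    class_output = softmax(W1 @ f + b1)    # object-class head
    threat_output = softmax(W2 @ f + b2)   # threat-level head
    L_class = -np.sum(c_onehot * np.log(class_output))
    L_threat = -np.sum(y_onehot * np.log(threat_output))
    return L_class + L_threat
```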

6. Benchmarking Protocol and Performance Metrics

Benchmarking utilizes simultaneous multi-task training for object-class and threat-level prediction. Evaluation is performed separately for each task according to standard metrics:

  • Accuracy: $\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}$
  • Precision: $\mathrm{Prec} = \frac{TP}{TP + FP}$
  • Recall: $\mathrm{Rec} = \frac{TP}{TP + FN}$
  • $F_1$-score: $F_1 = 2 \cdot \frac{\mathrm{Prec} \cdot \mathrm{Rec}}{\mathrm{Prec} + \mathrm{Rec}}$
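For any one class treated one-vs-rest, these metrics follow directly from the confusion counts; a minimal sketch (function name illustrative):

```python
def binary_metrics(tp: int, tn: int, fp: int, fn: int):
    """Accuracy, precision, recall, and F1 from one-vs-rest confusion counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)
    return acc, prec, rec, f1
```

Macro averages take the unweighted mean of these per-class values; weighted averages weight each class by its support.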

Metrics are reported as per-class and aggregate macro/weighted averages. Final reported benchmarks (EfficientNetB4, on AODTA):

  • Object-class accuracy: 96%
  • Threat-level accuracy: 90%
  • Mean $F_1$ (class): $\geq 0.94$ for all classes
  • Mean $F_1$ (threat): 0.89 (High), 0.87 (Medium), 0.85 (Low)

A ResNet-50 baseline was also evaluated and found to underperform EfficientNetB4 on both classification and threat-level tasks.

7. Applications and Reproducibility

AODTA is intended for research in automated airborne object assessment tasks where both object categorization and threat stratification are central. Its curated construction, explicit dual-labeling, and balanced augmentation protocol allow reproducibility of published results and direct comparison between baseline architectures. The authors provide sufficient procedural detail to enable replication of the dataset, preprocessing, augmentation, and evaluation pipeline as described (Chatterjee et al., 17 Jan 2026).

Potential applications include automated real-time surveillance, airspace defense infrastructure, and robust threat-monitoring systems utilizing pre-localized airborne object crops. The explicit exclusion of detection tasks and focus on classification/inference distinguishes AODTA among related datasets aimed at object detection. This suggests the dataset's primary utility lies in evaluation and development of end-to-end classification systems with threat-awareness, serving both the computer vision and security analytics communities.
