
Soldier Injury Dataset: UAV & Wearable Insights

Updated 20 December 2025
  • The dataset is a comprehensive benchmark designed for binary classification of soldier injury status using synthetic aerial and wearable sensor data.
  • It features detailed labeling protocols, few-shot splits, and augmentation strategies that address challenges like domain shift and low-shot learning.
  • The resource supports rapid triage algorithm development and casualty care, with optimized evaluation metrics for both UAV-based imaging and sensor data.

The Injured vs. Uninjured Soldier Dataset refers collectively to curated benchmarks designed for machine learning–driven casualty recognition and fall detection among military personnel in operational environments. The principal datasets currently described span both synthetic visual domains for UAV triage scenarios and wearable sensor-based fall activity records, supporting binary classification of injury status and automatic detection of incidents relevant to combat casualty care (Ahmad et al., 13 Dec 2025, Soares et al., 26 Jan 2025).

1. Data Sources and Operational Context

The Injured vs. Uninjured Soldier datasets explicitly target rapid triage and casualty identification within the high-stakes “platinum minutes” and “golden hour” timeframes, where swift decisions markedly improve survival outcomes. Two main modalities are represented:

  • Synthetic Aerial Images: Generated via GPT-4o’s image API, emulating UAV/drone surveillance with high-altitude perspectives. Operational scenarios include battlefield search-and-rescue, with extensive environmental variations (weather, terrain, occlusion, lighting) (Ahmad et al., 13 Dec 2025).
  • Wearable Inertial Data: Real sensor signals from Brazilian Navy volunteers using smartwatches and smartphones during standard daily living, military maneuvers, and simulated fall events, reflecting engagement conditions and physical risks (Soares et al., 26 Jan 2025).

This approach supports development and evaluation of algorithms robust to severe domain shift (e.g., camouflage, debris, suboptimal lighting, localized or occluded injury cues).

2. Dataset Composition and Annotation Protocols

Synthetic Visual Dataset

  • Image Count and Class Distribution: 6,000 aerial-style images; 3,000 labeled “injured,” 3,000 “uninjured” (class ratio r = 1, i.e., perfectly balanced).
  • Labeling: Embedded in script-controlled generative prompts; classes strictly binary—“injured” images must show explicit signs (blood, collapse, trauma) and “uninjured” show normal posture and attire.
  • Quality Control: Prompt templates enforce representation fidelity; manual spot checks confirm label consistency and injury depiction.
  • Resolution and Preprocessing: Images resized to 336×336 (global input); patches for CLIP encoding extracted via 3×3 and 4×4 grid tiling (P = 26 patches in total), then resized to 224×224.
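
The multi-scale tiling above can be sketched as follows. This is a minimal illustration, not the authors' code: nearest-neighbour resizing stands in for whatever interpolation the pipeline actually uses, and counting the resized global view as the 26th patch is an assumption made to match P = 26 (3×3 + 4×4 = 25 tiles).

```python
import numpy as np

def resize(img, size):
    """Nearest-neighbour resize (a stand-in for the pipeline's interpolation)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def extract_patches(image, grid_sizes=(3, 4), patch_size=224):
    """Tile an image into g x g grids for each g in grid_sizes, resize every
    tile to patch_size, then append the resized global view (9 + 16 + 1 = 26)."""
    h, w = image.shape[:2]
    patches = []
    for g in grid_sizes:
        ph, pw = h // g, w // g
        for i in range(g):
            for j in range(g):
                tile = image[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
                patches.append(resize(tile, patch_size))
    patches.append(resize(image, patch_size))  # global view as the final "patch"
    return patches
```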

Wearable Sensor Dataset

  • Cohort: N = 15 military participants; metadata includes rank, age, and anthropometric dimensions.
  • Sensor Placement: Samsung Galaxy Watch 4 (both wrists); LG Velvet smartphone at chest (center of mass).
  • Activities and Segmentation: Standalone, locomotion, operational maneuvers, simulated falls—classified by scripted activity type.
  • Labeling Variants: Binary for “fall” (1) or “non-fall” (0); two distinctions for prone transitions (OM_6–8): ℓ₁ labels as fall, ℓ₂ as non-fall.
  • Windowing: Overlapping 5 s windows, 1 s stride; each inherits the parent activity’s class label.
  • No proprietary files; full CSV format and public repositories provided.
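
The windowing scheme (overlapping 5 s windows, 1 s stride, each window inheriting the parent activity's label) can be sketched as below. This is an illustrative sketch only: the 100 Hz sampling rate is an assumption not stated in this summary, and labels are taken from the window's first sample as a simplification of per-activity labeling.

```python
import numpy as np

def window_signal(samples, labels, fs=100, win_s=5, stride_s=1):
    """Segment a multi-channel signal into overlapping fixed-length windows.

    samples: (n_samples, n_channels) array
    labels:  (n_samples,) per-sample activity labels
    Each window inherits the label of the sample at its start (a stand-in
    for assigning the scripted parent activity's label)."""
    win, stride = win_s * fs, stride_s * fs
    windows, win_labels = [], []
    for start in range(0, len(samples) - win + 1, stride):
        windows.append(samples[start:start + win])
        win_labels.append(labels[start])
    return np.array(windows), np.array(win_labels)
```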

3. Data Splits, Preprocessing, and Augmentation

Synthetic Visual Dataset

  • Few-Shot Splits: For standard K-shot evaluation (K ∈ {1, 2, 4, 8, 16}), each support set S contains K images per class, with the query set Q holding the remaining images; class balance is preserved for every split.
  • Augmentation: Random resize/crop, horizontal flip during training; deliberate environmental variability in original generation.
  • Patch Extraction: Multi-scale tiling increases granularity, enabling context-enriched patch embeddings for each image.
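
A class-balanced K-shot support/query split as described above might look like the following; this is an illustrative sketch, since the papers' exact sampling code is not given here.

```python
import random

def kshot_split(items_by_class, k, seed=0):
    """Class-balanced K-shot split.

    items_by_class: dict mapping class name -> list of image ids
    Returns (support, query): k items per class in support, the rest in query."""
    rng = random.Random(seed)  # fixed seed so each run's split is reproducible
    support, query = {}, {}
    for cls, items in items_by_class.items():
        shuffled = items[:]
        rng.shuffle(shuffled)
        support[cls] = shuffled[:k]
        query[cls] = shuffled[k:]
    return support, query
```

Averaging over three runs, as in the experimental protocol, would simply repeat this with three different seeds.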

Wearable Sensor Dataset

  • Preprocessing Steps: Raw timestamp alignment, resultant magnitude computation for accelerometer/gyroscope (R = √(ax² + ay² + az²)), fixed-length vector extraction (time domain).
  • Frequency Domain Option: FFT applied to each window; DC term dropped, first half spectrum retained.
  • Splitting Protocols: Train/validate/test ratio of 60/20/20%, randomized by window but subject-consistent for cross-validation.
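
The magnitude and frequency-domain feature extraction can be sketched as below; window length and axis layout are assumptions for illustration.

```python
import numpy as np

def time_features(window):
    """Resultant magnitude per sample: R = sqrt(ax^2 + ay^2 + az^2).

    window: (n_samples, 3) tri-axial signal."""
    return np.sqrt((window ** 2).sum(axis=1))

def freq_features(window):
    """FFT per axis: rfft already returns the non-redundant half spectrum;
    slicing from index 1 drops the DC term."""
    return np.abs(np.fft.rfft(window, axis=0))[1:]
```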

4. Model Architectures and Evaluation Metrics

Synthetic Visual Classification

  • Models Evaluated: TIP-Adapter-F (cache-based CLIP adapter) and patch-driven relational cache with gated graph attention.
  • Prediction Pipeline: Training phase distills patch-wise relational structure into cache; inference requires only standard cache lookup, yielding residual fusion between similarity scores and CLIP zero-shot logits.
  • Metrics:
    • Accuracy: (TP + TN) / (TP + TN + FP + FN)
    • Precision: TP / (TP + FP)
    • Recall: TP / (TP + FN)
    • F1-score: 2 · Precision · Recall / (Precision + Recall)
  • Experimental Protocol: All benchmarks averaged over three runs with random support selection.
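
The cache-lookup-plus-residual-fusion inference described above follows the TIP-Adapter recipe. A minimal sketch is given below; the hyperparameter names alpha (residual weight) and beta (sharpness) follow that paper's convention and are assumptions here, not values from this dataset's protocol.

```python
import numpy as np

def cache_inference(test_feat, cache_keys, cache_values, clip_logits,
                    alpha=1.0, beta=5.5):
    """Cache lookup with residual fusion, TIP-Adapter style.

    test_feat:    (D,)   L2-normalised test embedding
    cache_keys:   (N, D) support embeddings (the cache keys)
    cache_values: (N, C) one-hot labels of the support set
    clip_logits:  (C,)   zero-shot CLIP logits for the test image"""
    affinity = cache_keys @ test_feat        # cosine similarity to each key
    weights = np.exp(-beta * (1.0 - affinity))  # sharpened affinity
    cache_logits = weights @ cache_values    # similarity-weighted label vote
    return clip_logits + alpha * cache_logits   # residual fusion
```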

Wearable Fall Detection

The primary metric is the Matthews correlation coefficient (MCC):

MCC = (TP × TN - FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))

  • Performance Metrics:
    • Sensitivity (Recall): SE = TP / (TP + FN)
    • Specificity: SP = TN / (TN + FP)
    • Precision: PR = TP / (TP + FP)
    • Confusion matrix: (TP, TN, FP, FN) per best model.
  • Best Results (ℓ₂ labeling, chest placement, time domain Sc4T): MCC = 0.9952, SE = 1.0000, SP = 0.9914, PR = 0.9991; confusion matrix (TP=1,092, TN=115, FP=1, FN=0) (Soares et al., 26 Jan 2025).
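
The reported figures can be sanity-checked by recomputing the metrics from the published confusion matrix:

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den

tp, tn, fp, fn = 1092, 115, 1, 0       # best model (Sc4T, chest, time domain)
print(round(mcc(tp, tn, fp, fn), 4))   # 0.9952
print(tp / (tp + fn))                  # SE = 1.0
print(round(tn / (tn + fp), 4))        # SP = 0.9914
print(round(tp / (tp + fp), 4))        # PR = 0.9991
```

All four recomputed values match the reported results.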

5. Benchmarking Results and Observations

Synthetic Visual Few-Shot Results

The patch-relational cache model demonstrated substantial improvement over baseline TIP-Adapter-F, especially in low-shot regimes:

Shots (K)   TIP-Adapter-F (%)   Patch-Relational Cache (%)
    1             54.5                   67.8
    2             70.5                   75.6
    4             74.6                   87.1
    8             78.8                   91.9
   16             85.1                   94.9

Key finding: +13.3% absolute gain at 1-shot, attributed to relational reasoning amplifying fine-grained injury cues even under occlusion and extreme class sparsity (Ahmad et al., 13 Dec 2025).

Wearable Fall Detection Results

  • Sensor Placement Impact: Chest-mounted smartphone consistently surpassed wrist-based sensors across all metrics.
  • Variables and Feature Sets: Full tri-axial linear + angular accelerations in time domain yielded optimal MCC and lowest false positive rates.
  • Activity Segmentation: Inclusion/exclusion of prone transitions (ℓ₁ and ℓ₂) impacts classifier calibration toward practical injury/operational thresholds.

A plausible implication is that integration across modalities—patch-based visual cues and inertial sensor data—could further optimize casualty identification in complex environments.

6. Use Cases and Limitations

Use Cases

  • On-board UAV triage assistance in battlefield/disaster zones.
  • Automated rapid casualty prioritization for medevac and ground rescue allocation.
  • Augmented decision support in low-visibility or high-clutter operational scenarios.
  • Real-time fall incident alerts for soldier safety and operational health monitoring.

Limitations

  • Synthetic image data lacks real-world variability (sensor noise, motion blur); may require domain adaptation for operational deployment.
  • Absence of demographic and geospatial metadata introduces potential for unknown bias.
  • Simulated fall events may not represent physiological and biomechanical signatures of live combat injuries.
  • Static scene generation precludes time-series tracking across multiple image frames.
  • Single weapon modeling does not generalize across alternate firearms and tactical equipment.

This suggests extension work should consider multimodal fusion, real-world injury data, and richer annotation for improved transferability and operational reliability.

7. Future Directions and Dataset Accessibility

  • Expansion to real UAV images and authentic combat injury records.
  • Enhancement of dataset annotation with demographic, geospatial, and temporal metadata.
  • Incorporation of additional sensor modalities (barometric, ECG, audio cues).
  • Exploration of ensemble and hybrid architectures for multi-label casualty inference.
  • Public access: Full sensor dataset, annotation schemas, and source code are released for reproducible research (Soares et al., 26 Jan 2025); synthetic dataset details and protocol outlined for integration into visual recognition model benchmarking (Ahmad et al., 13 Dec 2025).

In summary, the Injured vs. Uninjured Soldier datasets constitute a foundational resource for machine learning approaches to combat casualty care, providing rigorous benchmarks for few-shot image classification and sensor-based fall detection in time-critical operational contexts.
