xBD: A Dataset for Assessing Building Damage from Satellite Imagery

Published 21 Nov 2019 in cs.CV | (1911.09296v1)

Abstract: We present xBD, a new, large-scale dataset for the advancement of change detection and building damage assessment for humanitarian assistance and disaster recovery research. Natural disaster response requires an accurate understanding of damaged buildings in an affected region. Current response strategies require in-person damage assessments within 24-48 hours of a disaster. Massive potential exists for using aerial imagery combined with computer vision algorithms to assess damage and reduce the potential danger to human life. In collaboration with multiple disaster response agencies, xBD provides pre- and post-event satellite imagery across a variety of disaster events with building polygons, ordinal labels of damage level, and corresponding satellite metadata. Furthermore, the dataset contains bounding boxes and labels for environmental factors such as fire, water, and smoke. xBD is the largest building damage assessment dataset to date, containing 850,736 building annotations across 45,362 km$^2$ of imagery.




Summary

The paper presents xBD as the largest annotated dataset for building damage assessment via satellite imagery.
It details a robust multi-stage annotation process and a four-tier damage classification system informed by FEMA guidelines.
Baseline models utilizing modified U-Net and ResNet50 architectures achieve moderate performance, indicating room for future improvements.


The paper "xBD: A Dataset for Assessing Building Damage from Satellite Imagery" introduces xBD, a large dataset designed to advance building damage assessment from satellite imagery, targeting applications in humanitarian assistance and disaster recovery (HADR). The authors present xBD as the largest existing dataset for building damage assessment, with over 850,000 annotated buildings across more than 45,000 km$^2$ of imagery.

Dataset Composition and Characteristics

The xBD dataset is constructed from high-resolution pre- and post-event satellite imagery and covers a wide range of geographies and disasters. It encompasses multiple disaster types, including hurricanes, earthquakes, and floods, with accompanying satellite metadata and damage annotations. xBD also includes bounding boxes and labels for environmental factors such as fire, water, and smoke, which give models additional visual context for damage detection.

Coverage by Event Type:

Figure 1: From top left (clockwise): Hurricane Harvey; Palu Tsunami; Mexico City Earthquake; Santa Rosa Fire. Imagery from DigitalGlobe.

Figure 3: Disaster types and disasters represented in xBD around the world.

The dataset strategically integrates both damaged and undamaged structures to aid in accurate model training. It follows a four-tiered damage classification system, prioritizing granularity over simplistic binary labels. This classification framework was informed by the FEMA HAZUS model and corroborated with insights from disaster experts.
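To make the difference between ordinal and binary labels concrete, the following is a minimal sketch of a four-tier scale as an ordered enumeration. The tier names and the 0-3 integer encoding are assumptions for illustration; this summary does not spell them out, and the helper functions `to_binary` and `tier_distance` are hypothetical.

```python
from enum import IntEnum


class DamageLevel(IntEnum):
    """Four ordinal damage tiers; names and integer values are assumed here."""
    NO_DAMAGE = 0
    MINOR_DAMAGE = 1
    MAJOR_DAMAGE = 2
    DESTROYED = 3


def to_binary(level: DamageLevel) -> bool:
    """Collapse the four tiers to a damaged/undamaged flag (loses granularity)."""
    return level > DamageLevel.NO_DAMAGE


def tier_distance(a: DamageLevel, b: DamageLevel) -> int:
    """Number of tiers separating two labels; only meaningful for ordinal labels."""
    return abs(int(a) - int(b))


labels = [DamageLevel.NO_DAMAGE, DamageLevel.MINOR_DAMAGE, DamageLevel.DESTROYED]
print([to_binary(level) for level in labels])
print(tier_distance(DamageLevel.MINOR_DAMAGE, DamageLevel.DESTROYED))
```

A binary label treats minor damage and total destruction as the same outcome, whereas the ordinal encoding preserves how far apart two assessments are.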

Dataset Collection and Annotation Process

The xBD dataset is compiled through a meticulous multi-step annotation process. Imagery is sourced primarily from the Maxar/DigitalGlobe Open Data Program, ensuring consistent quality and resolution. Pre-disaster building footprint annotations provide the baseline for post-disaster damage assessment.

Annotation Steps Involved:

Triage: Identifies imagery containing damage.
Polygon Annotation: Utilizes pre-disaster imagery for footprint delineation.
Damage Classification: Applies the Joint Damage Scale to post-disaster imagery.
Quality Control: Ensures high annotation accuracy with expert reviews and iterative refinement processes.

Figure 2: Building polygons (shown in green) on imagery from Hurricane Michael (2018).

Through this systematic approach, xBD achieves high annotation fidelity: expert verification found that roughly 2-3% of the initial labels were incorrect, and these were subsequently corrected.

Model Baselines and Evaluation

Alongside the dataset, the paper provides baseline models built on a modified U-Net architecture and a ResNet50 backbone.

Modeling Insights:

Localization Model: Modifies U-Net to localize building footprints in the imagery, achieving an IoU of 0.66 for buildings.
Classification Model: Combines outputs from a ResNet50 and a shallower CNN, optimizing via ordinal cross-entropy loss to respect the ordinal nature of damage classifications.

Figure 6: Architecture of the baseline classification model.
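The two quantities named in the modeling list, IoU and an ordinal cross-entropy loss, can be sketched in plain Python. This is an illustration only: the cumulative-threshold encoding used here is one common way to build an ordinal loss and is an assumption, not necessarily the formulation used in the paper. The function names, the toy masks, and the probabilities are all hypothetical.

```python
import math


def iou(pred_mask, true_mask):
    """Intersection over union of two equally sized binary masks (lists of rows)."""
    inter = union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Convention chosen here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def ordinal_targets(level, num_classes=4):
    """Encode level k as K-1 cumulative bits: bit j answers 'is damage > j?'."""
    return [1.0 if level > j else 0.0 for j in range(num_classes - 1)]


def ordinal_cross_entropy(probs, level, num_classes=4, eps=1e-7):
    """Sum of binary cross-entropies over the K-1 threshold probabilities."""
    loss = 0.0
    for p, t in zip(probs, ordinal_targets(level, num_classes)):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return loss


pred = [[1, 1, 0], [0, 1, 0]]
true = [[1, 0, 0], [0, 1, 1]]
print(iou(pred, true))  # 2 overlapping pixels out of 4 in the union

# Threshold probabilities that lean towards level 2 (assumed encoding 0..3).
probs = [0.9, 0.8, 0.1]
for level in range(4):
    print(level, round(ordinal_cross_entropy(probs, level), 3))
```

With this encoding, a prediction that leans towards level 2 is penalized more when the true level is 0 than when it is 3, which is the sense in which the loss respects the ordering of the damage classes.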

Despite these efforts, class imbalance in the dataset and subtle visual indicators of damage limit performance. The classification baseline achieves a weighted F1 score of 0.2654, leaving substantial room for improved training strategies and more sophisticated models.
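As a rough illustration of how a weighted F1 score interacts with class imbalance, the sketch below assumes that "weighted" means averaging per-class F1 by class support; this summary does not state the exact weighting, so treat that as an assumption. The labels are toy data, not results from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, computed from true/false positives and false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(per_class_f1(y_true, y_pred, c) * n / total for c, n in support.items())


# Toy, heavily imbalanced labels: 8 undamaged buildings, 1 minor, 1 destroyed.
y_true = [0] * 8 + [1, 3]
y_pred = [0] * 10  # a degenerate model that always predicts "no damage"

print(round(per_class_f1(y_true, y_pred, 0), 4))
print(round(weighted_f1(y_true, y_pred), 4))
```

In this toy case the score is driven almost entirely by the majority class even though every damaged building is missed, which is why per-class scores are worth inspecting alongside any single aggregate number on imbalanced data.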

Conclusion

The development of xBD marks a crucial step in automating the building damage assessment process via satellite imagery, potentially transforming HADR operations. This dataset equips researchers with a rich resource to develop more robust AI models capable of nuanced damage detection across diverse geographic and environmental conditions.

The release of xBD opens avenues for future work on model generalization, using the dataset's breadth to build solutions that transfer across disparate disaster types and regions. Such advances could make disaster response faster, more precise, and safer.