Large-scale Building Damage Assessment using a Novel Hierarchical Transformer Architecture on Satellite Images

Published 3 Aug 2022 in cs.CV | (2208.02205v3)

Abstract: This paper presents DAHiTrA, a novel deep-learning model with hierarchical transformers to classify building damage based on satellite images in the aftermath of natural disasters. Satellite imagery provides real-time and high-coverage information and offers opportunities to inform large-scale post-disaster building damage assessment, which is critical for rapid emergency response. In this work, a novel transformer-based network is proposed for assessing building damage. This network leverages hierarchical spatial features of multiple resolutions and captures the temporal differences in the feature domain after applying a transformer encoder on the spatial features. The proposed network achieves state-of-the-art performance when tested on a large-scale disaster damage dataset (xBD) for building localization and damage classification, as well as on the LEVIR-CD dataset for change detection tasks. In addition, this work introduces a new high-resolution satellite imagery dataset, Ida-BD (related to 2021 Hurricane Ida in Louisiana), for domain adaptation. Further, it demonstrates an approach to using this dataset by adapting the model with limited fine-tuning and hence applying the model to newly damaged areas with scarce data.



Summary

The paper presents DAHiTrA, a model that uses hierarchical transformers to extract multi-resolution spatial features and the temporal differences between pre- and post-disaster images for building damage detection.
It reports state-of-the-art performance on the xBD dataset, with higher F1 scores and IoU than CNN-based approaches, which the summary attributes to differencing pre- and post-disaster features early in the network and to cross-temporal attention.
The model adapts to the new Ida-BD dataset with limited fine-tuning, which supports its use in new post-disaster scenarios where labeled data is scarce.


Introduction

The paper, "Large-scale Building Damage Assessment using a Novel Hierarchical Transformer Architecture on Satellite Images," presents DAHiTrA, a deep-learning model that uses hierarchical transformers to classify building damage from satellite images taken after a disaster. Working from high-resolution satellite imagery, DAHiTrA aims to streamline large-scale damage assessment, an essential step for efficient emergency response. The architecture combines hierarchical spatial feature encoding with temporal difference analysis, and it achieves state-of-the-art performance on the xBD dataset for both building localization and damage classification.

Rapid damage assessment after a disaster calls for automated systems that can process satellite imagery efficiently. Previous methods relied mainly on CNNs, typically concatenating pre- and post-disaster features and treating the problem as segmentation. DAHiTrA builds on these methods by adding transformer-based features and hierarchical processing, which improves damage classification performance.

Model Architecture

DAHiTrA combines a hierarchical UNet-based architecture with transformer modules. The model passes each pair of pre- and post-disaster satellite images through a convolutional encoder to extract spatial features at multiple resolutions. At each resolution, a transformer-based difference block maps the two feature sets into a common domain before comparing them, so that the resulting difference reflects damage rather than incidental variation between the two images.
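The idea of mapping both feature maps into a common domain before differencing can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the paper's implementation: the single-head `self_attention`, the weight matrices `w_q`, `w_k`, `w_v`, and the plain subtraction in `difference_block` are all assumptions standing in for the transformer encoder and decoder described above. The one property it is meant to show is that the same weights are applied to both time steps, so identical inputs yield a zero difference.

```python
import numpy as np

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over (n_tokens, dim) tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def difference_block(feat_pre, feat_post, w_q, w_k, w_v):
    """Hypothetical difference block for (C, H, W) feature maps.

    Both maps are flattened to tokens, encoded with the SAME attention
    weights (the "common domain"), and then subtracted.
    """
    c, h, w = feat_pre.shape
    tok_pre = feat_pre.reshape(c, h * w).T    # (H*W, C)
    tok_post = feat_post.reshape(c, h * w).T  # (H*W, C)
    enc_pre = self_attention(tok_pre, w_q, w_k, w_v)
    enc_post = self_attention(tok_post, w_q, w_k, w_v)
    return (enc_post - enc_pre).T.reshape(c, h, w)

# Toy usage with random weights and an 8-channel, 4x4 feature map.
rng = np.random.default_rng(0)
channels, height, width = 8, 4, 4
w_q, w_k, w_v = (rng.normal(size=(channels, channels)) for _ in range(3))
pre = rng.normal(size=(channels, height, width))
post = pre.copy()
post[:, 0, 0] += 5.0  # simulate a change at one location
diff = difference_block(pre, post, w_q, w_k, w_v)
```

In a trained model the weights would be learned, and the difference would feed a decoder rather than being used directly.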

Figure 1: The model architecture for damage detection and classification.

A key innovation in DAHiTrA is the difference block, which is composed of paired transformer encoders and decoders. These blocks extract temporal differences at multiple resolutions, which helps the model classify damage at different spatial scales. The damage mask is then built hierarchically: features from lower-resolution layers are repeatedly upsampled and concatenated with those of the next, finer layer.
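The upsample-and-concatenate step can be illustrated with a short NumPy sketch. It assumes each level doubles the spatial resolution of the previous one and uses nearest-neighbour upsampling; the function names `upsample2x` and `fuse_hierarchy` are hypothetical, and the convolutions that a UNet-style decoder would normally apply after each concatenation are omitted.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse_hierarchy(diff_maps):
    """Fuse difference maps ordered coarse -> fine.

    Each map has shape (C_i, H_i, W_i), and each level is assumed to have
    twice the height and width of the previous one. The running result is
    upsampled and concatenated (along channels) with the next finer map.
    """
    fused = diff_maps[0]
    for finer in diff_maps[1:]:
        fused = np.concatenate([upsample2x(fused), finer], axis=0)
    return fused

# Toy pyramid: 4 channels at 2x2, 3 channels at 4x4, 2 channels at 8x8.
rng = np.random.default_rng(0)
coarse = rng.normal(size=(4, 2, 2))
middle = rng.normal(size=(3, 4, 4))
fine = rng.normal(size=(2, 8, 8))
fused = fuse_hierarchy([coarse, middle, fine])
print(fused.shape)  # (9, 8, 8)
```

The fused tensor keeps coarse context and fine detail side by side at the finest resolution, which is what a final per-pixel classifier would consume.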

Comparison with Existing Models

On the xBD benchmark, DAHiTrA outperforms prior state-of-the-art methods. Traditional Siamese and CNN architectures aggregate pre- and post-disaster features at later stages of the model. DAHiTrA instead computes feature differences earlier, at each resolution of the encoder, which improves both localization and classification accuracy.

The transformer-based design also gives DAHiTrA an advantage over fusion-based models such as BDANet and Dual-HRNet: it incorporates cross-temporal attention and builds features hierarchically, which results in cleaner output masks and more accurate segmentation.

Figure 2: Comparing the model architecture of DAHiTrA with two recent works for change detection, ChangeFormer and BiT.

Evaluation and Results

Quantitative evaluation on the xBD dataset shows that DAHiTrA achieves higher F1 scores and IoU on the damage detection task than models such as Siamese UNet and RescueNet. Qualitative results show damage masks with less noise and fewer propagated errors.
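For reference, F1 and IoU for a binary mask can both be computed from pixel-level true positives, false positives, and false negatives. The sketch below is a generic definition, not the paper's evaluation code; the function name `f1_and_iou` and the convention of returning 1.0 when both masks are empty are assumptions.

```python
import numpy as np

def f1_and_iou(pred, target):
    """Pixel-wise F1 and IoU for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = int(np.logical_and(pred, target).sum())
    fp = int(np.logical_and(pred, ~target).sum())
    fn = int(np.logical_and(~pred, target).sum())
    # Assumed convention: two empty masks count as a perfect match.
    f1 = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    return float(f1), float(iou)

# Toy masks: 2 true positives, 1 false positive, 1 false negative.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
f1, iou = f1_and_iou(pred_mask, true_mask)
print(round(f1, 4), round(iou, 4))  # 0.6667 0.5
```

The two metrics are monotonically related for a single mask, since F1 = 2·IoU / (1 + IoU), so they rank single-class predictions identically but differ in scale.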

DAHiTrA is also evaluated on change detection using the LEVIR-CD dataset, where it improves on recent transformer-based models, partly because of its multi-resolution feature extraction and hierarchical processing.

Figure 3: Qualitative results for damage classification (evaluation on xBD dataset).

Domain Adaptation

A notable contribution of the paper is the Ida-BD dataset, built from high-resolution satellite imagery of the 2021 Hurricane Ida in Louisiana, which supports domain adaptation from the xBD dataset. Adapting DAHiTrA to Ida-BD shows that the model can handle a new disaster scenario with limited fine-tuning, a critical advantage when a new event offers little labeled data and a response is needed quickly.
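One common form of limited fine-tuning is to keep the feature extractor frozen and update only the final classification layer on a small labeled target set. The summary does not say which layers the authors update, so the NumPy sketch below is an assumption about the general technique, not the paper's procedure. The feature vectors stand in for the output of a frozen encoder, and `finetune_head`, `bce_loss`, and the synthetic data are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(features, labels, w, b):
    """Mean binary cross-entropy of a linear head on fixed features."""
    probs = np.clip(sigmoid(features @ w + b), 1e-12, 1 - 1e-12)
    return float(-np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs)))

def finetune_head(features, labels, w, b, lr=0.5, steps=200):
    """Gradient descent on the head only; the encoder features are never changed."""
    w, b = w.copy(), float(b)
    for _ in range(steps):
        grad = sigmoid(features @ w + b) - labels  # dLoss/dlogits
        w -= lr * (features.T @ grad) / len(labels)
        b -= lr * grad.mean()
    return w, b

# Synthetic stand-in for a small target-domain set (40 samples, 5 features).
rng = np.random.default_rng(0)
target_feats = rng.normal(size=(40, 5))
true_w = rng.normal(size=5)
target_labels = (target_feats @ true_w + 0.5 > 0).astype(float)

# Head weights "pretrained" on a source domain (random here, for illustration).
source_w, source_b = rng.normal(size=5), 0.0

new_w, new_b = finetune_head(target_feats, target_labels, source_w, source_b)
loss_before = bce_loss(target_feats, target_labels, source_w, source_b)
loss_after = bce_loss(target_feats, target_labels, new_w, new_b)
```

Because only a handful of parameters are trained, this kind of adaptation needs few labeled samples, which matches the scarce-data setting the paper targets.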

Figure 4: Qualitative results for domain adaptation (evaluation on Ida-BD dataset).

Conclusion

DAHiTrA combines hierarchical feature extraction with transformer-based temporal difference modeling to improve the accuracy of large-scale building damage assessment. The Ida-BD experiment suggests that the model can be carried over to new disaster areas with limited fine-tuning. Future work may focus on refining boundary detection and exploring dynamic learning algorithms to further improve performance across tasks and datasets. With these advances, DAHiTrA could meaningfully inform decision-making in disaster response and management.