A Robust and Low Complexity Deep Learning Model for Remote Sensing Image Classification

Published 5 Nov 2022 in cs.CV, cs.LG, and eess.IV | (2211.02820v2)

Abstract: In this paper, we present a robust and low-complexity deep learning model for Remote Sensing Image Classification (RSIC), the task of identifying the scene of a remote sensing image. In particular, we first evaluate several low-complexity benchmark deep neural networks: MobileNetV1, MobileNetV2, NASNetMobile, and EfficientNetB0, each of which has fewer than 5 million (M) trainable parameters. After identifying the best network architecture, we further improve its performance by applying attention schemes to multiple feature maps extracted from the middle layers of the network. To offset the increase in model footprint caused by the attention schemes, we apply quantization to keep memory occupation within a maximum of 20 MB. Through extensive experiments on the benchmark dataset NWPU-RESISC45, we obtain a robust and low-complexity model that is very competitive with state-of-the-art systems and has potential for real-life applications on edge devices.

Summary

The paper proposes a low complexity RSIC model using EfficientNetB0 with multihead attention, achieving competitive accuracy with reduced memory usage.
It integrates data augmentation, transfer learning, and quantization to optimize training and enable effective deployment on edge devices.
Results on the NWPU-RESISC45 dataset demonstrate significant accuracy gains and memory efficiency, confirming the model's practical viability.

Introduction

The paper "A Robust and Low Complexity Deep Learning Model for Remote Sensing Image Classification" addresses the critical task of remote sensing image classification (RSIC) by proposing an efficient and compact deep learning model designed for deployment on edge devices. The RSIC task involves classifying scenes captured by remote sensing technologies, which is pivotal for applications like urban planning, environmental monitoring, and natural hazard detection.

Model Architecture Overview

The proposed model combines the evaluation of lightweight deep neural network architectures, attention mechanisms, transfer learning, and quantization to keep the model within its memory budget while maintaining high accuracy. The architecture incorporates the following components:

Benchmark Low-Complexity Networks: Initial experimentation involves four networks: MobileNetV1, MobileNetV2, NASNetMobile, and EfficientNetB0. EfficientNetB0 is identified as the optimal choice due to its favorable trade-off between accuracy and model complexity.
Figure 1: The high-level architecture of the proposed RSIC system.
Data Augmentation: Various data augmentation techniques, including random cropping, noise addition, and mixup, are employed to enhance the training dataset and improve generalization.
Figure 2: Data augmentation methods: Rotation, Random Cropping, Random Erasing, Adding Noise, and Mixup, in that order.
Transfer Learning: EfficientNetB0 is pre-trained on the extensive ImageNet dataset and fine-tuned for RSIC, leveraging transfer learning to accelerate convergence and improve model accuracy.
Figure 3: Apply the transfer learning technique for the proposed deep neural network classification.
Attention Mechanisms and Multihead Attention: A novel multihead attention layer is introduced, focusing on enhancing feature map regions critical for classification. This layer takes into account spatial and channel dimensions to better represent feature dependencies.
Figure 4: Apply attention schemes to further improve the proposed deep neural network classification.

Figure 5: Proposed Multihead attention based layer.
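
Of the augmentation methods listed above, mixup is simple to state precisely: two training samples and their one-hot labels are blended with a single weight drawn from a Beta distribution. The following is a minimal sketch of that idea, not the authors' implementation. The function name `mixup_pair`, the value `alpha=0.4`, and the use of flat Python lists in place of image tensors are assumptions made for illustration.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.4, rng=random):
    """Blend two samples and their one-hot labels with one shared weight.

    x1, x2: flattened images (lists of floats), stand-ins for image tensors.
    y1, y2: one-hot label vectors of equal length.
    alpha:  Beta distribution parameter (assumed value, not from the paper).
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Usage: mix a "bright" image of class 0 with a "dark" image of class 1.
rng = random.Random(0)
mixed_x, mixed_y, lam = mixup_pair(
    [1.0, 1.0], [1.0, 0.0], [0.0, 0.0], [0.0, 1.0], rng=rng
)
```

Because the same weight is applied to inputs and labels, the mixed label remains a valid probability vector, which is what lets the blended sample be trained with an ordinary cross-entropy loss.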

Results and Evaluation

The model is evaluated on the NWPU-RESISC45 dataset with 10% and 20% of the data used for training, and it achieves competitive accuracy in both settings. Quantization then reduces the model's memory footprint to under 9.4 MB, which is within edge-device constraints.

EfficientNetB0 with multihead attention achieves higher accuracy than the baseline models without attention.
Results indicate that the proposed model's accuracy is competitive with state-of-the-art methods, with efficient memory usage that is suitable for edge deployment.
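
The memory saving from quantization follows from simple arithmetic: storing each weight in 8 bits instead of 32 cuts weight storage by roughly a factor of four. The sketch below illustrates this with symmetric per-tensor int8 quantization. The summary does not say which quantization scheme the authors used, so the scheme, the function names, and the 5 M parameter count (the upper bound quoted in the abstract, not a measured figure) are all assumptions.

```python
def quantize_symmetric_int8(weights):
    """Map floats to integers in [-127, 127] using one scale per tensor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0  # guard all-zero tensors
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from their int8 codes."""
    return [v * scale for v in q]


def footprint_mb(num_params, bytes_per_param):
    """Weight storage in MB, taking 1 MB as 10**6 bytes (assumed convention)."""
    return num_params * bytes_per_param / 1e6


# Usage: quantize a toy weight vector and compare storage costs.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale = quantize_symmetric_int8(weights)
restored = dequantize(codes, scale)  # each value is within scale / 2 of the original

hypothetical_params = 5_000_000  # assumed upper bound, not the paper's exact count
size_fp32 = footprint_mb(hypothetical_params, 4)  # 32-bit floats
size_int8 = footprint_mb(hypothetical_params, 1)  # 8-bit integers
```

Under these assumptions, the float32 weights occupy 20 MB and the int8 weights 5 MB. A real model file also stores scales, layers added on top of the backbone, and metadata, so actual sizes will differ from this estimate.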

Conclusion

The work combines transfer learning, attention mechanisms, and model quantization to produce a low-complexity yet accurate RSIC model. It lays the groundwork for deploying deep learning models on resource-constrained devices without sacrificing performance. By balancing accuracy against computational cost, the approach is a step toward practical use in real-world remote sensing tasks.
