Edge Based Oriented Object Detection

Published 15 Sep 2023 in cs.CV | (2309.08265v1)

Abstract: In the field of remote sensing, we often utilize oriented bounding boxes (OBB) to bound the objects. This approach significantly reduces the overlap among dense detection boxes and minimizes the inclusion of background content within the bounding boxes. To enhance the detection accuracy of oriented objects, we propose a unique loss function based on edge gradients, inspired by the similarity measurement function used in template matching tasks. During this process, we address the issues of non-differentiability of the function and the semantic alignment between gradient vectors in ground truth (GT) boxes and predicted boxes (PB). Experimental results show that our proposed loss function achieves a 0.6% mAP improvement compared to the commonly used Smooth L1 loss in the baseline algorithm. Additionally, we design an edge-based self-attention module to encourage the detection network to focus more on the object edges. Leveraging these two innovations, we achieve an mAP increase of 1.3% on the DOTA dataset.


Summary

The paper introduces an Edge-Loss function, inspired by template matching, that improves angle estimation and yields a 0.6% mAP increase over Smooth L1 loss in the baseline algorithm.
An Edge Self-Attention module integrated with an Edge-FPN steers the network toward object edges; combined with the Edge-Loss, it yields a 1.3% mAP gain on the DOTA dataset.
The approach builds on oriented bounding boxes, which avoid the background inclusion and box overlap of horizontal bounding boxes in tasks such as aerial remote sensing and text detection.

Detailed Summary of "Edge Based Oriented Object Detection"

The paper "Edge Based Oriented Object Detection" (2309.08265) presents innovations in the domain of oriented object detection, aiming to improve accuracy in detecting objects that can be oriented in arbitrary directions. Oriented object detection is a subset of object detection critical for fields such as aerial remote sensing and text detection, where targets might not align horizontally.

Introduction and Motivation

The conventional horizontal bounding box (HBB) approach is limiting when dealing with objects oriented at arbitrary angles, leading to excessive background inclusion and potential overlap among dense detection boxes. The paper addresses these problems by adopting oriented bounding boxes (OBB) with angular parameters for better enclosure of targets.
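The background-inclusion problem can be made concrete with a little geometry. The sketch below is illustrative only: the function names and the 100 x 20 pixel target are assumptions chosen for the example, not values from the paper. It computes the corners of an oriented box and the area of the smallest horizontal box that encloses it.

```python
import math

def obb_corners(cx, cy, w, h, theta):
    """Return the four corners of an oriented box; theta is in radians."""
    c, s = math.cos(theta), math.sin(theta)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner offset by theta, then translate to the box centre.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]

def enclosing_hbb_area(corners):
    """Area of the smallest axis-aligned box containing all corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

# Hypothetical elongated target (for example a ship), rotated by 45 degrees.
corners = obb_corners(0.0, 0.0, 100.0, 20.0, math.radians(45))
obb_area = 100.0 * 20.0
hbb_area = enclosing_hbb_area(corners)
background_fraction = 1.0 - obb_area / hbb_area
print(f"HBB area: {hbb_area:.0f}, background fraction: {background_fraction:.2f}")
```

For this hypothetical target the horizontal box covers 7200 square pixels against 2000 for the oriented box, so roughly 72% of the horizontal box is background.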

Methodological Innovations

The research introduces two key innovations:

Edge-Based Loss Function (Edge-Loss): Inspired by shape-based template matching, this loss function uses edge gradient vectors to improve the accuracy of angle prediction. Adapting the template-matching similarity measure to training raises two problems, the non-differentiability of the function and the semantic alignment between gradient vectors in ground truth (GT) boxes and predicted boxes (PB), and the paper addresses both. The Edge-Loss achieves a 0.6% mAP improvement over the Smooth L1 loss used in the baseline algorithm.
Edge Self-Attention Module: Designed to direct the detection network's attention toward object edges, this module incorporates edge information into a self-attention mechanism through an edge-focused Feature Pyramid Network (Edge-FPN). Together with the Edge-Loss, it raises mAP by 1.3% on the DOTA dataset.

Figure 1: The performance of some deep learning methods for oriented object detection on the DOTA dataset is not satisfactory, especially in terms of angle estimation accuracy.

The approach draws from prior work in template matching and object detection:

Template Matching: Traditional template matching methods, particularly those based on similarity calculations over edge gradients, achieve high accuracy in angle detection. The paper adapts these principles to object detection, addressing the non-differentiability and semantic misalignment that arise in doing so.
Oriented Object Detection: Methods in this domain fall into one-stage and two-stage approaches, which trade off speed and accuracy differently. A central observation of the paper is that even recent algorithms struggle with angle estimation, a gap the proposed Edge-Loss aims to fill.

Architectural and Algorithmic Design

Edge Self-Attention Mechanism

The proposed Edge-FPN modifies the traditional FPN by incorporating edge maps into the attention mechanism. The resulting attention weights emphasize features near object edges, which is intended to improve the network's angle estimation.
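The general idea of steering features toward edges can be sketched in a few lines. This summary does not give the module's actual structure, so the example below is a stand-in: it assumes a Sobel gradient magnitude as the edge map and a simple residual spatial gate, and both function names are hypothetical. It should not be read as the paper's Edge-FPN.

```python
def sobel_magnitude(img):
    """Gradient magnitude from 3x3 Sobel kernels; border pixels stay at 0."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag

def edge_attention(feature, edge_mag):
    """Residual spatial gate: feature * (1 + edge magnitude scaled to [0, 1])."""
    peak = max(max(row) for row in edge_mag) or 1.0  # avoid division by zero
    return [[f * (1.0 + e / peak) for f, e in zip(frow, erow)]
            for frow, erow in zip(feature, edge_mag)]

# Toy 5x5 image with a vertical step edge between columns 1 and 2,
# and a constant single-channel feature map of the same size.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
feature = [[1.0] * 5 for _ in range(5)]
attended = edge_attention(feature, sobel_magnitude(image))
```

In this toy case the features next to the step edge are doubled while features in flat regions are left unchanged. The residual form `1 + weight` is a design choice of the sketch so that non-edge features are preserved rather than suppressed.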

Figure 2: Construction of our network.

Edge-Loss

The Edge-Loss function reformulates the similarity measure used in template matching as a training loss. It handles the points where that measure is non-differentiable and aligns gradient vectors at semantically corresponding positions in the GT and PB boxes, so that the loss compares like with like.
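A minimal sketch of a loss in this spirit follows. It assumes the classic shape-based matching score, the mean normalized dot product between paired gradient vectors, and turns it into a loss as one minus that score. The exact formulation in the paper is not given in this summary, so the function names, the `eps` smoothing term, and the toy vectors are all assumptions of the example.

```python
import math

def edge_similarity(gt_grads, pb_grads, eps=1e-6):
    """Mean cosine similarity between paired (x, y) gradient vectors."""
    total = 0.0
    for (gx, gy), (px, py) in zip(gt_grads, pb_grads):
        dot = gx * px + gy * py
        # eps inside the square roots keeps the value finite at zero
        # gradient, one common way to smooth the non-differentiable point.
        norm = math.sqrt(gx * gx + gy * gy + eps) * math.sqrt(px * px + py * py + eps)
        total += dot / norm
    return total / len(gt_grads)

def edge_loss(gt_grads, pb_grads):
    """Loss that shrinks as paired gradient directions align."""
    return 1.0 - edge_similarity(gt_grads, pb_grads)

def rotate(vectors, theta):
    """Rotate each vector by theta radians, mimicking an angle error."""
    c, s = math.cos(theta), math.sin(theta)
    return [(x * c - y * s, x * s + y * c) for x, y in vectors]

# Toy gradients sampled at four corresponding edge points of a box.
gt = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
aligned = edge_loss(gt, gt)
off_by_10 = edge_loss(gt, rotate(gt, math.radians(10)))
off_by_30 = edge_loss(gt, rotate(gt, math.radians(30)))
```

The loss grows with the angular error (`aligned < off_by_10 < off_by_30`), which is the property that makes a gradient-direction score useful for angle supervision. The sketch pairs vectors by list position; in a detector, choosing which PB point corresponds to which GT point is the semantic alignment problem the paper describes.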

Figure 3: Construction of the edge self-attention.

Experimental Evaluation

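The paper evaluates both components on the DOTA dataset against baseline models such as Oriented R-CNN. Replacing Smooth L1 loss with the Edge-Loss improves mAP by 0.6%, and using the Edge-Loss together with the edge self-attention module improves mAP by 1.3%.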

Results and Insights

Performance Enhancement: The Edge-Loss alone improves mAP by 0.6% over Smooth L1 loss, and the Edge-Loss and Edge-FPN together improve it by 1.3% on DOTA.
Visual Improvements: The improvement is most visible for regular-shaped objects such as aircraft, vehicles, and sports fields.

Figure 4: The corresponding position in PB can introduce semantic errors, where the vector doesn't precisely align with the bow of the ship.

Figure 5: The visualized results about Edge-Loss on the DOTA dataset.

Conclusion

Integrating an edge-based loss function and an edge self-attention mechanism into an oriented object detection framework improves detection accuracy, with a combined mAP gain of 1.3% on DOTA. The gain comes mainly from more accurate angle estimation, which suggests the approach may be useful in other fields that rely on remote sensing and precise object orientation. The results reported are limited to DOTA, so validation on other datasets and settings remains open.
