Direction-aware Spatial Context Features for Shadow Detection

Published 12 Dec 2017 in cs.CV | (1712.04142v2)

Abstract: Shadow detection is a fundamental and challenging task, since it requires an understanding of global image semantics and there are various backgrounds around shadows. This paper presents a novel network for shadow detection by analyzing image context in a direction-aware manner. To achieve this, we first formulate the direction-aware attention mechanism in a spatial recurrent neural network (RNN) by introducing attention weights when aggregating spatial context features in the RNN. By learning these weights through training, we can recover direction-aware spatial context (DSC) for detecting shadows. This design is developed into the DSC module and embedded in a CNN to learn DSC features at different levels. Moreover, a weighted cross entropy loss is designed to make the training more effective. We employ two common shadow detection benchmark datasets and perform various experiments to evaluate our network. Experimental results show that our network outperforms state-of-the-art methods and achieves 97% accuracy and 38% reduction on balance error rate.

Summary

The paper proposes the novel DSC module that integrates a spatial RNN with attention to learn directional context for shadow detection.
It achieves significant performance improvements, with up to 97% accuracy and a 38% reduction in BER on benchmark datasets.
The network is trained end-to-end with deep supervision, and the design may extend to other image segmentation tasks.

Direction-aware Spatial Context Features for Shadow Detection

This paper proposes an approach to shadow detection that analyzes image context in a direction-aware manner. A spatial recurrent neural network (RNN) is extended with an attention mechanism that weights the context gathered along each direction, and the resulting module is embedded in a convolutional neural network (CNN). On the reported benchmarks, the approach raises shadow detection accuracy and lowers the balance error rate (BER).

Methodology

Architecture

The core innovation of the paper is the Direction-aware Spatial Context (DSC) module integrated into a CNN for shadow detection. The DSC module employs a spatial RNN to aggregate spatial context along four principal directions (left, right, up, and down) and applies attention weights to balance the contribution from each direction.

Figure 1: The schematic illustration of the overall shadow detection network.

The network architecture proceeds as follows:

Feature Extraction: Utilize CNN layers to extract features at multiple scales from the input image.
DSC Module Integration: Embed a DSC module at each feature level to learn direction-aware spatial context.
Feature Concatenation: DSC features are concatenated with convolutional features, upsampled, and fused to create a multi-level integrated feature (MLIF).
Score Map Prediction: Use deep supervision to predict score maps for each layer, which are fused to generate the final shadow detection map.
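
The final fusion step can be illustrated with a minimal sketch. It assumes square score maps of logits, nearest-neighbour upsampling, and fusion by averaging followed by a sigmoid; the summary does not specify how the per-layer maps are fused, so these choices and the names `upsample_nearest` and `fuse_score_maps` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def upsample_nearest(fmap, factor):
    """Repeat every value `factor` times along both axes."""
    return [[v for v in row for _ in range(factor)]
            for row in fmap for _ in range(factor)]

def fuse_score_maps(score_maps, out_size):
    """Average per-layer logit maps at a common resolution (assumed fusion rule).

    Each map is square and its side length must divide out_size.
    """
    resized = [upsample_nearest(m, out_size // len(m)) for m in score_maps]
    n = len(resized)
    return [[sigmoid(sum(m[i][j] for m in resized) / n)
             for j in range(out_size)]
            for i in range(out_size)]

# Hypothetical score maps from a coarse layer (1x1) and a fine layer (2x2).
coarse = [[2.0]]
fine = [[0.0, 0.0], [0.0, 4.0]]
fused = fuse_score_maps([coarse, fine], out_size=2)
```

In this toy case the coarse map raises the score everywhere, while the fine map decides which pixel stands out, which is the intuition behind combining predictions from several levels.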

The entire network is trained end-to-end with a weighted cross-entropy loss. The weighting is intended to keep both shadow and non-shadow pixels contributing to training, even when one class covers far less of the image.
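
As an illustration of the idea, here is a minimal sketch of a class-balanced binary cross-entropy. The weighting scheme (each class weighted by the other class's share of pixels) is a common choice assumed here for illustration; it is not necessarily the exact loss used in the paper, and `weighted_bce` is a hypothetical name.

```python
import math

def weighted_bce(probs, labels, eps=1e-7):
    """Class-balanced binary cross-entropy over a flat list of pixels.

    probs  : predicted shadow probabilities in [0, 1]
    labels : ground truth, 1 for shadow and 0 for non-shadow
    """
    total = len(labels)
    if total == 0:
        raise ValueError("need at least one pixel")
    n_pos = sum(labels)
    n_neg = total - n_pos
    w_pos = n_neg / total  # the rarer class receives the larger weight
    w_neg = n_pos / total
    loss = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
    return loss / total

# One shadow pixel among four: both classes end up with equal total weight.
uncertain = weighted_bce([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 0])
confident = weighted_bce([0.9, 0.1, 0.1, 0.1], [1, 0, 0, 0])
```

With this weighting, the single shadow pixel carries as much total weight as the three non-shadow pixels combined, so a network cannot lower the loss much by predicting "non-shadow" everywhere.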

Direction-aware Spatial Context Module

The DSC module is built upon the spatial RNN concept, but differs by utilizing direction-aware attention. This attention mechanism selectively emphasizes spatial context features learned in different directions.

Figure 2: The schematic illustration of the direction-aware spatial context module.

The computation within each DSC module is structured as follows:

Spatial RNN Aggregation: Run two rounds of recurrent propagation along the four directions. One round gives each location context from its own row and column; the second extends this to the whole feature map.
Attention Mechanism: Assign learnable attention weights to the spatial context aggregated in each direction.
Feature Concatenation and Translation: Concatenate the weighted directional features and apply $1\times1$ convolutions to reduce their dimensionality.
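
The three steps above can be sketched on a single-channel feature map in plain Python. This is a simplified illustration under stated assumptions, not the paper's implementation: the recurrence is a ReLU update with one scalar weight `alpha`, the attention maps are passed in as fixed inputs rather than learned, and an equal-weight average stands in for concatenation followed by a $1\times1$ convolution. All function names are hypothetical.

```python
DIRECTIONS = ("left", "right", "up", "down")

def _sweep_right(fmap, alpha):
    """Left-to-right ReLU recurrence along every row; returns a new map."""
    out = []
    for row in fmap:
        new_row, carry = [], 0.0
        for value in row:
            carry = max(alpha * carry + value, 0.0)
            new_row.append(carry)
        out.append(new_row)
    return out

def _flip(fmap):
    return [row[::-1] for row in fmap]

def _transpose(fmap):
    return [list(col) for col in zip(*fmap)]

def sweep(fmap, direction, alpha=1.0):
    """Propagate context across the map in one of the four directions."""
    if direction == "right":
        return _sweep_right(fmap, alpha)
    if direction == "left":
        return _flip(_sweep_right(_flip(fmap), alpha))
    if direction == "down":
        return _transpose(_sweep_right(_transpose(fmap), alpha))
    if direction == "up":
        return _transpose(_flip(_sweep_right(_flip(_transpose(fmap)), alpha)))
    raise ValueError(f"unknown direction: {direction}")

def dsc_round(fmap, attention, alpha=1.0):
    """Sweep four ways, weight each result by its attention map, then average.

    The average is an assumed stand-in for concatenation + 1x1 convolution.
    """
    height, width = len(fmap), len(fmap[0])
    merged = [[0.0] * width for _ in range(height)]
    for d in DIRECTIONS:
        swept = sweep(fmap, d, alpha)
        for i in range(height):
            for j in range(width):
                merged[i][j] += attention[d][i][j] * swept[i][j] / len(DIRECTIONS)
    return merged

def dsc_module(fmap, attention, rounds=2, alpha=1.0):
    """Apply the round repeatedly, reusing the same attention maps."""
    out = fmap
    for _ in range(rounds):
        out = dsc_round(out, attention, alpha)
    return out

# A single active pixel in the centre of a 3x3 map, uniform attention.
x = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
ones = {d: [[1.0] * 3 for _ in range(3)] for d in DIRECTIONS}
one_round = dsc_module(x, ones, rounds=1)
two_rounds = dsc_module(x, ones, rounds=2)
```

After one round the centre pixel has only reached its own row and column, so the corners of `one_round` are still zero; after the second round the corners of `two_rounds` are non-zero. Setting an attention map to zero removes that direction's contribution entirely, which is the sense in which attention selects among directions.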

Experimental Results

The proposed network was evaluated against state-of-the-art methods on two benchmark datasets, the SBU Shadow Dataset and the UCF Shadow Dataset. It reaches the highest accuracy and the lowest BER on both, with 97% accuracy on SBU and a 38% reduction in BER.
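
For readers unfamiliar with the metric, the following sketch computes accuracy and BER from binary masks using the standard definition of BER (one minus the mean of the shadow and non-shadow recall). Reporting BER on a 0 to 100 scale is an assumption made to match the magnitude of the values in the table below; the function names are hypothetical.

```python
def accuracy(pred, truth):
    """Fraction of pixels labelled correctly."""
    return sum(1 for p, t in zip(pred, truth) if p == t) / len(truth)

def balance_error_rate(pred, truth):
    """BER = 100 * (1 - 0.5 * (TP / N_pos + TN / N_neg))."""
    n_pos = sum(1 for t in truth if t == 1)
    n_neg = len(truth) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("BER needs both shadow and non-shadow pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return 100.0 * (1.0 - 0.5 * (tp / n_pos + tn / n_neg))

# Toy example: 2 shadow pixels out of 10, one of them missed.
truth = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
acc = accuracy(pred, truth)            # 0.9
ber = balance_error_rate(pred, truth)  # 25.0
```

The toy example shows why both metrics are reported: missing half of the shadow pixels still yields 90% accuracy because shadows are rare, whereas BER weights the two classes equally and exposes the error.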

Figure 3: Visual comparison of shadow maps produced by our method and other methods against ground truths.

Table 1 summarizes the comparison between the proposed DSC approach and other methods in terms of accuracy and balance error rate. The DSC network consistently outperforms others in detecting complex shadow patterns, showing resilience against false positives often caused by black objects misrecognized as shadows.

Method	SBU Accuracy	SBU BER	UCF Accuracy	UCF BER
DSC (ours)	0.97	5.59	0.95	8.10
scGAN	0.90	9.10	0.87	11.50
Stacked-CNN	0.88	11.00	0.85	13.00
Unary-Pairwise	0.86	25.03	-	-
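
The quoted 38% reduction appears consistent with the SBU columns of the table when DSC is compared with the next-best method, scGAN. The short check below uses only values copied from the table; `relative_reduction` is a hypothetical helper.

```python
def relative_reduction(baseline, improved):
    """Fractional decrease of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline

# BER values copied from the table above.
ber = {
    "SBU": {"DSC": 5.59, "scGAN": 9.10},
    "UCF": {"DSC": 8.10, "scGAN": 11.50},
}

sbu_drop = round(100 * relative_reduction(ber["SBU"]["scGAN"], ber["SBU"]["DSC"]), 1)
ucf_drop = round(100 * relative_reduction(ber["UCF"]["scGAN"], ber["UCF"]["DSC"]), 1)
print(sbu_drop, ucf_drop)  # 38.6 29.6
```

By the same calculation, the relative BER reduction on UCF is about 30%, so the headline figure describes the SBU result.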

Conclusion

The paper introduces a highly effective network for single-image shadow detection that leverages direction-aware spatial context via a novel DSC module. The method enables finer discrimination between shadow and non-shadow regions by understanding and utilizing the directionality of spatial context features. The robust performance across different datasets indicates that this network architecture could be extended to other image segmentation tasks, such as saliency detection and semantic image segmentation. Future work may explore real-time applications and the detection of time-varying shadows in dynamic scenes.