Building Height Estimation Using Shadow Length in Satellite Imagery

Published 14 Nov 2024 in cs.CV and eess.IV | (2411.09411v1)

Abstract: Estimating building height from satellite imagery poses significant challenges, especially with monocular images, because essential 3D information is lost during imaging. This loss of spatial depth further complicates height estimation. We address this issue by using shadow length as an additional cue to compensate for the depth information missing from single-view imagery. We propose a novel method that first localizes a building and its shadow in the given satellite image. After localization, the shadow length is estimated using a regression model. To estimate the final height of each building, we use the principles of photogrammetry, specifically the relationship between the solar elevation angle, the vertical edge length of the building, and the length of the building's shadow. To localize buildings, we use a modified YOLOv7 detector, and to regress the shadow length for each building we use ResNet18 as the backbone architecture. Finally, we estimate the associated building height from the solar elevation and shadow length through an analytical formulation. We evaluated our method on 42 different cities, and the results show that the proposed framework surpasses state-of-the-art methods by a suitable margin.

Summary

  • The paper presents a hybrid framework that combines YOLOv7 detection and ResNet18 regression to infer building heights from shadow lengths and solar angles.
  • The methodology integrates deep learning with classical photogrammetry and is reported to reduce RMSE significantly relative to prior methods.
  • A dataset covering 42 Chinese cities was built by extending an existing one with detailed annotations (building heights, shadow lengths, bounding boxes) for training and evaluation.

The paper estimates building heights from the shadows that buildings cast in satellite imagery. Its framework combines detection, regression, and photogrammetry, and is reported to surpass contemporary methods in accuracy.

Framework Overview

The method works on monocular satellite imagery and compensates for the lost 3D spatial information by using shadow length as a cue. A modified YOLOv7 object detector first localizes buildings and their shadows in the image. A ResNet18-based regressor then predicts each shadow's length, which is combined with the solar elevation angle to infer building height using photogrammetric principles.

Figure 1: Overview of the proposed framework for building height estimation using shadow length.

The method thus combines deep learning with a classical photogrammetric model: learned components detect buildings and estimate shadow lengths, and an analytical step converts those lengths and the solar elevation angle into building heights. According to the paper, this combination yields better performance metrics than existing frameworks.
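The data flow described above can be sketched as a small composition of stages. This is a minimal illustration, not the authors' implementation: `detect` and `regress_shadow_length` are hypothetical callables standing in for the modified YOLOv7 detector and the ResNet18-based regressor, and the box format and the assumption that the regressor returns metres are choices made only for this example.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class HeightEstimate:
    box: Box
    shadow_length_m: float
    height_m: float


def estimate_heights(
    image: Any,
    solar_elevation_deg: float,
    detect: Callable[[Any], Sequence[Box]],
    regress_shadow_length: Callable[[Any, Box], float],
) -> List[HeightEstimate]:
    """Run the three stages: localize, regress shadow length, compute height."""
    if not 0.0 < solar_elevation_deg < 90.0:
        raise ValueError("solar elevation must lie strictly between 0 and 90 degrees")
    tan_elevation = math.tan(math.radians(solar_elevation_deg))

    results = []
    for box in detect(image):                                   # stage 1: localization
        shadow_length_m = regress_shadow_length(image, box)     # stage 2: shadow length
        height_m = shadow_length_m * tan_elevation              # stage 3: analytical height
        results.append(HeightEstimate(box, shadow_length_m, height_m))
    return results
```

Passing the two learned stages in as callables keeps the analytical step independent of any particular detector or regressor, so either can be swapped or stubbed out in tests.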

Dataset and Annotation

A new dataset was built by extending an existing dataset covering 42 Chinese cities with detailed annotations: building heights, shadow lengths, and bounding boxes. A custom annotation tool was developed to mark shadow lengths accurately while recording the geographical metadata needed to calculate solar elevation angles precisely.
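To illustrate why geographical metadata matters, the sketch below approximates the solar elevation angle from latitude, day of year, and local solar time. It is a textbook approximation (Cooper's declination formula plus the standard hour-angle relation), not the solar position algorithm the paper relies on; it ignores the equation of time, atmospheric refraction, and the conversion from clock time to solar time. The function name and parameters are assumptions made for this example.

```python
import math


def approx_solar_elevation_deg(latitude_deg: float, day_of_year: int,
                               solar_time_hours: float) -> float:
    """Approximate solar elevation in degrees (textbook formula, not a full SPA)."""
    # Cooper's approximation of the solar declination, in degrees.
    declination_deg = 23.45 * math.sin(2.0 * math.pi * (284 + day_of_year) / 365.0)
    # The hour angle is 0 at solar noon and changes by 15 degrees per hour.
    hour_angle_deg = 15.0 * (solar_time_hours - 12.0)

    lat = math.radians(latitude_deg)
    dec = math.radians(declination_deg)
    ha = math.radians(hour_angle_deg)

    sin_elevation = (math.sin(lat) * math.sin(dec)
                     + math.cos(lat) * math.cos(dec) * math.cos(ha))
    # Clamp against floating-point overshoot before taking the arcsine.
    sin_elevation = max(-1.0, min(1.0, sin_elevation))
    return math.degrees(math.asin(sin_elevation))
```

For example, near the March equinox (day 81) at solar noon this approximation gives an elevation of about 45 degrees at latitude 45 degrees north. A production pipeline would use a full solar position algorithm with the acquisition timestamp and coordinates of each image.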

Figure 2: (a) Box plots of root mean square error (RMSE) on the dataset across ground-truth height values. (b) Bar plot of the mean RMSE against ground-truth height. The spread of RMSE values is small for buildings in the 12-30 m range. For buildings in the 3-9 m range the spread is considerably larger, which suggests noise. Buildings taller than 30 m show very large RMSE.

An analysis of this dataset revealed notable noise and an imbalanced label distribution, particularly for short structures, which prompted specific filtering and adjustments to improve model reliability.
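The per-height error analysis summarized in Figure 2 can be reproduced in spirit by grouping samples into ground-truth height bins and computing the RMSE within each bin. The sketch below is a minimal version under stated assumptions: the function name is hypothetical, and the 3 m default bin width is a guess suggested by the height ranges quoted in the caption, not a value confirmed by the paper.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence


def rmse_by_height_bin(true_heights: Sequence[float],
                       predicted_heights: Sequence[float],
                       bin_width_m: float = 3.0) -> Dict[float, float]:
    """Return {bin lower edge in metres: RMSE of the samples in that bin}."""
    if len(true_heights) != len(predicted_heights):
        raise ValueError("inputs must have the same length")
    if bin_width_m <= 0:
        raise ValueError("bin width must be positive")

    squared_errors = defaultdict(list)
    for truth, prediction in zip(true_heights, predicted_heights):
        # Bin each sample by its ground-truth height, as in the figure.
        bin_start = math.floor(truth / bin_width_m) * bin_width_m
        squared_errors[bin_start].append((prediction - truth) ** 2)

    return {
        bin_start: math.sqrt(sum(errors) / len(errors))
        for bin_start, errors in sorted(squared_errors.items())
    }
```

Bins with few samples or an unusually high RMSE are natural candidates for the kind of filtering described above.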

Methodology

The framework's core consists of three primary stages: localization, shadow estimation, and height determination.

  1. Localization uses a modified YOLOv7 detector, which draws bounding boxes around the target buildings and their shadows.
  2. Shadow Estimation takes the localized image regions and feeds them to a ResNet18-based regression model that predicts each shadow's length.
  3. Height Determination applies a photogrammetric equation that combines the predicted shadow length with the solar elevation angle to compute building height:

$$H = S_l \tan(\sigma)$$

where $H$ is the building height, $S_l$ is the shadow length, and $\sigma$ is the solar elevation angle [REDA2004].
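As a worked illustration of this relation, the following sketch computes a height from a shadow length and a solar elevation angle. The function names, the input validation, and the pixel-to-metre conversion through a ground sampling distance are assumptions added for the example; the source only specifies the formula itself.

```python
import math


def shadow_pixels_to_metres(shadow_length_px: float, gsd_m_per_px: float) -> float:
    """Convert a shadow length in pixels to metres.

    gsd_m_per_px is the ground sampling distance of the image (an assumed input).
    """
    if shadow_length_px < 0 or gsd_m_per_px <= 0:
        raise ValueError("shadow length must be non-negative and GSD positive")
    return shadow_length_px * gsd_m_per_px


def building_height(shadow_length_m: float, solar_elevation_deg: float) -> float:
    """Height from shadow length: H = S_l * tan(sigma), with sigma in degrees."""
    if shadow_length_m < 0:
        raise ValueError("shadow length must be non-negative")
    if not 0.0 < solar_elevation_deg < 90.0:
        raise ValueError("solar elevation must lie strictly between 0 and 90 degrees")
    return shadow_length_m * math.tan(math.radians(solar_elevation_deg))
```

For instance, a 20 m shadow at a solar elevation of 45 degrees corresponds to a height of about 20 m, while the same shadow at 30 degrees corresponds to about 11.5 m. Because the tangent grows steeply at high sun angles, a fixed error in shadow length translates into a larger height error when the sun is high, which makes accurate shadow regression especially important for those images.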

The methodology tightly couples learned deep models with an analytical formulation. Because the final step is an explicit equation rather than a learned mapping, it adds little computation and makes the resulting height estimates easier to interpret.

Results and Evaluation

The framework was evaluated against baseline models, including M$^3$Net, and achieved a significant reduction in the root mean square error (RMSE) of building height estimates. The result suggests that monocular imagery combined with analytical techniques can rival, and in some cases exceed, more complex approaches based on multi-spectral imagery.

Figure 3: (a) YOLOv7 Predictions (b) Bounding box ground truth.

Conclusions

This study presents a method that uses shadow length to estimate building heights from satellite imagery, combining detection and regression models with classical photogrammetry. The resulting hybrid framework is reported to be accurate and applicable across diverse urban environments. Future work may explore alternative photogrammetric methods and broader evaluation datasets to validate these findings further. Integrated into urban planning and management tools, the technique could offer a low-cost, large-scale way to monitor urban sprawl and infrastructure development.
