Deep Learning and Geometric Modeling Fusion for Unexpected Obstacle Detection in Autonomous Driving
- The paper introduces a fusion framework combining pixel-wise deep learning segmentation and stereo vision-based geometric modeling for robust obstacle detection.
- The approach achieves 30% higher recall for rare obstacles and reduces false positives through Bayesian integration and spatial post-processing.
- Empirical results on the Lost and Found dataset validate its real-time potential on automotive-grade hardware for safety-critical perception.
Overview
The paper "Detecting Unexpected Obstacles for Self-Driving Cars: Fusing Deep Learning and Geometric Modeling" (arXiv:1612.06573) addresses the problem of reliably detecting unexpected obstacles in self-driving car applications. Traditional perception systems often struggle to detect small, unusual, or previously unseen obstacles, particularly those not represented in training data or semantic maps. This work addresses these limitations by proposing a hybrid, multi-cue fusion framework that combines pixel-wise deep learning-based semantic segmentation with geometric modeling based on stereo vision.
Methodological Framework
The proposed architecture consists of two main components: a deep learning module and a geometric modeling module, each operating on distinct input cues. The deep learning module leverages a fully convolutional network (FCN) for pixel-level semantic segmentation, trained on both common and rare obstacles. The geometric module operates on dense disparity maps computed from stereo camera pairs, employing an obstacle hypothesis generation routine grounded in geometric constraints, such as height above ground and object size.
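To make the geometric constraint concrete, the sketch below flags pixels whose triangulated 3D position lies measurably above an assumed flat ground plane. This is a minimal illustration, not the paper's actual hypothesis-generation routine: the camera parameters (`f_px`, `baseline_m`, `cam_height_m`), the flat-ground and horizontal-camera assumptions, and the 50 m range cap are all placeholders chosen for clarity.

```python
import numpy as np

def obstacle_hypotheses(disparity, f_px=1000.0, baseline_m=0.2,
                        cam_height_m=1.2, min_height_m=0.05):
    """Flag pixels whose height above an assumed flat ground plane
    exceeds min_height_m. Illustrative sketch only; parameters are
    hypothetical, not taken from the paper.

    disparity: (H, W) array in pixels; zero marks invalid matches.
    """
    valid = disparity > 0
    depth = np.zeros_like(disparity, dtype=float)
    # Stereo triangulation: Z = f * B / d.
    depth[valid] = f_px * baseline_m / disparity[valid]

    h, w = disparity.shape
    # Vertical image coordinate relative to the principal point
    # (down positive), assuming the principal point at image center.
    v = np.arange(h).reshape(-1, 1) - h / 2.0
    # For a horizontally mounted camera at height H above the ground,
    # a point projecting to row v at depth Z sits v*Z/f below the
    # optical axis, i.e. H - v*Z/f above the ground.
    height_above_ground = cam_height_m - v * depth / f_px
    return valid & (height_above_ground > min_height_m) & (depth < 50.0)
```

A real system would estimate the ground plane from the disparity data itself (e.g. v-disparity analysis) rather than assume it; the thresholding structure, however, mirrors the height-above-ground constraint described above.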
The fusion mechanism is implemented at the detection level by integrating outputs from both modules using Bayesian reasoning. This approach ensures robust sensitivity to both visually and geometrically salient cues, while substantially reducing false positive rates commonly associated with each standalone technique. Detection scores are further refined by spatial clustering and post-processing routines to ensure actionable obstacle localization and precise bounding box proposals.
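The detection-level Bayesian integration can be sketched as a per-pixel combination of the two modules' obstacle probabilities in log-odds space, under a conditional-independence assumption between the semantic and geometric cues. The function names and the prior value are illustrative assumptions, not details from the paper.

```python
import numpy as np

def logit(p, eps=1e-6):
    """Log-odds transform, clipped away from 0 and 1 for stability."""
    p = np.clip(p, eps, 1.0 - eps)
    return np.log(p / (1.0 - p))

def fuse_bayes(p_semantic, p_geometric, prior=0.1):
    """Fuse two obstacle probabilities assuming the cues are
    conditionally independent given the obstacle label:
        logit(posterior) = logit(p_sem) + logit(p_geo) - logit(prior)
    Works elementwise on scalars or equal-shaped arrays."""
    l = logit(p_semantic) + logit(p_geometric) - logit(prior)
    return 1.0 / (1.0 + np.exp(-l))
```

Two weakly confident but agreeing cues reinforce each other (e.g. `fuse_bayes(0.9, 0.9, prior=0.5)` exceeds either input), while disagreeing cues cancel toward the prior, which is the behavior that suppresses false positives of either standalone module.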
Empirical Results
Quantitative evaluation on the Lost and Found dataset demonstrates the efficacy of the fused model. The approach achieves superior recall rates for small and unusual obstacles compared to state-of-the-art baselines, with strong improvements in precision attributable to the fusion of geometric cues. The hybrid model achieves 30% higher recall on rare obstacles relative to the geometric-only baseline, while maintaining a low false positive rate. The authors provide thorough ablation studies, highlighting the relative contributions of each module and the impact of fusion strategies on overall system resilience in challenging environmental conditions. Key claims include the capability to detect objects as small as 5 cm at distances of up to 50 m, a significant improvement over pure semantic or geometric approaches.
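A back-of-envelope pinhole-model calculation illustrates why the 5 cm-at-50 m regime is so demanding; the 2000 px focal length below is an assumed value for illustration, not a figure from the paper.

```python
def pixels_subtended(object_height_m, distance_m, focal_px):
    """Approximate image height in pixels of an object under the
    pinhole camera model: h_px = f * h / Z."""
    return focal_px * object_height_m / distance_m

# With an assumed focal length of 2000 px, a 5 cm object at 50 m
# spans only about 2 pixels vertically:
print(pixels_subtended(0.05, 50.0, 2000.0))  # -> 2.0
```

At that scale an obstacle occupies only a handful of pixels, which is precisely where single-cue methods degrade and where fusing semantic and geometric evidence pays off.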
Theoretical and Practical Implications
The methodological synthesis of semantic and geometric information marks a substantive departure from purely appearance-based or purely geometric obstacle detection for ADAS and autonomous vehicles. The system is robust to edge cases encountered in real-world deployments, including debris, lost cargo, and sensor noise. The fusion pipeline also generalizes to unseen obstacle categories thanks to the geometric modeling, while semantic segmentation enhances discrimination against distractors.
Practically, the findings validate the feasibility of real-time implementation on automotive-grade hardware, opening pathways for broader adoption in production vehicles. The approach has direct implications for improving functional safety measures, reducing accident risk, and enabling vehicles to operate reliably in less structured environments.
Theoretically, the research motivates further exploration of multi-modal fusion techniques, promoting future work in Bayesian integration schemes, uncertainty quantification, and domain adaptation for rare obstacle categories. The results suggest that expansion beyond common traffic scenario taxonomies is critical for true zero-shot perception in autonomous systems.
Future Directions
The paper suggests avenues for future development, including enhancing semantic segmentation models through active learning on edge cases, incorporating sensor modalities such as LiDAR, and further optimizing fusion mechanisms for temporal consistency and sequential data integration. The introduction of more complex environments, increased dataset diversity, and exploration of additional Bayesian fusion strategies remain open research directions.
Conclusion
This work establishes a robust fusion-based framework for unexpected obstacle detection in autonomous driving scenarios, leveraging complementary strengths of deep learning and geometric modeling. The high empirical performance underscores the importance of multi-cue integration for safety-critical perception tasks. The implications are far-reaching for both theoretical advances in sensor fusion and practical deployment in autonomous vehicles, with promising directions for future research in scalable, generalizable obstacle detection pipelines.