
BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

Published 12 May 2018 in cs.CV (arXiv:1805.04687v2)

Abstract: Datasets drive vision progress, yet existing driving datasets are impoverished in terms of visual content and supported tasks to study multitask learning for autonomous driving. Researchers are usually constrained to study a small set of problems on one dataset, while real-world computer vision applications require performing tasks of various complexities. We construct BDD100K, the largest driving video dataset with 100K videos and 10 tasks to evaluate the exciting progress of image recognition algorithms on autonomous driving. The dataset possesses geographic, environmental, and weather diversity, which is useful for training models that are less likely to be surprised by new conditions. Based on this diverse dataset, we build a benchmark for heterogeneous multitask learning and study how to solve the tasks together. Our experiments show that special training strategies are needed for existing models to perform such heterogeneous tasks. BDD100K opens the door for future studies in this important venue.

Citations (1,854)

Summary

  • The paper presents BDD100K, a large-scale, diverse driving dataset annotated for ten tasks to benchmark and enhance autonomous driving models.
  • The study demonstrates that integrating tasks through multitask learning, including segmentation and tracking, significantly improves model performance over single-task approaches.
  • The dataset’s extensive diversity in weather, lighting, and geographic conditions facilitates robust domain adaptation and real-world autonomous driving applications.

BDD100K: A Benchmark for Heterogeneous Multitask Learning in Autonomous Driving

Introduction

The field of computer vision has witnessed significant advancements, primarily driven by large-scale annotated datasets such as ImageNet and COCO. However, existing driving datasets are insufficient in supporting the multifaceted needs of autonomous driving. Researchers often face limitations due to the lack of diverse and rich datasets, thus constraining the exploration of complex multitask learning paradigms.

The BDD100K Dataset

BDD100K aims to bridge these gaps by offering a comprehensive annotated driving video dataset accompanied by extensive benchmarks for ten different tasks. The dataset includes more than 100,000 video clips, covering diverse scenarios including various weather conditions, geographical locations, and times of the day.

Data Collection and Annotation

Data was collected through crowd-sourcing facilitated by Nexar, capturing diverse driving conditions across multiple US cities. The dataset is annotated for multiple tasks, including image tagging, lane detection, drivable area segmentation, object detection, semantic segmentation, multiple object tracking (MOT), and multiple object tracking and segmentation (MOTS).

Benchmarks and Experimental Evaluations

Image Tagging

The dataset includes image-level annotations for weather, scene, and time of day, enabling robust domain adaptation and transfer learning studies. Initial experiments using a DLA-34 classifier yielded average accuracies of roughly 50-60%, reflecting the dataset's diversity and difficulty.
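The reported average accuracy across tags can be reproduced in outline by scoring each attribute (weather, scene, time of day) independently and averaging. A minimal stdlib sketch, with hypothetical prediction/label dictionaries standing in for real model output:

```python
from collections import defaultdict

def tagging_accuracy(predictions, labels):
    """Per-attribute classification accuracy for image-level tags.

    predictions/labels: lists of dicts mapping an attribute name
    (e.g. "weather", "timeofday") to a class string.
    Returns a dict of per-attribute accuracies.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, gt in zip(predictions, labels):
        for attr, value in gt.items():
            total[attr] += 1
            if pred.get(attr) == value:
                correct[attr] += 1
    return {attr: correct[attr] / total[attr] for attr in total}

# Hypothetical outputs for two images:
preds = [{"weather": "clear", "timeofday": "night"},
         {"weather": "rainy", "timeofday": "daytime"}]
gts   = [{"weather": "clear", "timeofday": "daytime"},
         {"weather": "rainy", "timeofday": "daytime"}]
acc = tagging_accuracy(preds, gts)
# weather correct on 2/2 images, timeofday on 1/2
```

Averaging the per-attribute numbers gives the single headline figure quoted above.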

Lane Detection and Drivable Area Segmentation

BDD100K offers detailed lane marking annotations and drivable area segmentations. Baseline experiments show that models can extrapolate drivable areas even in the absence of clear lane markings. Lane marking evaluation, including attributes such as continuity and direction, improves when jointly trained with drivable area segmentation, particularly for smaller training sets.
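Drivable-area predictions are typically scored with mask intersection-over-union. A minimal stdlib sketch over binary masks (the official benchmark's evaluation code is more involved, e.g. per-class aggregation):

```python
def mask_iou(pred, gt):
    """Intersection-over-union of two binary masks (nested lists of 0/1)."""
    inter = sum(p & g for row_p, row_g in zip(pred, gt)
                for p, g in zip(row_p, row_g))
    union = sum(p | g for row_p, row_g in zip(pred, gt)
                for p, g in zip(row_p, row_g))
    # Both masks empty counts as a perfect match.
    return inter / union if union else 1.0

# Toy 2x3 masks: intersection = 2 pixels, union = 4 pixels
pred = [[1, 1, 0],
        [1, 0, 0]]
gt   = [[1, 1, 1],
        [0, 0, 0]]
iou = mask_iou(pred, gt)  # 2 / 4 = 0.5
```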

Object Detection and Semantic Segmentation

For object detection, Faster R-CNN models trained on domain-specific subsets showed significant performance discrepancies, particularly between city and non-city scenes and between daytime and nighttime. Reasonable mean IoUs for semantic segmentation show that models such as DRN-D benefit from the dataset's diversity. Domain differences from existing datasets such as Cityscapes were evident, highlighting the complementary nature of BDD100K.
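Detection benchmarks like this one rest on IoU-based matching between predicted and ground-truth boxes. The sketch below shows a simplified greedy matcher at an IoU threshold of 0.5; it is a stand-in for full average-precision computation, not the benchmark's actual evaluation code:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    def area(box):
        return (box[2] - box[0]) * (box[3] - box[1])

    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def match_detections(preds, gts, thresh=0.5):
    """Greedily match predictions (highest score first) to unused GT boxes.
    Returns (true_positives, false_positives, false_negatives)."""
    used = set()
    tp = 0
    for p in sorted(preds, key=lambda p: -p["score"]):
        best, best_iou = None, thresh
        for i, g in enumerate(gts):
            if i in used:
                continue
            iou = box_iou(p["box"], g)
            if iou >= best_iou:
                best, best_iou = i, iou
        if best is not None:
            used.add(best)
            tp += 1
    return tp, len(preds) - tp, len(gts) - tp

# Hypothetical example: one correct detection, one spurious one
preds = [{"box": (0, 0, 10, 10), "score": 0.9},
         {"box": (50, 50, 60, 60), "score": 0.8}]
gts = [(0, 0, 10, 10)]
tp, fp, fn = match_detections(preds, gts)  # 1 TP, 1 FP, 0 FN
```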

Multiple Object Tracking (MOT) and Multiple Object Tracking and Segmentation (MOTS)

The tracking benchmark is notable for its scale, featuring over 3 million bounding boxes. The MOT sequences pose heavy occlusion and frequent re-identification challenges for baseline trackers. For MOTS, which combines detection bounding boxes with instance segmentation, performance improved significantly when leveraging annotations from simpler tasks.
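Tracking quality on benchmarks like this is commonly summarized by MOTA, which folds misses, false positives, and identity switches into one number. A minimal sketch of the standard CLEAR MOT formula, with made-up per-frame error counts:

```python
def mota(frames, num_gt_total):
    """MOTA = 1 - (FN + FP + ID switches) / total ground-truth objects.

    frames: list of (misses, false_positives, id_switches) per frame.
    """
    fn = sum(f[0] for f in frames)
    fp = sum(f[1] for f in frames)
    idsw = sum(f[2] for f in frames)
    return 1.0 - (fn + fp + idsw) / num_gt_total

# Two hypothetical frames with 20 ground-truth objects in total:
frames = [(1, 0, 0),   # one miss
          (0, 1, 1)]   # one false positive, one identity switch
score = mota(frames, num_gt_total=20)  # 1 - 3/20 = 0.85
```

MOTSA, used for the MOTS task, follows the same structure but counts mask-level matches instead of box-level ones.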

Multitask Learning Insights

The dataset facilitates multitask learning across homogeneous, cascaded, and heterogeneous settings:

  1. Homogeneous Multitask Learning: Joint training of lane marking and drivable area segmentation showed mutual benefits when training with smaller datasets, indicating the potential of multitask frameworks.
  2. Cascaded Multitask Learning: Significant improvements were observed in complex tasks like instance segmentation and multiple object tracking when trained jointly with simpler tasks like object detection.
  3. Heterogeneous Multitask Learning: The ultimate goal of integrating diverse tasks into a single model was explored through cascading and fine-tuning from pre-trained models. Notably, combining detection, instance segmentation, and tracking improved tracking-and-segmentation performance as measured by MOTSA.
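One common way to realize the joint training described above is to alternate batches from task-specific data loaders so that every update touches a shared backbone. The sketch below is a toy round-robin schedule under that assumption; the step functions are hypothetical placeholders, where a real implementation would run a forward/backward pass through a shared backbone and a task head:

```python
import itertools

# Hypothetical per-task training steps (placeholders for real model updates).
def detection_step(batch):
    return {"task": "det", "loss": 0.0}

def segmentation_step(batch):
    return {"task": "seg", "loss": 0.0}

def tracking_step(batch):
    return {"task": "track", "loss": 0.0}

def train_heterogeneous(loaders, steps):
    """Round-robin over task-specific loaders: each iteration draws a batch
    from the next task and applies that task's training step."""
    schedule = itertools.cycle(loaders.items())
    history = []
    for _ in range(steps):
        task, (loader, step_fn) = next(schedule)
        history.append(step_fn(next(loader)))
    return history

loaders = {
    "det":   (itertools.cycle([0]), detection_step),    # dummy batches
    "seg":   (itertools.cycle([0]), segmentation_step),
    "track": (itertools.cycle([0]), tracking_step),
}
log = train_heterogeneous(loaders, steps=6)
# With 3 tasks and 6 steps, each task is visited twice.
```

In practice the schedule can also be weighted (e.g. more detection batches than tracking batches) to reflect dataset sizes, which is one of the training-strategy choices the paper highlights.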

Conclusion

BDD100K offers a rich, diverse dataset that significantly advances research in autonomous driving and heterogeneous multitask learning. It provides an invaluable resource for developing and benchmarking algorithms that can generalize well across diverse driving scenarios. Future developments may include exploring annotation strategies and enhancing dataset diversity to cover a broader range of driving conditions. This dataset stands as a testament to the importance of comprehensive data in progressing towards fully autonomous vehicles, capable of handling complex real-world tasks.
