Automatic Vision-Based Parking Slot Detection and Occupancy Classification

Published 16 Aug 2023 in cs.CV | (2308.08192v1)

Abstract: Parking guidance information (PGI) systems are used to provide information to drivers about the nearest parking lots and the number of vacant parking slots. Recently, vision-based solutions started to appear as a cost-effective alternative to standard PGI systems based on hardware sensors mounted on each parking slot. Vision-based systems provide information about parking occupancy based on images taken by a camera that is recording a parking lot. However, such systems are challenging to develop due to various possible viewpoints, weather conditions, and object occlusions. Most notably, they require manual labeling of parking slot locations in the input image which is sensitive to camera angle change, replacement, or maintenance. In this paper, the algorithm that performs Automatic Parking Slot Detection and Occupancy Classification (APSD-OC) solely on input images is proposed. Automatic parking slot detection is based on vehicle detections in a series of parking lot images upon which clustering is applied in bird's eye view to detect parking slots. Once the parking slots positions are determined in the input image, each detected parking slot is classified as occupied or vacant using a specifically trained ResNet34 deep classifier. The proposed approach is extensively evaluated on well-known publicly available datasets (PKLot and CNRPark+EXT), showing high efficiency in parking slot detection and robustness to the presence of illegal parking or passing vehicles. Trained classifier achieves high accuracy in parking slot occupancy classification.

Summary

The paper introduces APSD-OC, a fully automatic system that detects parking slots using YOLOv5 and DBSCAN without manual annotation.
It applies perspective normalization to transform vehicle detections, ensuring robust slot localization even under occlusions and varying weather conditions.
The ResNet34 classifier achieves over 99% accuracy on PKLot and CNRPark+EXT, demonstrating strong generalization and efficiency.

Introduction

The paper presents APSD-OC, a fully automatic vision-based algorithm for parking slot detection and occupancy classification. The approach eliminates the need for manual slot annotation, a major bottleneck in deploying scalable parking guidance information (PGI) systems. APSD-OC combines vehicle detection across temporal image sequences, perspective normalization, density-based clustering, and a deep classifier for robust slot localization and occupancy inference. The method is evaluated on PKLot and CNRPark+EXT, two public datasets with diverse weather, occlusion, and viewpoint conditions, where it shows high precision, recall, and generalization.

Figure 1: An example of a parking lot image from the PKLot dataset. Properly parked vehicles are marked with blue bounding boxes.

System Architecture

The APSD-OC pipeline consists of two main stages: (1) automatic parking slot detection and (2) slot occupancy classification. The detection stage processes a sequence of images from a fixed camera, applies vehicle detection (YOLOv5), transforms detections to a bird's eye view, and clusters detection centers using DBSCAN to infer slot locations. The occupancy classification stage crops detected slot regions and classifies them as occupied or vacant using a fine-tuned ResNet34.

Figure 2: The block diagram of the proposed APSD-OC algorithm.

Vehicle Detection

YOLOv5x, pretrained on COCO, is used for vehicle detection, focusing on "car" and "truck" classes with a confidence threshold of 0.5. Images are resized to $1280 \times 1280$ px. The detector outputs bounding boxes (BBs) for each vehicle, and their centers are extracted for further processing.

Figure 3: Vehicle detection in a single input image.
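The filtering and center-extraction step described above can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes the detector output has already been converted to rows of `[x1, y1, x2, y2, confidence, class_id]`, and the function name `vehicle_centers` is hypothetical. The class indices 2 and 7 are the COCO indices for "car" and "truck" used by the pretrained YOLOv5 models.

```python
import numpy as np

# COCO class indices for "car" and "truck" in the pretrained YOLOv5 models.
CAR, TRUCK = 2, 7

def vehicle_centers(detections, conf_threshold=0.5, classes=(CAR, TRUCK)):
    """Return a (K, 2) array of bounding-box centers for confident vehicle detections.

    `detections` holds rows [x1, y1, x2, y2, confidence, class_id]; this
    layout is an assumption made for the sketch.
    """
    det = np.asarray(detections, dtype=float).reshape(-1, 6)
    # Keep only the requested classes at or above the confidence threshold.
    keep = (det[:, 4] >= conf_threshold) & np.isin(det[:, 5], classes)
    boxes = det[keep, :4]
    cx = (boxes[:, 0] + boxes[:, 2]) / 2.0
    cy = (boxes[:, 1] + boxes[:, 3]) / 2.0
    return np.column_stack([cx, cy])

# Made-up detections for illustration only.
example = [
    [100, 200, 180, 260, 0.91, 2],  # car, kept
    [300, 210, 420, 300, 0.75, 7],  # truck, kept
    [500, 220, 560, 270, 0.32, 2],  # car below the threshold, dropped
    [50, 50, 70, 110, 0.88, 0],     # person, dropped
]
centers = vehicle_centers(example)
print(centers)  # two centers: (140, 230) and (360, 255)
```

Running this over every image in the sequence and stacking the results gives the point set that the later stages transform and cluster.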

Perspective Transformation

To address perspective distortion, BB centers are mapped to a bird's eye view using a homography matrix $\mathbf{H}$, estimated via a CNN regressor trained on synthetic data (following Abbas et al., 2019). This normalization ensures uniform cluster density across the parking lot, facilitating robust clustering.

Figure 4: Transformed BBs centers of vehicle detections for N input images.
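Applying a homography to 2D points amounts to lifting them to homogeneous coordinates, multiplying by $\mathbf{H}$, and dividing by the third component. The sketch below shows that operation and the inverse mapping back to image coordinates. The matrix values are made up for illustration; in the paper $\mathbf{H}$ comes from the CNN regressor, which is not reproduced here, and the helper name `apply_homography` is hypothetical.

```python
import numpy as np

def apply_homography(H, points):
    """Map (K, 2) points through a 3x3 homography H."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    # Lift to homogeneous coordinates [x, y, 1].
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ np.asarray(H, dtype=float).T
    # Divide by the third coordinate to return to the 2D plane.
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical homography; a real one would be estimated per camera.
H = np.array([
    [1.0, 0.2,   0.0],
    [0.0, 1.5,   0.0],
    [0.0, 0.001, 1.0],
])

centers = np.array([[140.0, 230.0], [360.0, 255.0]])
bird_eye = apply_homography(H, centers)

# The inverse homography maps bird's eye points back to the image.
restored = apply_homography(np.linalg.inv(H), bird_eye)
print(bird_eye)
print(np.allclose(restored, centers))
```

The same inverse mapping is what allows positions found in the bird's eye view to be projected back into the original image.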

Clustering and Slot Localization

DBSCAN is applied to the transformed BB centers to identify high-density regions corresponding to parking slots. Clusters with high intra-cluster variance are filtered out, as they typically correspond to illegal parking or transient vehicle presence. The number of slots to retain is a user-supplied parameter, easily obtainable from the scene.

Figure 5: Filtered cluster centers $\mathbf{m}_i$ corresponding to detected parking slots.

Cluster centers are then mapped back to the original image coordinates, and a bounding box is defined for each slot from the mean of the vehicle BBs in its cluster.
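One possible reading of the variance-based filtering is sketched below: given cluster labels for the transformed centers (for example, as returned by scikit-learn's `DBSCAN.fit_predict`, where -1 marks noise), keep the `num_slots` clusters with the smallest spread. Using the sum of per-axis variances as the spread measure is an assumption of this sketch, as are the function name `select_slot_centers` and the toy data; the paper's exact criterion may differ.

```python
import numpy as np

def select_slot_centers(points, labels, num_slots):
    """Keep the `num_slots` tightest clusters and return their centers and labels."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    candidates = []
    for label in np.unique(labels):
        if label == -1:  # noise points belong to no cluster
            continue
        members = points[labels == label]
        center = members.mean(axis=0)
        spread = members.var(axis=0).sum()  # total variance (assumed measure)
        candidates.append((spread, int(label), center))
    # Tight clusters first; loose ones (e.g. passing vehicles) fall to the end.
    candidates.sort(key=lambda c: c[0])
    kept = candidates[:num_slots]
    return np.array([c[2] for c in kept]), [c[1] for c in kept]

# Toy data: two tight clusters, one loose cluster, one noise point.
points = [
    [10.0, 10.0], [10.2, 9.8], [9.8, 10.2],   # cluster 0, tight
    [20.0, 10.0], [20.1, 10.1], [19.9, 9.9],  # cluster 1, tight
    [30.0, 10.0], [34.0, 14.0], [26.0, 6.0],  # cluster 2, loose
    [50.0, 50.0],                             # noise
]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2, -1]

slot_centers, kept_labels = select_slot_centers(points, labels, num_slots=2)
print(kept_labels)
print(slot_centers)
```

With `num_slots=2`, the loose cluster is discarded and the two tight clusters remain as slot candidates.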

Occupancy Classification

Each detected slot region is cropped and classified as occupied or vacant using a ResNet34 model, pretrained on ImageNet and fine-tuned on PKLot and CNRPark+EXT. The classifier head is replaced and trained with the Adam optimizer under a 1cycle learning rate policy (linear warmup followed by cosine annealing) with cyclical momentum. The approach follows the Amato split for training/testing to ensure comparability with prior work.
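To make the learning rate policy concrete, the following sketch computes a one-cycle schedule with linear warmup followed by cosine annealing. It is illustrative only: the function name and all hyperparameter values (`max_lr`, `warmup_steps`, `start_div`, `final_div`) are assumptions and are not taken from the paper, and the cyclical momentum component is omitted.

```python
import math

def one_cycle_lr(step, total_steps, warmup_steps, max_lr,
                 start_div=25.0, final_div=1e4):
    """Learning rate at `step` for a warmup-then-cosine one-cycle schedule."""
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    if not 0 < warmup_steps < total_steps:
        raise ValueError("warmup_steps must be in (0, total_steps)")
    start_lr = max_lr / start_div
    final_lr = max_lr / final_div
    if step < warmup_steps:
        # Linear warmup from start_lr towards max_lr.
        return start_lr + (max_lr - start_lr) * step / warmup_steps
    # Cosine annealing from max_lr down to final_lr at the last step.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps - 1)
    return final_lr + 0.5 * (max_lr - final_lr) * (1.0 + math.cos(math.pi * progress))

# Hypothetical settings: 100 steps, 30 of them warmup, peak rate 1e-3.
schedule = [one_cycle_lr(s, total_steps=100, warmup_steps=30, max_lr=1e-3)
            for s in range(100)]
print(schedule[0], max(schedule), schedule[-1])
```

The rate rises linearly to its peak at step 30 and then decays smoothly, which is the general shape that the 1cycle policy prescribes.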

Figure 6: Parking lot with all vehicles properly parked.

Figure 7: Parking lot with several properly parked vehicles.

Experimental Evaluation

Datasets

PKLot: 12,417 images, 695,900 slot annotations, three camera views, diverse weather.
CNRPark+EXT: 4,278 images, 144,965 annotations, nine cameras, challenging occlusions.

Figure 8: PKLot PUCPR sunny

Parking Slot Detection

Detection is evaluated using precision and recall, with ground truth slot counts established via manual annotation due to incomplete dataset labels. As the number of input images increases, recall improves significantly, especially for large lots (e.g., PUCPR). For UFPR05, using all images yields 97.73% precision and recall. CNRPark+EXT results show 100% precision for several cameras, with recall limited primarily by occlusions.
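For reference, precision is the fraction of detected slots that are real slots, and recall is the fraction of real slots that were detected. A minimal sketch of the computation follows; the counts are made up for illustration and are not results from the paper.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute precision and recall from detection counts."""
    detected = true_positives + false_positives  # all slots the algorithm reported
    actual = true_positives + false_negatives    # all slots in the ground truth
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

# Hypothetical counts: 45 correct slots, 5 spurious, 3 missed.
p, r = precision_recall(45, 5, 3)
print(f"precision={p:.2%}, recall={r:.2%}")  # precision=90.00%, recall=93.75%
```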

Key findings:

Precision and recall exceed 90% with sufficient temporal coverage.
Robustness to illegal parking and passing vehicles is demonstrated.
Only the number of visible slots is required as a user input.

Occupancy Classification

Classification is benchmarked against CarNet and mAlexNet. The ResNet34 classifier achieves the highest accuracy in 7 of 9 PKLot train/test splits, with AUC > 0.99 in all cases. On CNRPark+EXT, the method outperforms mAlexNet and AlexNet by a large margin, achieving >99% accuracy when trained on diverse viewpoints and weather.

Notable results:

PKLot: 99.98% accuracy (UFPR04), 99.92% (UFPR05), 99.93% (PUCPR) in intra-lot splits.
CNRPark+EXT: 99.67% accuracy, 0.9981 AUC with full training set.
Generalization: High robustness to viewpoint and weather changes; >98% accuracy in cross-condition tests.

Implementation Considerations

Computational Requirements: YOLOv5x and ResNet34 are efficient for real-time inference on modern GPUs; edge deployment is feasible with lighter YOLO variants and quantized classifiers.
Scalability: The method is camera-agnostic, requiring only the number of slots per view. No manual annotation or camera calibration is needed.
Limitations: Recall may be affected by persistent occlusions or insufficient temporal coverage. The method assumes a fixed camera position during slot detection.
Deployment: The pipeline can be integrated into existing surveillance infrastructure, providing real-time PGI with minimal operational overhead.

Theoretical and Practical Implications

The APSD-OC framework demonstrates that slot localization can be reliably inferred from vehicle detection statistics over time, obviating the need for manual slot annotation or explicit parking line detection. The use of perspective normalization and density-based clustering generalizes across diverse scenes and camera geometries. The decoupling of slot detection and occupancy classification enables modular upgrades and adaptation to new environments.

Contrary to prior claims, the results show that fully automatic slot detection is feasible and robust, even in the presence of occlusions and non-rectangular layouts, provided sufficient temporal data is available.

Future Directions

Potential extensions include:

Automatic estimation of slot count via spatial analysis of detection distributions.
Incorporation of spatial priors (e.g., slot alignment, regularity) to further improve detection in highly irregular lots.
Temporal smoothing for occupancy classification to handle transient occlusions and improve robustness.
Edge deployment with model compression and hardware acceleration for large-scale, city-wide PGI systems.
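As an illustration of the temporal smoothing idea listed above, one simple option is a majority vote over the last few per-frame predictions for a slot. This is a sketch of one possible approach, not something the paper implements; the class name `OccupancySmoother` and the window size are hypothetical.

```python
from collections import deque

class OccupancySmoother:
    """Majority vote over the most recent per-frame predictions for one slot."""

    def __init__(self, window=5):
        if window < 1:
            raise ValueError("window must be at least 1")
        self.history = deque(maxlen=window)

    def update(self, occupied):
        """Record a new prediction and return the smoothed state."""
        self.history.append(bool(occupied))
        # Occupied only on a strict majority; ties resolve to vacant.
        return 2 * sum(self.history) > len(self.history)

# A single spurious "vacant" frame (index 2) is suppressed.
raw = [True, True, False, True, True, False, False, False]
smoother = OccupancySmoother(window=3)
smoothed = [smoother.update(x) for x in raw]
print(smoothed)
```

A larger window suppresses longer occlusions but delays the response when a vehicle actually arrives or leaves.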

Conclusion

APSD-OC provides a practical, scalable, and robust solution for vision-based parking slot detection and occupancy classification. By leveraging temporal vehicle detection, perspective normalization, and deep learning, the method achieves high accuracy and generalization across challenging real-world datasets. The approach removes the need for manual annotation and is readily deployable in diverse urban environments, supporting the development of intelligent PGI systems and smart city infrastructure.