- The paper proposes a safety metric that combines object detection quality (via CLEAR metrics) with collision relevance and timing constraints.
- It combines detection and tracking scores through a weighted average and evaluates the result on both real-world and simulated scenarios.
- The results highlight that scene-specific factors, such as critical collision potential, significantly impact the overall safety score.
A Comprehensive Safety Metric to Evaluate Perception in Autonomous Systems
Introduction
The paper "A Comprehensive Safety Metric to Evaluate Perception in Autonomous Systems" (2512.14367) addresses the critical challenge of ensuring the safety of perception systems in autonomous vehicles. It introduces a novel safety metric that integrates multiple object-related factors such as velocity, orientation, distance, and potential collision impact, forming a unified and interpretable safety score. This metric evaluates object perception in terms of both quality and relevance, facilitating the comparison of different perception systems across various scenarios and environmental conditions.
Methodology
Safety Metric Composition
The proposed safety metric builds on three aspects of object perception: quality, relevance, and timing. Quality is evaluated using the CLEAR metrics, which measure precision (multiple object tracking precision, MOTP) and accuracy (multiple object tracking accuracy, MOTA) for object detection and tracking. These metrics are augmented by a distance-based intersection-over-union (IoU) verification that prioritizes precision for nearby objects, since these are more safety-critical.
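To make the quality component concrete, the following minimal sketch computes MOTA and MOTP from their standard CLEAR MOT definitions, with MOTP taken as the mean IoU over matched pairs. The `FrameCounts` container, the function names, and the linear `iou_threshold` schedule (including its 0.7, 0.5, and 50 m values) are assumptions made for illustration; the paper's exact distance-based IoU rule is not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameCounts:
    """Per-frame matching results (hypothetical container, not from the paper)."""
    ground_truth: int        # number of ground-truth objects in the frame
    misses: int              # ground-truth objects without a matched detection
    false_positives: int     # detections without a matched ground-truth object
    id_switches: int         # track identity changes in this frame
    match_ious: List[float]  # IoU of every matched detection/ground-truth pair


def mota(frames: List[FrameCounts]) -> float:
    """MOTA = 1 - (misses + false positives + ID switches) / ground-truth count."""
    total_gt = sum(f.ground_truth for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = sum(f.misses + f.false_positives + f.id_switches for f in frames)
    return 1.0 - errors / total_gt


def motp(frames: List[FrameCounts]) -> float:
    """MOTP as the mean IoU over all matched pairs (overlap-based variant)."""
    ious = [iou for f in frames for iou in f.match_ious]
    if not ious:
        raise ValueError("MOTP is undefined without matched pairs")
    return sum(ious) / len(ious)


def iou_threshold(distance_m: float, near: float = 0.7, far: float = 0.5,
                  max_range_m: float = 50.0) -> float:
    """Assumed schedule: stricter IoU for close objects, relaxed linearly with range."""
    frac = min(max(distance_m, 0.0) / max_range_m, 1.0)
    return near + (far - near) * frac
```

A match would then count as valid only if its IoU is at least `iou_threshold(distance_m)`, so a loosely localized box is tolerated far away but rejected close to the ego vehicle.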
Figure 1: Process overview of the individual components and their relation to one another in determining the safety metric score S. The ego vehicle is shown in black; the red circle around it indicates the safety-critical area.
Relevance is incorporated by assessing each object's collision potential via longitudinal and lateral safety distances derived from the Responsibility-Sensitive Safety (RSS) model. Objects that lie within the predefined safety distances but are not detected are flagged as critical, and their collision impact is rated on a scale reflecting potential injury severity.
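As an illustration of the relevance check, the sketch below implements the commonly cited RSS minimum longitudinal distance for two vehicles travelling in the same direction. All parameter values used in the example call, and the `is_critical` helper, are assumptions for illustration only. The paper's own parameterization and its lateral check may differ.

```python
def rss_longitudinal_distance(v_rear: float, v_front: float,
                              response_time: float, accel_max: float,
                              brake_min: float, brake_max: float) -> float:
    """Minimum safe gap in metres between a rear and a front vehicle.

    v_rear, v_front: speeds in m/s. response_time: reaction time in s.
    accel_max: worst-case acceleration of the rear vehicle during the response
    time. brake_min: braking the rear vehicle is guaranteed to apply afterwards.
    brake_max: hardest braking the front vehicle might apply. All in m/s^2.
    """
    v_after_response = v_rear + response_time * accel_max
    distance = (
        v_rear * response_time
        + 0.5 * accel_max * response_time ** 2
        + v_after_response ** 2 / (2.0 * brake_min)
        - v_front ** 2 / (2.0 * brake_max)
    )
    return max(distance, 0.0)  # the RSS distance is clamped at zero


def is_critical(gap_m: float, detected: bool, safe_distance_m: float) -> bool:
    """Hypothetical rule: an undetected object inside the safety distance is critical."""
    return (gap_m < safe_distance_m) and not detected


# Example with assumed parameters: both vehicles at 20 m/s.
d_safe = rss_longitudinal_distance(20.0, 20.0, response_time=1.0,
                                   accel_max=2.0, brake_min=4.0, brake_max=8.0)
```

With these assumed values the safety distance is 56.5 m, so an undetected vehicle 40 m ahead would be flagged as critical, while the same vehicle at 80 m would not.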
Timing is critical in real-time perception. Detection times are weighted based on proximity to safety-critical areas, and delays impacting emergency response capability are penalized in the safety score calculation.
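One way to picture the timing component is as a ratio between detection delay and the time left before an object enters the safety-critical area. The sketch below is only an assumed formulation: the function name `timing_penalty`, the time-budget definition, and the clamping to the range 0 to 1 are illustrative choices, not the paper's actual weighting.

```python
def timing_penalty(detection_delay_s: float, distance_m: float,
                   closing_speed_mps: float, safe_distance_m: float) -> float:
    """Assumed penalty in [0, 1]: the share of the available time budget lost to delay.

    The time budget is the time until the object reaches the safety distance
    at the current closing speed. A penalty of 1.0 means the delay uses up the
    whole budget, or the object is already inside the safety distance.
    """
    if detection_delay_s <= 0.0 or closing_speed_mps <= 0.0:
        return 0.0  # no delay, or the object is not approaching
    budget_s = (distance_m - safe_distance_m) / closing_speed_mps
    if budget_s <= 0.0:
        return 1.0  # already inside the safety-critical area
    return min(detection_delay_s / budget_s, 1.0)
```

Under this assumption, a 0.5 s delay for an object 60 m away, closing at 10 m/s with a 40 m safety distance, uses a quarter of the 2 s budget. The same delay for an object already inside the safety distance receives the full penalty.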
Metric Evaluation Strategy
The final comprehensive safety score S is derived as a weighted average of detection and tracking scores, accommodating flexibility to emphasize detection or tracking based on application requirements. This is represented mathematically as
$$S = w_D S_D + w_T S_T$$
where $S_D$ and $S_T$ are the detection and tracking scores, and $w_D$ and $w_T$ are their respective weights, constrained such that $w_D + w_T = 1$. This adaptability allows the metric to be tailored across varying autonomous systems and environmental scenarios.
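The weighted average can be sketched in a few lines. The function name `safety_score` and the choice to pass only the detection weight, deriving the tracking weight from the sum-to-one constraint, are assumptions made for this example.

```python
def safety_score(s_detection: float, s_tracking: float,
                 w_detection: float = 0.5) -> float:
    """Weighted average of detection and tracking scores.

    Only the detection weight is passed in; the tracking weight follows from
    the constraint that both weights sum to 1.
    """
    if not 0.0 <= w_detection <= 1.0:
        raise ValueError("w_detection must lie in [0, 1]")
    w_tracking = 1.0 - w_detection
    return w_detection * s_detection + w_tracking * s_tracking


# Equal weighting versus a detection-heavy weighting of the same two scores.
balanced = safety_score(0.8, 0.6, w_detection=0.5)
detection_heavy = safety_score(0.8, 0.6, w_detection=0.75)
```

Because the weights sum to 1, the combined score always lies between the two component scores, and shifting weight toward detection moves it toward the detection score.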
Results and Evaluation
The paper presents evaluations on both real-world (KITTI dataset) and virtual scenarios to demonstrate the metric's robustness and applicability. In the KITTI scenarios, the comprehensive safety score is consistently lower for perception frameworks that fail to detect vulnerable road users (VRUs), highlighting the importance of considering collision relevance and environmental context in safety evaluation.
For the virtual motorway scenarios, high velocities and dense traffic yield lower safety scores due to increased penalties for missed detections. Conversely, rural scenarios with fewer critical interactions result in significantly higher scores, illustrating the metric's scene-sensitive nature.
Figure 2: Schematic identification of collision-relevant objects from the KITTI raw dataset.
Figure 3: Exemplary bird's-eye-view detection on the KITTI raw dataset scene shown in Figure 2. Blue boxes represent ground truth, green boxes are correctly detected objects, and red boxes are safety-critical objects.
Conclusion
The paper proposes a safety metric that integrates detection precision, tracking capability, and collision relevance into a single measure of safety for autonomous perception systems. By accounting for context-sensitive factors such as collision severity and real-time constraints, the metric captures safety-relevant information that traditional performance indicators such as mean average precision (mAP) do not reflect. Future research aims to extend the model to broader environmental influences and cooperative perception systems, further enhancing its utility in real-time applications.