System-Level Safety Monitoring and Recovery for Perception Failures in Autonomous Vehicles

Kaustav Chakraborty, Zeyuan Feng, Sushant Veer, Apoorva Sharma, Boris Ivanovic, Marco Pavone, Somil Bansal

September 2024


Abstract

The safety-critical nature of autonomous vehicle (AV) operation necessitates the development of task-relevant algorithms that can reason about safety at the system level and not just at the component level. To reason about the impact of a perception failure on overall system performance, such task-relevant algorithms must contend with several challenges: the complexity of AV stacks, high uncertainty in the operating environment, and the need for real-time performance. To overcome these challenges, we introduce a Q-network called SPARQ (short for Safety evaluation for Perception And Recovery Q-network) that evaluates the safety of a plan generated by a planning algorithm, accounting for perception failures that the planning process may have overlooked. This Q-network can be queried at runtime to assess whether a proposed plan is safe to execute or poses potential safety risks. If a plan is deemed unsafe, SPARQ can proactively trigger a recovery action to prevent safety violations, e.g., by triggering the AV to execute a fallback safe policy. We validate SPARQ’s ability to improve safety compared to baselines across two simulators, including closed-loop settings involving complex multi-agent interactions: (i) CARLA, an urban autonomous driving simulator, and (ii) NVIDIA Isaac Sim, a photo-realistic simulator for autonomous systems. We demonstrate that integrating SPARQ with a planner improves safety by 5x while incurring only a 10% reduction in the planner’s performance, providing a favorable trade-off between safety and performance. We further illustrate the generalization capabilities of SPARQ to real-world scenarios by showing that SPARQ trained entirely in simulation achieves a 10% increase in safety when deployed on real-world data from the nuPlan-Vegas dataset. SPARQ represents a step towards developing task-relevant safety algorithms that can unlock the full potential of AVs.
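
The runtime query described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function names, the plan representation (a list of target speeds), the threshold, the sign convention (higher score means safer), and the toy scoring rule standing in for the learned Q-network are all assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representations, chosen only for this sketch.
Observation = Dict[str, float]   # e.g. {"free_gap_m": 30.0}
Plan = List[float]               # target speeds in m/s over a short horizon


def monitor_plan(
    q_value_fn: Callable[[Observation, Plan], float],
    observation: Observation,
    plan: Plan,
    fallback_plan: Plan,
    threshold: float = 0.0,
) -> Tuple[Plan, bool]:
    """Return (plan_to_execute, used_fallback).

    Assumed convention: a score below `threshold` marks the plan as unsafe,
    in which case the fallback plan is returned instead.
    """
    score = q_value_fn(observation, plan)
    if score < threshold:
        return fallback_plan, True
    return plan, False


def toy_q(observation: Observation, plan: Plan) -> float:
    """Toy stand-in for a learned safety Q-network (NOT the paper's model).

    Scores a plan as the free gap ahead minus twice the plan's peak speed,
    so faster plans need more free space to be scored as safe.
    """
    return observation["free_gap_m"] - 2.0 * max(plan)


nominal_plan = [5.0, 8.0, 10.0]   # hypothetical planner output
stop_plan = [0.0, 0.0, 0.0]       # hypothetical fallback: come to a stop

# Ample free space: the planner's plan is kept.
kept, used_fallback_far = monitor_plan(
    toy_q, {"free_gap_m": 30.0}, nominal_plan, stop_plan
)

# Little free space: the monitor switches to the fallback plan.
swapped, used_fallback_near = monitor_plan(
    toy_q, {"free_gap_m": 12.0}, nominal_plan, stop_plan
)
```

In this sketch the monitor sits between the planner and the controller and only overrides the planner when the score drops below the threshold, which mirrors the trade-off the abstract describes between safety gains and a modest loss in planner performance.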

Type

Conference paper

Publication

ICRA 2025