SafeVL: Driving Safety Evaluation via Meticulous Reasoning in Vision Language Models

Abstract

Safety remains a fundamental challenge in autonomous driving, and a key step toward it is a safety evaluator that can reliably identify unsafe (i.e., collision-prone) scenarios. Existing methods, however, either rely heavily on object trajectories or reason purely in language, neglecting crucial visual cues and limiting how well they generalize to unsafe events. Vision–Language Models (VLMs) have recently shown strong generalization across autonomous driving tasks, yet their application to safety evaluation remains limited by the scarcity of unsafe driving data and by insufficient instance-level visual grounding. In this work, we present SafeVL, a VLM-based safety evaluator for autonomous driving that takes video as input, produces a structured chain-of-thought reasoning trace, and outputs a final safe/unsafe decision. Our framework consists of two key components: (1) a Road-Graph Counterfactual Data Generation Engine, which synthesizes diverse counterfactual unsafe scenarios, and (2) an Object-centric Visual Reasoning Framework, which combines these synthesized unsafe scenarios with existing safe driving datasets for safety prediction. In comprehensive experiments on the Nexar real-world collision dataset, SafeVL achieves 76% accuracy in the zero-shot setting, a 20% improvement over existing models. Finally, we integrate SafeVL into an end-to-end driving policy (UniAD) as a planning trajectory filter, reducing closed-loop collision rates by 8% on the NeuroNCAP benchmark and demonstrating its practical downstream benefit for safer autonomous driving.
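
The abstract specifies the evaluator's interface only at a high level (video in; a chain-of-thought trace and a safe/unsafe label out) and its downstream use as a trajectory filter for UniAD. The sketch below is a minimal Python illustration of that interface, assuming hypothetical names throughout: SafetyVerdict, filter_trajectories, render_rollout, and the dummy evaluator are illustrative placeholders, not the released SafeVL/UniAD API.

```python
"""Illustrative sketch: a video-level safety evaluator used to filter
planner trajectories. All names here are hypothetical; the actual
SafeVL/UniAD interfaces are not specified in the abstract."""

from dataclasses import dataclass
from typing import Callable, List, Sequence

# A "clip" is abstracted as a sequence of frames; frames can be any image type.
Frame = object
Clip = Sequence[Frame]


@dataclass
class SafetyVerdict:
    """Structured output: chain-of-thought steps plus a final decision."""
    reasoning: List[str]  # e.g., per-object observations and risk assessments
    unsafe: bool          # True if the clip is judged collision-prone


# A safety evaluator maps a video clip to a verdict (SafeVL plays this role).
SafetyEvaluator = Callable[[Clip], SafetyVerdict]


def filter_trajectories(
    candidates: List[object],
    render_rollout: Callable[[object], Clip],
    evaluator: SafetyEvaluator,
) -> List[object]:
    """Keep only candidate trajectories whose rendered rollout the evaluator
    deems safe; fall back to the top-ranked plan if every candidate is flagged."""
    safe = [t for t in candidates if not evaluator(render_rollout(t)).unsafe]
    return safe if safe else candidates[:1]


if __name__ == "__main__":
    # Toy demo: a dummy evaluator that flags clips containing a "collision" frame.
    def dummy_evaluator(clip: Clip) -> SafetyVerdict:
        hit = any(f == "collision" for f in clip)
        steps = ["scanned frames for contact events",
                 "contact detected" if hit else "no contact detected"]
        return SafetyVerdict(reasoning=steps, unsafe=hit)

    rollouts = {"swerve": ["clear", "clear"], "brake-late": ["clear", "collision"]}
    kept = filter_trajectories(
        candidates=list(rollouts),
        render_rollout=lambda t: rollouts[t],
        evaluator=dummy_evaluator,
    )
    print(kept)  # ['swerve']
```

The fall-back to the top-ranked plan when all candidates are flagged is one plausible design choice for keeping the policy actionable; the paper's actual rejection strategy is not described in the abstract.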

Publication
Submitted
