Exploiting Temporal Data Diversity for Detecting Safety-critical Faults in AV Compute Systems

Silent data corruption caused by random hardware faults in autonomous vehicle (AV) computational elements is a significant threat to vehicle safety. Previous research has explored design diversity, data diversity, and duplication techniques to detect such faults in other safety-critical domains. However, these techniques are challenging to use for AVs in practice because of their significant resource overhead and design complexity. We propose DiverseAV, a low-cost, data-diversity-based redundancy technique for detecting safety-critical random hardware faults in computational elements. DiverseAV introduces data diversity between the redundant agents by exploiting the temporal semantic consistency available in the AV sensor data. DiverseAV is a black-box technique that offers a plug-and-play solution: it requires no knowledge of the internals of the AI agent responsible for executing driving decisions, and it needs little to no modification to the agent itself to achieve high coverage of transient and permanent hardware faults. It is commercially viable because it avoids software modifications to agents, which are costly in terms of development and testing time. Specifically, DiverseAV distributes the sensor data between the two software agents in a round-robin manner. As a result, the sensor data for two consecutive time steps are semantically similar in terms of their worldview but significantly different at the bit level, which ensures the state and data diversity between the two agents that is necessary for detecting faults. We demonstrate DiverseAV using an open-source self-driving AI agent that controls a car in an open-source world simulator.
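The round-robin distribution described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the stateless scalar-output agents, and the fixed-threshold comparison of consecutive outputs are all assumptions made for illustration. The abstract does not say how DiverseAV compares the two agents' outputs, and a real driving agent would carry internal state.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]            # hypothetical stand-in for one sensor frame
Agent = Callable[[Frame], float]   # hypothetical agent: frame -> control value


def distribute_round_robin(frames: Sequence[Frame]) -> Tuple[List[Frame], List[Frame]]:
    """Send even-indexed frames to agent A and odd-indexed frames to agent B."""
    return list(frames[0::2]), list(frames[1::2])


def flag_divergent_steps(
    frames: Sequence[Frame],
    agent_a: Agent,
    agent_b: Agent,
    threshold: float,
) -> List[int]:
    """Return the time steps whose output differs from the previous step's
    output by more than `threshold`.

    Consecutive frames are handled by different agents, so a large jump
    between semantically similar frames suggests that one agent is faulty.
    The threshold check is an assumed detector, not the paper's.
    """
    agents = (agent_a, agent_b)
    # Frame t goes to agent t % 2, which is the round-robin assignment.
    outputs = [agents[t % 2](frame) for t, frame in enumerate(frames)]
    return [
        t for t in range(1, len(outputs))
        if abs(outputs[t] - outputs[t - 1]) > threshold
    ]


def healthy_agent(frame: Frame) -> float:
    """Toy agent: the control value is the mean of the frame."""
    return sum(frame) / len(frame)


def faulty_agent(frame: Frame) -> float:
    """Toy agent with a large constant error, standing in for a hardware fault."""
    return healthy_agent(frame) + 5.0


# Four consecutive frames that change only slightly from step to step.
FRAMES = [[0.0, 0.1], [0.01, 0.11], [0.02, 0.12], [0.03, 0.13]]
```

With two healthy agents, `flag_divergent_steps(FRAMES, healthy_agent, healthy_agent, 0.5)` flags no steps, because consecutive outputs stay close. Replacing the second agent with `faulty_agent` makes every transition between the agents exceed the threshold.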

Authors

Saurabh Jha (IBM)

Shengkun Cui (NVIDIA)

Timothy Tsai (NVIDIA)

Siva Hari

Michael B. Sullivan

Zbigniew T. Kalbarczyk (UIUC)

Steve Keckler

Ravishankar K. Iyer (UIUC)

Publication Date

Monday, June 27, 2022

Published in

International Conference on Dependable Systems and Networks (DSN)

Research Area

Autonomous Vehicles

Computer Architecture

Resilience and Safety

External Links

IEEE Digital Library

Uploaded Files

Published Manuscript (6.33 MB)

Copyright

This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org.