Core Concepts
A multi-simulator approach called digital siblings predicts the failures of DNN-based lane-keeping models on a digital twin better than a single simulator does.
Abstract
The paper proposes a novel approach called digital siblings (DSS) to improve simulation-based testing of autonomous driving software. The key idea is to use multiple general-purpose simulators (GPSims) collectively as an ensemble, rather than relying on a single GPSim, to better approximate the behavior of the autonomous vehicle (AV) in a high-fidelity digital twin (DT).
The authors focus on testing the lane-keeping component of an AV, implemented using deep neural networks (DNNs). They use two open-source simulators, BeamNG and Udacity, as the digital siblings, and a digital twin of a physical 1:16 scale electric AV as the ground truth.
The approach involves the following steps:
1. Training or fine-tuning the DNN lane-keeping model to run on both digital siblings and the digital twin.
2. Generating test scenarios (sequences of road points) for each digital sibling using an evolutionary search algorithm (DeepHyperion).
3. Migrating the test cases across the digital siblings and merging their outcomes to obtain a unified view (the digital siblings feature map).
4. Executing the test cases from the digital siblings feature map on the digital twin to obtain the ground truth feature map.
5. Analyzing the correlation between the digital siblings feature map and the digital twin feature map to assess how well the digital siblings predict the failures of the DNN lane-keeping model on the digital twin.
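The merging and correlation steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each feature map is a grid of per-cell failure probabilities, that merging is a cell-wise mean (the summary does not state the actual merging rule), and that agreement is measured with Pearson correlation. The function names and the 2x2 example values are hypothetical.

```python
import numpy as np


def merge_feature_maps(map_a, map_b):
    """Combine two per-cell failure-probability maps.

    Assumption: the merge is a cell-wise mean. The source does not
    specify the rule the authors use.
    """
    a = np.asarray(map_a, dtype=float)
    b = np.asarray(map_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("feature maps must share the same grid")
    return (a + b) / 2.0


def pearson_correlation(predicted_map, ground_truth_map):
    """Pearson correlation between two maps, computed cell by cell."""
    pred = np.asarray(predicted_map, dtype=float).ravel()
    truth = np.asarray(ground_truth_map, dtype=float).ravel()
    # Only compare cells that hold a value in both maps.
    filled = ~np.isnan(pred) & ~np.isnan(truth)
    return float(np.corrcoef(pred[filled], truth[filled])[0, 1])


# Hypothetical failure probabilities on a 2x2 feature map.
beamng_map = [[0.0, 0.5], [1.0, 0.0]]
udacity_map = [[0.0, 1.0], [1.0, 0.5]]
twin_map = [[0.0, 1.0], [1.0, 0.0]]  # stands in for the ground truth

siblings_map = merge_feature_maps(beamng_map, udacity_map)
correlation = pearson_correlation(siblings_map, twin_map)
print(siblings_map)
print(correlation)
```

A higher correlation for the merged map than for either single-simulator map would correspond to the ensemble being the better failure predictor.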
The empirical evaluation shows that the ensemble failure predictor built from the digital siblings outperforms each individual simulator at predicting failures on the digital twin. The authors discuss the findings and provide recommendations for researchers and software engineers interested in automated testing of autonomous driving software.
Stats
The mean squared error (MSE) between the predicted steering angle and the ground truth steering angle on the digital twin is 0.08 for the models trained on simulated images (MS) and 0.07 for the models trained on pseudo-real images (MR).
The success rate of the lane-keeping models on the digital twin is 0.69 for MS and 0.95 for MR.
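For reference, the two metrics above can be computed as in the following Python sketch. The MSE definition is the standard one. The success-rate definition (fraction of test runs completed without a failure) is an assumption, since the summary does not define it, and the sample values are hypothetical rather than taken from the paper.

```python
import numpy as np


def mean_squared_error(predicted, ground_truth):
    """Standard MSE between predicted and ground truth steering angles."""
    predicted = np.asarray(predicted, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    return float(np.mean((predicted - ground_truth) ** 2))


def success_rate(outcomes):
    """Fraction of runs that succeeded.

    Assumption: a run counts as a success when the vehicle finishes
    the road without a failure.
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("at least one test outcome is required")
    return sum(bool(o) for o in outcomes) / len(outcomes)


# Hypothetical steering angles and run outcomes.
mse = mean_squared_error([0.1, -0.2, 0.3], [0.0, -0.1, 0.5])
rate = success_rate([True, True, False, True])
print(mse, rate)
```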
Quotes
"Simulation-based testing represents an important step to ensure the reliability of autonomous driving software."
"Using a single-simulator approach for AV testing might be unreliable, as the testing results are highly dependent on the chosen GPSim."
"The ensemble failure predictor by the digital siblings is superior to each individual simulator at predicting the failures of the digital twin."