This paper presents a systematic survey of testing techniques for deep learning (DL) libraries. It first introduces the workflow of DL libraries and defines DL library bugs and DL library testing. It then categorizes existing research along the three components of the DL library stack: DL framework testing, DL compiler testing, and DL hardware library testing.
For DL framework testing, the paper summarizes empirical studies that have analyzed the characteristics and root causes of DL framework bugs. It then discusses the differential testing, fuzz testing, and metamorphic testing methods that have been proposed to detect bugs in DL frameworks such as TensorFlow and PyTorch. These methods focus on discovering status bugs (e.g., crashes), numerical bugs, and performance bugs; a differential-testing sketch follows below.
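To make the differential-testing idea concrete, here is a minimal sketch (not taken from any specific surveyed method): the same operator is run on identical inputs in TensorFlow and PyTorch, and any disagreement beyond a tolerance is flagged as a candidate numerical bug. The choice of operator (softmax), the tolerances, and the trial count are illustrative assumptions.

```python
# Minimal cross-library differential-testing sketch (illustrative assumptions:
# operator choice, tolerances, trial count).
import numpy as np
import tensorflow as tf
import torch

def differential_test_softmax(rtol=1e-5, atol=1e-6, trials=100):
    rng = np.random.default_rng(0)
    for _ in range(trials):
        x = rng.standard_normal((4, 10)).astype(np.float32)
        # Same input, two independent implementations of the same operator.
        y_tf = tf.nn.softmax(tf.constant(x), axis=-1).numpy()
        y_pt = torch.softmax(torch.from_numpy(x), dim=-1).numpy()
        if not np.allclose(y_tf, y_pt, rtol=rtol, atol=atol):
            # A disagreement is a candidate numerical bug in one library.
            return x, y_tf, y_pt
    return None

result = differential_test_softmax()
print("disagreement found" if result else "all trials consistent")
```

A tolerance is used instead of exact equality because legitimate floating-point divergence between independent implementations must be distinguished from genuine numerical bugs.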
For DL compiler testing, the paper highlights that existing methods focus on detecting optimization bugs that can cause semantic changes during the compilation process. These methods often target the model loading, high-level IR transformation, and low-level IR transformation stages of DL compilers.
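As an illustration of the semantics-preservation oracle that DL compiler testing relies on, the following hedged sketch compares eager execution against compiled execution of the same model. Here torch.compile stands in for a DL compiler; the model, input shapes, and tolerances are assumptions rather than details from the paper.

```python
# Sketch of a compiler-testing oracle: optimization must preserve semantics,
# so compiled output is checked against eager (uncompiled) execution.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 8),
).eval()

compiled = torch.compile(model)  # invokes the TorchDynamo/Inductor compiler

x = torch.randn(4, 16)
with torch.no_grad():
    y_eager = model(x)        # reference: uncompiled execution
    y_compiled = compiled(x)  # after high-/low-level IR transformations

# Any semantic change introduced by an optimization pass shows up here.
assert torch.allclose(y_eager, y_compiled, rtol=1e-4, atol=1e-5), \
    "candidate optimization bug: compiled output diverges from eager output"
print("compiled output matches eager execution")
```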
For DL hardware library testing, the paper notes that existing research mainly validates the functionality of DL hardware libraries using metamorphic testing and test pattern generation, as it is challenging to generate valid test inputs and construct test oracles for these low-level libraries.
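A minimal metamorphic-testing sketch in this spirit, using a PyTorch convolution as a stand-in for a low-level hardware kernel (the relation, shapes, and tolerance are my assumptions): permuting the batch dimension of the input must permute the kernel's output in exactly the same way, which yields an oracle without any ground-truth reference.

```python
# Metamorphic-testing sketch: check a relation between outputs instead of
# comparing against a ground-truth oracle (illustrative shapes/tolerance).
import torch
import torch.nn.functional as F

x = torch.randn(8, 3, 16, 16)   # batch of 8 inputs
w = torch.randn(4, 3, 3, 3)     # convolution weights
perm = torch.randperm(8)        # a random batch permutation

y = F.conv2d(x, w, padding=1)             # kernel on original batch
y_perm = F.conv2d(x[perm], w, padding=1)  # kernel on permuted batch

# Metamorphic relation: conv2d(x[perm]) == conv2d(x)[perm]
assert torch.allclose(y[perm], y_perm, rtol=1e-5, atol=1e-6), \
    "metamorphic relation violated: candidate kernel bug"
print("metamorphic relation holds")
```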
The paper also discusses the differences between DL library testing and DL model testing, and outlines the main challenges and future research directions in DL library testing, such as the need for more comprehensive testing of security-related properties and the development of general and systematic testing methods.
Source: Xiaoyu Zhang et al., arXiv, 04-30-2024. https://arxiv.org/pdf/2404.17871.pdf