Core Idea
LTM is a scalable, black-box test suite minimization approach that uses large pre-trained language models and vector-based similarity measures to identify and remove redundant test cases efficiently while maintaining high fault detection capability.
Abstract
The paper proposes LTM, a test suite minimization approach that addresses the scalability limitations of the state-of-the-art approach (ATM) by using large pre-trained language models (LLMs) and vector-based similarity measures.
Key highlights:
LTM takes the source code of test cases as input, without requiring any preprocessing, and employs five different LLMs (CodeBERT, GraphCodeBERT, UniXcoder, StarEncoder, and CodeLlama) to generate test method embeddings.
LTM uses two vector-based similarity measures, Cosine Similarity and Euclidean Distance, to calculate the similarity between test method embeddings, which is more computationally efficient than the tree-based similarity measures used in ATM.
LTM employs a Genetic Algorithm (GA) to minimize test suites using the calculated similarity values as fitness.
LTM speeds up the GA search by using a more efficient data structure that accelerates fitness calculation and improves memory usage, leading to a 273-fold reduction in minimization time.
Experimental results on 17 Java projects with 835 versions show that the best configuration of LTM (UniXcoder/Cosine) outperforms ATM on three counts: a slightly greater saving rate of testing time (41.72% versus 40.29%, on average), a significantly higher fault detection rate (0.84 versus 0.81, on average), and minimization that is nearly five times faster on average. The gains are larger for larger test suites and systems, which makes LTM much more scalable.
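The pipeline described in the highlights above (embeddings, vector-based similarity, similarity-driven search) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the embeddings are toy 2-D vectors rather than LLM outputs, the Euclidean distance is mapped to a similarity with 1 / (1 + d), the fitness is the plain sum of pairwise similarities among retained tests, and the search is a simplified evolutionary loop (mutation and truncation selection only, no crossover) standing in for the Genetic Algorithm. The function names are hypothetical. Precomputing the similarity matrix once is one plausible way to make fitness evaluation cheap; the summary does not say which data structure LTM actually uses.

```python
import numpy as np

def cosine_similarity_matrix(emb):
    """Pairwise cosine similarity between rows of emb (n_tests x dim)."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T

def euclidean_similarity_matrix(emb):
    """Pairwise Euclidean distance d, mapped to 1 / (1 + d) (assumed mapping)."""
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return 1.0 / (1.0 + dist)

def fitness(selected, sim):
    """Sum of pairwise similarities among retained tests (lower = more diverse)."""
    idx = np.flatnonzero(selected)
    sub = sim[np.ix_(idx, idx)]
    return (sub.sum() - np.trace(sub)) / 2.0  # count each pair once, skip diagonal

def minimize(sim, budget, generations=200, pop_size=30, seed=0):
    """Toy evolutionary search: retain `budget` tests (0 < budget < n_tests)."""
    rng = np.random.default_rng(seed)
    n = sim.shape[0]

    def random_individual():
        ind = np.zeros(n, dtype=bool)
        ind[rng.choice(n, size=budget, replace=False)] = True
        return ind

    def mutate(ind):
        # Swap one retained test for one discarded test, keeping the budget fixed.
        child = ind.copy()
        out_i = rng.choice(np.flatnonzero(child))
        in_i = rng.choice(np.flatnonzero(~child))
        child[out_i], child[in_i] = False, True
        return child

    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind, sim))
        survivors = pop[: pop_size // 2]
        children = [mutate(survivors[rng.integers(len(survivors))])
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return min(pop, key=lambda ind: fitness(ind, sim))

# Toy embeddings: tests 0 and 1 are near-duplicates, as are tests 2 and 3.
emb = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]])
sim = cosine_similarity_matrix(emb)
best = minimize(sim, budget=2)  # 50% minimization budget
print("retained tests:", np.flatnonzero(best), "fitness:", fitness(best, sim))
```

With a 50% budget on this toy input, the lowest-fitness subsets keep one test from each near-duplicate pair, which is the behavior a similarity-based minimizer is meant to produce. Swapping in `euclidean_similarity_matrix(emb)` for `sim` exercises the second measure without changing the search.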
Statistics
The average test execution time before test suite minimization is 1.58 minutes.
The average test execution time after test suite minimization using the best LTM configuration is 0.92 minutes.
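As a rough sanity check, the two averages above can be turned into a relative reduction. This is only a sketch under an assumption: it treats the saving as (before - after) / before on the averaged times, whereas the paper's time saving rate (TSR) may be defined differently, for example averaged per version or accounting for minimization time.

```python
before_min = 1.58  # average test execution time before minimization (minutes)
after_min = 0.92   # average test execution time after minimization (minutes)

# Assumed definition: relative reduction of the averaged execution times.
reduction = (before_min - after_min) / before_min * 100
print(f"relative reduction: {reduction:.2f}%")  # prints "relative reduction: 41.77%"
```

The result is close to, but not identical with, the reported average TSR of 41.72%, which is consistent with the reported figure being computed under a somewhat different definition or averaging scheme.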
Quotes
"LTM achieves high FDR results (an overall average FDR of 0.79 across configurations) for a 50% minimization budget (i.e., the percentage of test cases retained in the minimized test suite)."
"The best configuration of LTM is UniXcoder using Cosine similarity when considering both effectiveness (0.84 FDR on average) and efficiency (0.82 min on average), which also achieves a greater time saving rate (an average TSR of 41.72%)."
"For the large project, Closure, UniXcoder using Cosine Similarity takes only 17.90 min in terms of MT and achieves an FDR of 0.79, while saving 52.55% of testing time."