
Code-Aware Prompting: Improving Coverage-Guided Test Generation for Complex Software Units using Large Language Models


Core Concepts
SymPrompt, a novel code-aware prompting strategy, enables large language models to generate high coverage test suites for complex software units by deconstructing the test generation process into a multi-stage sequence of prompts aligned with the execution paths of the method under test.
Abstract

The paper presents SymPrompt, a novel approach to generating test cases for software units using large language models (LLMs). The key insight is that generating a test suite to fully cover a focal method can be broken down into a sequence of logical problems, each posed to the LLM as a separate prompt.

The approach consists of three main steps:

  1. Collecting Approximate Path Constraints: The abstract syntax tree of the focal method is traversed to collect constraints on each possible execution path and the corresponding return values. This avoids the computational overhead of traditional symbolic execution.

  2. Capturing Relevant Context: The type and dependency context of the focal method, including parameter types, external library calls, and class definitions, is extracted and included in the prompts to guide the model.

  3. Generating Prompts and Tests: For each execution path, a prompt is constructed using the collected path constraints and context. The prompts are used to iteratively prompt the LLM to generate test cases targeting specific execution paths.
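The first step can be sketched with Python's built-in `ast` module. The sketch below is a simplification of the paper's analysis: it handles only `if`/`else` branching and `return` statements, and the example function `classify` is illustrative, not taken from the paper.

```python
import ast
import textwrap

def collect_paths(func_source):
    """Enumerate approximate execution paths of a function: for each path,
    record the branch conditions taken (as source text) and the return
    expression reached. Handles only if/else and return statements."""
    tree = ast.parse(textwrap.dedent(func_source))
    func = tree.body[0]

    def walk(stmts, constraints):
        paths = []
        for i, stmt in enumerate(stmts):
            if isinstance(stmt, ast.Return):
                ret = ast.unparse(stmt.value) if stmt.value else "None"
                paths.append((list(constraints), ret))
                return paths
            if isinstance(stmt, ast.If):
                cond = ast.unparse(stmt.test)
                rest = stmts[i + 1:]
                # True branch: condition holds; continue into the if body.
                paths += walk(stmt.body + rest, constraints + [cond])
                # False branch: negated condition; continue past the if.
                paths += walk(stmt.orelse + rest, constraints + [f"not ({cond})"])
                return paths
        # Fell off the end of the function: implicit `return None`.
        paths.append((list(constraints), "None"))
        return paths

    return walk(func.body, [])

src = """
def classify(x):
    if x < 0:
        return 'negative'
    if x == 0:
        return 'zero'
    return 'positive'
"""
for constraints, ret in collect_paths(src):
    print(constraints, "->", ret)
```

Because the constraints are collected syntactically rather than solved, infeasible paths may be enumerated; the paper relies on the LLM's code reasoning to compensate, which is what makes the approximation cheap compared to full symbolic execution.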

The evaluation shows that SymPrompt significantly improves the performance of test generation using both an open-source 16B parameter model (CodeGen2) and the large GPT-4 model. Compared to baseline prompting strategies, SymPrompt improves the ratio of correct test generations by a factor of 5 and boosts relative coverage by 26% for CodeGen2. When applied to GPT-4, SymPrompt improves coverage by over 2x.

The key benefits of SymPrompt are:

  1. Guiding the model to generate tests that correctly call the focal method and pass.
  2. Generating higher coverage test suites that exercise more distinct execution paths in the focal method.

The approach overcomes limitations of traditional symbolic execution by using approximations and relying on the LLM's ability to reason about code context, while also avoiding the computational overhead of enumerating all possible execution paths.
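As an illustration of how a path-targeted prompt from the third step might be assembled, consider the sketch below. The template wording, the function name `build_path_prompt`, and its parameters are hypothetical; the paper's actual prompt format differs in detail.

```python
def build_path_prompt(focal_source, context, constraints, return_value):
    """Assemble a test-generation prompt targeting one execution path:
    the focal method's source, its type/dependency context, and the
    branch conditions plus return value that define the path."""
    constraint_text = " and ".join(constraints) if constraints else "no branch conditions"
    return (
        f"# Context:\n{context}\n\n"
        f"# Focal method:\n{focal_source}\n\n"
        f"# Task: write a pytest unit test that drives the focal method\n"
        f"# down the path where {constraint_text},\n"
        f"# so that it returns {return_value}.\n"
    )

prompt = build_path_prompt(
    focal_source="def classify(x):\n    if x < 0:\n        return 'negative'\n    ...",
    context="classify takes an int and returns a str",
    constraints=["x < 0"],
    return_value="'negative'",
)
print(prompt)
```

Iterating this over every collected path yields one prompt per execution path, which is how the approach steers the model toward distinct branches rather than the single "happy path" a naive prompt tends to exercise.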



Key Insights Distilled From

by Gabriel Ryan... at arxiv.org 04-04-2024

https://arxiv.org/pdf/2402.00097.pdf

Deeper Inquiries

How can SymPrompt be extended to handle dynamic method dispatch and polymorphism in object-oriented programs?

SymPrompt can be extended to handle dynamic method dispatch and polymorphism in object-oriented programs by incorporating dynamic analysis techniques. This extension would involve analyzing the runtime behavior of the program to determine the actual type of objects at runtime and the corresponding method calls. By dynamically tracking the types of objects and method invocations during program execution, SymPrompt can generate prompts that consider the polymorphic behavior of the program. Additionally, SymPrompt can utilize runtime profiling information to guide the generation of test cases that cover different branches of the polymorphic method calls.
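One possible sketch of such runtime type tracking, assuming a small `Shape` hierarchy and a `traced_call` helper (both hypothetical, not part of SymPrompt):

```python
from collections import defaultdict

class Shape:
    def area(self):
        raise NotImplementedError

class Square(Shape):
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side * self.side

class Circle(Shape):
    def __init__(self, r):
        self.r = r
    def area(self):
        return 3.14159 * self.r * self.r

# Map each method name to the concrete receiver types observed at runtime.
observed = defaultdict(set)

def traced_call(receiver, method_name, *args):
    """Invoke a method while recording which concrete subclass actually
    handled the dispatch, so a later prompt can name the exact type to
    instantiate when targeting that branch."""
    observed[method_name].add(type(receiver).__name__)
    return getattr(receiver, method_name)(*args)

# Objects are handled through the Shape interface; tracing reveals the
# concrete types that a static view of the call site would not.
for shape in [Square(2), Circle(1.0), Square(3)]:
    traced_call(shape, "area")

print(sorted(observed["area"]))
```

Feeding the recorded receiver types into the prompt-construction stage would let path prompts instruct the model to construct the specific subclass needed to reach a polymorphic branch.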

What are the limitations of SymPrompt's approach compared to traditional symbolic execution techniques, and how can they be addressed?

Compared to traditional symbolic execution techniques, SymPrompt's approach has several limitations:

  1. Approximate Path Constraints: SymPrompt collects approximate path constraints through static code analysis, which may not capture all feasible execution paths accurately, leading to incomplete coverage of the focal method.

  2. Computational Overhead: Because SymPrompt constructs a prompt for each execution path, methods with complex branching conditions and many paths can impose significant overhead, affecting the scalability and efficiency of test generation.

  3. Dependency on Language Models: SymPrompt's effectiveness depends on the quality and training data of the underlying language model. A model lacking domain-specific knowledge may struggle to generate accurate test cases.

These limitations can be addressed by:

  1. Improving Path Constraint Collection: Incorporate dynamic analysis to capture runtime behavior and actual execution paths, improving constraint accuracy.

  2. Optimization Techniques: Reduce computational overhead with path-minimization algorithms or parallel prompt generation.

  3. Model Training and Fine-Tuning: Continuously fine-tune the language model on domain-specific data to improve its understanding of code semantics and context.

How can the insights from SymPrompt be applied to other software engineering tasks beyond test generation, such as program synthesis or program repair?

The insights from SymPrompt can be applied to other software engineering tasks beyond test generation in the following ways:

  1. Program Synthesis: SymPrompt's approach of constructing prompts from path constraints and context can guide language models to generate code snippets or programs that adhere to specific constraints and requirements, assisting automated program synthesis.

  2. Program Repair: SymPrompt's methodology of deconstructing a problem into multi-stage prompts aligned with execution paths can guide models in generating patches for buggy code. By providing targeted prompts based on the faulty paths, models can produce more accurate and effective repairs.