Core Concepts
Automated fault localization tools like FuseFL enhance their results by providing step-by-step reasoning about code errors.
Abstract
Fault localization is a critical process in software debugging: identifying the specific program elements responsible for failures. Various tools have been developed to automate this process, but simply ranking program elements by suspiciousness is not enough; explanations for flagged code elements are also crucial. Approaches built on Large Language Models (LLMs), such as FuseFL, combine information like spectrum-based fault localization (SBFL) results, test case outcomes, and code descriptions to improve fault localization. In a study using faulty code from the Refactory dataset, FuseFL outperformed standalone SBFL techniques and XAI4FL in localizing faults at the Top-1 position by significant margins. Human evaluations showed that FuseFL generated correct explanations in 22 out of 30 cases and achieved informativeness and clarity scores comparable to human-written explanations.
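To make the SBFL input concrete, here is a minimal sketch of spectrum-based suspiciousness scoring using the well-known Ochiai formula. This is an illustrative example only, not FuseFL's actual implementation; the line numbers and coverage counts are hypothetical.

```python
import math

def ochiai(failed_cov, passed_cov, total_failed):
    """Ochiai suspiciousness: failed(e) / sqrt(total_failed * (failed(e) + passed(e)))."""
    denom = math.sqrt(total_failed * (failed_cov + passed_cov))
    return failed_cov / denom if denom else 0.0

# Hypothetical coverage spectrum: line -> (failing tests covering it, passing tests covering it)
coverage = {3: (2, 0), 5: (2, 3), 8: (0, 4)}
total_failed = 2

scores = {line: ochiai(f, p, total_failed) for line, (f, p) in coverage.items()}
# Rank lines from most to least suspicious; line 3 (covered only by failing tests) ranks first.
ranked = sorted(scores, key=scores.get, reverse=True)
```

An approach like FuseFL would then pass such a ranking, along with test outcomes and code descriptions, to an LLM to produce a ranked list of faulty lines with step-by-step explanations.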
Stats
Our results demonstrate a 32.3% increase in successfully localized faults at Top-1 compared to the baseline.
A BLEURT score of 0.492 was achieved for the Top-1 explanation.
Clarity and informativeness scores were 5.7 and 5.9, respectively, on a 7-point Likert scale.