Core Concepts
DLAP, a Deep Learning Augmented Large Language Model Prompting framework, combines the strengths of deep learning models and large language models to improve software vulnerability detection.
Abstract
The paper proposes DLAP, a Deep Learning Augmented Large Language Model Prompting framework, to address the limitations of existing approaches for software vulnerability detection.
Key highlights:
- DLAP leverages the advantages of both deep learning (DL) models and large language models (LLMs) to achieve superior vulnerability detection performance.
- DLAP uses two prompting techniques:
  - In-Context Learning (ICL) prompts that incorporate detection probabilities from a pre-trained DL model to stimulate implicit fine-tuning of LLMs.
  - Chain-of-Thought (CoT) prompts that combine results from static analysis tools and DL models to generate customized prompts for LLMs.
- Experiments on four large-scale software projects show that DLAP outperforms state-of-the-art prompting frameworks and fine-tuning techniques in terms of detection accuracy, cost-effectiveness, and explainability.
- The paper analyzes which DL model is most suitable to integrate with DLAP and finds that the LineVul model achieves the best performance.
Overall, DLAP demonstrates the effectiveness of combining DL and LLMs through prompt engineering to address the challenges of software vulnerability detection.
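The two prompt techniques listed above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `ScoredExample`, `build_icl_prompt`, and `build_cot_prompt`, the prompt wording, and the sample probabilities and findings are all assumptions made for illustration. The sketch only shows the general idea of placing DL detection probabilities into in-context examples and turning static analysis findings plus a DL score into step-by-step guidance for an LLM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredExample:
    """A code sample paired with a DL detector's vulnerability probability."""
    code: str
    probability: float  # hypothetical detector output, assumed to lie in [0, 1]


def build_icl_prompt(target_code: str, examples: List[ScoredExample]) -> str:
    """Build an in-context learning prompt from DL-scored examples."""
    parts = ["Each function below is followed by the vulnerability "
             "probability assigned by a detection model."]
    for i, ex in enumerate(examples, start=1):
        parts.append(f"Example {i}:\n{ex.code}\nProbability: {ex.probability:.2f}")
    parts.append(f"Target function:\n{target_code}")
    return "\n\n".join(parts)


def build_cot_prompt(target_probability: float, asat_findings: List[str]) -> str:
    """Build chain-of-thought guidance from the DL score and ASAT findings."""
    steps = [f"The detection model scored the target at {target_probability:.2f}."]
    for finding in asat_findings:
        steps.append(f"Check whether this static analysis finding is a real flaw: {finding}")
    steps.append("State whether the target is vulnerable and explain why.")
    # Number the steps so the LLM is nudged to reason through them in order.
    return "\n".join(f"Step {n}: {s}" for n, s in enumerate(steps, start=1))


if __name__ == "__main__":
    # Illustrative inputs only; none of these values come from the paper.
    examples = [
        ScoredExample("void f(char *s) { char b[8]; strcpy(b, s); }", 0.91),
        ScoredExample("int g(int a) { return a + 1; }", 0.04),
    ]
    target = "void h(char *s) { char b[16]; strcat(b, s); }"
    findings = ["possible buffer overflow in call to strcat"]
    print(build_icl_prompt(target, examples))
    print()
    print(build_cot_prompt(0.87, findings))
```

In a real pipeline the probabilities would come from a trained detector and the findings from an actual static analysis tool; here both are hard-coded so the sketch runs on its own.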
Stats
"Software vulnerability detection is paramount for safeguarding system security and individual privacy."
"Many automated static analysis tools (ASATs) have been applied for vulnerability detection."
"DL models that perform well on experimental datasets may suffer from severe performance degradation in real-world projects."
"LLMs have not achieved satisfactory results in vulnerability detection."
Quotes
"DLAP, a Deep Learning Augmented Large Language Model Prompting framework, combines the advantages of DL models and LLMs while overcoming their respective shortcomings."
"Experiments on four large-scale software projects show that DLAP outperforms state-of-the-art prompting frameworks and fine-tuning techniques in terms of detection accuracy, cost-effectiveness, and explainability."
"The paper conducts a rigorous analysis to determine the most suitable DL model to integrate with DLAP, finding that the LineVul model achieves the best performance."