Core Concepts
Large Language Models can outperform human experts in Preliminary Security Risk Analysis.
Abstract
Introduction to the importance of Preliminary Security Risk Analysis (PSRA) in mission-critical contexts.
Comparison between human experts and a Fine-Tuned Model (FTM) in PSRA proficiency.
Methodology detailing the study design, research questions, and data collection.
Results showing FTM outperforming human experts in accuracy metrics and evaluation time.
Discussions on the implications of the findings and the benefits of leveraging FTM in PSRA.
Threats to validity categorized into Conclusion, Internal, Construct, and External validity.
Conclusions highlighting the effectiveness of FTM in reducing errors and accelerating risk detection in PSRA.
Stats
"FTM consistently outperforms the baseline, i.e., GPLLM, in each accuracy metric."
"FTM exhibits high precision in both average types, suggesting low rates of false positives."
"FTM achieves a weighted recall of 0.8814, suggesting it can effectively discover preliminary security risks with a low rate of false negatives."
"FTM outperforms six of seven human experts in all accuracy metrics, number of errors, and analysis time."
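The weighted precision and recall figures quoted above average the per-class score, weighting each class by its support (number of true samples). A minimal sketch of weighted recall, with hypothetical PSRA labels chosen purely for illustration (not data from the study):

```python
from collections import Counter

def weighted_recall(y_true, y_pred):
    # Per-class recall, averaged with weights proportional to class support.
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        # True positives for this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        score += (n / total) * (tp / n)
    return score

# Hypothetical labels: 1 = security risk present, 0 = no risk.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(weighted_recall(y_true, y_pred))  # 0.75
```

With single-label data like this, weighted recall coincides with overall accuracy; it differs from macro-averaged recall, which weights every class equally regardless of support.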