Core Concepts
Large language models (LLMs) can automate cyberattacks, transforming security operations and creating new economic risks. The AUTOATTACKER system leverages LLMs to carry out post-breach attacks.
Abstract
The AUTOATTACKER system utilizes Large Language Models (LLMs) to automate "hands-on-keyboard" attacks in cybersecurity. It addresses challenges such as tracking the victim environment, generating precise attack commands, and optimizing action selection. Extensive testing shows GPT-4's remarkable capabilities in automating post-breach attacks with limited human involvement.
Key points include:
- LLMs are increasingly used in cybersecurity applications.
- AUTOATTACKER automates complex attack tasks using LLMs.
- Challenges include bypassing usage policies and ensuring accurate command generation.
- The experience manager stores successful actions for reuse.
- Evaluation metrics measure adaptability, stealthiness, and impact of attack tasks.
- Results show high success rates with GPT-4 at low temperatures.
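The "experience manager" mentioned above caches actions that previously succeeded so they can be reused on similar tasks. A minimal sketch of that idea follows; the class and method names are illustrative assumptions, not AUTOATTACKER's actual API.

```python
# Hypothetical sketch of an experience manager that stores successful
# actions for reuse (names are assumptions, not the paper's interface).

class ExperienceManager:
    """Stores commands that succeeded for a given task so they can be replayed."""

    def __init__(self):
        self._experiences = {}  # task description -> list of successful commands

    def record_success(self, task, command):
        """Remember a command that completed the task."""
        self._experiences.setdefault(task, []).append(command)

    def recall(self, task):
        """Return previously successful commands for the task, or an empty list."""
        return self._experiences.get(task, [])


mgr = ExperienceManager()
mgr.record_success("file_writing", "echo data > out.txt")
print(mgr.recall("file_writing"))  # stored command is available for reuse
```

On a new task, the framework would first consult `recall` before asking the LLM to generate a fresh command, reducing redundant LLM interactions.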
Stats
GPT-3.5 achieves a 1/3 success rate on the File Writing task at temperature T=0.
GPT-4 completes all tasks successfully at temperature T=0.
GPT-4 requires an average of 5.3 interactions for the Privilege Escalation task.
Quotes
"An automated LLM-based, post-breach exploitation framework can help analysts quickly test and continually improve their organization’s network security posture."
"AUTOATTACKER contains modules like summarizer, planner, navigator to optimize LLM interactions."
"GPT-4 demonstrates remarkable capabilities in automatically conducting post-breach attacks requiring limited or no human involvement."