The research addresses data scarcity in the training datasets of network intrusion detection systems (NIDS) by integrating generative adversarial networks (GANs) into the NIDS framework. Three GAN models (Vanilla GAN, Wasserstein GAN, and Conditional Tabular GAN) are implemented to generate synthetic network traffic that closely resembles real-world anomalous behavior, specifically targeting the Botnet attack class.
The generated samples are evaluated for similarity to the original Botnet samples using several methods, including cosine similarity, cumulative sums, and machine learning algorithms. The generated Botnet samples are then added to the original CIC-IDS2017 dataset in varying quantities to train a Random Forest-based NIDS model.
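As an illustration of the cosine-similarity check mentioned above, the following is a minimal sketch, not the authors' evaluation code. It assumes each sample is a plain list of numeric features and averages the similarity over every real/generated pair. The names `cosine_similarity`, `mean_pairwise_similarity`, `real_botnet`, and `synthetic_botnet`, as well as the toy feature values, are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Similarity is undefined for a zero vector; return 0.0 by convention.
        return 0.0
    return dot / (norm_u * norm_v)

def mean_pairwise_similarity(real, generated):
    """Average cosine similarity over all (real, generated) sample pairs."""
    scores = [cosine_similarity(r, g) for r in real for g in generated]
    return sum(scores) / len(scores)

# Toy feature vectors (assumed values, not taken from CIC-IDS2017).
real_botnet = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
synthetic_botnet = [[1.0, 2.0, 3.0], [3.0, 6.0, 9.0]]

score = mean_pairwise_similarity(real_botnet, synthetic_botnet)
print(f"mean cosine similarity: {score:.3f}")
```

A score close to 1 means the generated vectors point in nearly the same direction as the real ones. Cosine similarity ignores magnitude, which is presumably one reason to pair it with other checks such as the cumulative sums and classifier-based tests named above.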
The results show that adding GAN-generated samples significantly improves NIDS performance in detecting Botnet attacks, with precision, recall, and F1-score reaching up to 1.00, 0.82, and 0.90, respectively, a substantial improvement over the baseline NIDS. The research establishes a new benchmark for Botnet classification on the CIC-IDS2017 dataset, outperforming previous state-of-the-art approaches.
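The three reported figures can be checked against each other, since the F1-score is the harmonic mean of precision and recall. The short sketch below applies that standard formula to the reported precision and recall; the function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported for the Botnet class.
reported_f1 = f1_score(1.00, 0.82)
print(round(reported_f1, 2))
```

The value rounds to 0.90, matching the reported F1-score. With precision at 1.00, the remaining error comes entirely from missed Botnet flows (recall of 0.82), not from false alarms.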
The findings highlight the effectiveness of leveraging GANs to address the data scarcity challenge in NIDS and bolster the cybersecurity posture of organizations against evolving and sophisticated cyber threats.
Source: arxiv.org