Key Concept
Training a supervised model with noise data improves the detection of unknown attacks in IDS.
Abstract
The rapid expansion of network systems has increased cyber threats, making effective Intrusion Detection Systems (IDS) essential. Traditional supervised models struggle to detect unknown attacks because attack patterns keep evolving. To address this, a Random Forest (RF) model is trained with noise data, which improves the identification of unseen attacks. Experiments show higher accuracy and F1-score when RF is trained with noise data. Results on synthetic datasets demonstrate that RF can detect unknown attacks, and benchmark IDS datasets also show improved performance when RF is trained with noise data.
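The idea in the abstract can be illustrated with a minimal sketch on synthetic data. The summary does not say how the noise data is generated or labeled, so everything specific here is an assumption: noise is drawn uniformly over a fixed feature range and labeled as attack, the three Gaussian clusters are hypothetical, and scikit-learn's `RandomForestClassifier` stands in for the RF model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)

def blob(center, n):
    """Draw n two-feature samples from a tight Gaussian around `center`."""
    return rng.normal(loc=center, scale=0.3, size=(n, 2))

# Hypothetical synthetic data: label 0 = normal, label 1 = attack.
normal_train = blob([0.0, 0.0], 500)
known_attack = blob([4.0, 4.0], 500)      # attack type present in training
normal_test = blob([0.0, 0.0], 200)
unseen_attack = blob([-3.0, -3.0], 200)   # attack type absent from training

X_train = np.vstack([normal_train, known_attack])
y_train = np.concatenate([np.zeros(500, dtype=int), np.ones(500, dtype=int)])
X_test = np.vstack([normal_test, unseen_attack])
y_test = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])

# Baseline: RF trained only on normal traffic and the known attack.
baseline = RandomForestClassifier(n_estimators=100, random_state=0)
baseline.fit(X_train, y_train)

# Assumption: noise is uniform over the feature range and labeled as attack,
# so any region not dominated by normal samples is learned as "attack".
noise = rng.uniform(low=-6.0, high=6.0, size=(2000, 2))
X_noisy = np.vstack([X_train, noise])
y_noisy = np.concatenate([y_train, np.ones(len(noise), dtype=int)])

with_noise = RandomForestClassifier(n_estimators=100, random_state=0)
with_noise.fit(X_noisy, y_noisy)

results = {}
for name, model in [("baseline", baseline), ("with_noise", with_noise)]:
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred, zero_division=0),
        # Every attack in the test set is of the unseen type.
        "unseen_recall": float(np.mean(pred[y_test == 1] == 1)),
    }
    print(name, results[name])
```

In this toy setup the baseline has no reason to flag the unseen cluster, because it lies on the normal side of the only boundary the model learned. The noise-trained model treats everything outside the dense normal region as attack. Real IDS features are high-dimensional and mixed-type, so the noise distribution would need more care than a uniform box.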
Statistics
The NSL-KDD dataset has a shape of (148517, 44).
The UNSW-NB15 dataset contains 257,673 records and 45 fields.
The CIC-IDS2017 dataset consists of 222,914 records and 78 features.
The CIC-DDoS2019 dataset has a shape of (431371, 79), with 333,540 attack instances.
The Malmem2022 dataset is balanced and covers the Spyware, Ransomware, and Trojan Horse categories.
The ToN-IoT-Network and ToN-IoT-Linux datasets contain telemetry from heterogeneous IoT services.
The ISCXURL2016 dataset has a shape of (36707, 80).
The CIC-Darknet2020 dataset has 141,530 records with 85 feature columns.
The XIIoTID dataset has an initial shape of (596017, 64); one-hot encoding increases the column count from 64 to 81.
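The column growth reported for XIIoTID comes from one-hot encoding, which replaces each categorical column with one indicator column per category. A minimal sketch with pandas is below. The toy records and column names are hypothetical and are not taken from any of the datasets above.

```python
import pandas as pd

# Hypothetical toy records; column names and values are illustrative only.
df = pd.DataFrame({
    "duration": [0.5, 1.2, 0.1, 3.4],
    "protocol": ["tcp", "udp", "icmp", "tcp"],   # 3 distinct categories
    "service": ["http", "dns", "dns", "http"],   # 2 distinct categories
})

# Each categorical column with k categories becomes k indicator columns.
encoded = pd.get_dummies(df, columns=["protocol", "service"])

print(df.shape)       # (4, 3)
print(encoded.shape)  # (4, 6): 1 numeric + 3 protocol + 2 service columns
```

The row count is unchanged. Only the number of columns grows, by the number of categories minus one for each encoded column.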
Quotes
"Most unseen attacks are detected as attacks."
"All instances of attack type 2 and most of attack type 1 are correctly identified because of the noise data."