Core Concepts
The work develops a unified semantic schema for Portable Executable (PE) malware files, intended to improve the interpretability and reproducibility of malware-detection experiments.
Abstract
Ontologies play an important role in information security, particularly in malware detection. The PE Malware Ontology aims to provide a standardized schema for PE-malware datasets, improving the interpretability and comparability of experiments. The ontology represents features such as file characteristics, section properties, and actions. Derived features are annotated so that they can be identified as such. Datasets of various sizes have been generated from EMBER data so that concept-learning algorithms can be run on them efficiently.
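To make the idea of a schema over files, sections, and actions more concrete, the following is a minimal sketch of how one sample might be flattened into ontology-style assertions. Every name in it (the `PESample` and `Section` containers, the class names `PEFile` and `DynamicLinkLibrary`, and the property names `has_section`, `has_section_flag`, `section_entropy`, and `has_action`) is an illustrative placeholder chosen for this example, not the vocabulary of the PE Malware Ontology itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Section:
    # Hypothetical container for one PE section.
    name: str
    entropy: float
    flags: list[str] = field(default_factory=list)


@dataclass
class PESample:
    # Hypothetical container for one PE file.
    sha256: str
    is_dll: bool
    sections: list[Section] = field(default_factory=list)
    actions: list[str] = field(default_factory=list)


def to_assertions(sample: PESample) -> list[tuple]:
    """Flatten a sample into (subject, predicate, object) assertions.

    Class and property names are placeholders, not the real ontology terms.
    """
    subj = f"file_{sample.sha256[:8]}"
    triples = [(subj, "rdf:type", "PEFile")]
    if sample.is_dll:
        triples.append((subj, "rdf:type", "DynamicLinkLibrary"))
    for i, sec in enumerate(sample.sections):
        sec_id = f"{subj}_sec{i}"
        # Link between two individuals (an object property).
        triples.append((subj, "has_section", sec_id))
        triples.append((sec_id, "rdf:type", "Section"))
        # Literal value attached to an individual (a data property).
        triples.append((sec_id, "section_entropy", sec.entropy))
        for flag in sec.flags:
            triples.append((sec_id, "has_section_flag", flag))
    for action in sample.actions:
        triples.append((subj, "has_action", action))
    return triples
```

A call such as `to_assertions(PESample("ab12cd34ef", True, [Section(".text", 6.5, ["executable"])], ["write_file"]))` yields one list of triples per file. The point of the sketch is only the shape of the data: a sample becomes a small graph of typed individuals linked by object properties, with numeric values attached through data properties.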
Stats
Approx. 1.1 million samples in the EMBER dataset
Approx. 20 million samples in the SoReL dataset
195 classes, 6 object properties, and 9 data properties in the ontology
Datasets ranging from 1,000 to 800,000 samples, each with its corresponding properties and assertions
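The summary does not say how the fixed-size datasets were drawn from EMBER. As a rough sketch only, one reproducible way to produce subsets of several sizes is seeded uniform sampling without replacement. The function name `draw_subsets`, the uniform sampling strategy, and the independence of the subsets from one another are all assumptions made for this example.

```python
import random


def draw_subsets(samples, sizes, seed=0):
    """Return {size: subset}, each drawn without replacement.

    Assumption: uniform sampling with a fixed seed, so that the same
    call always reproduces the same subsets. Subsets of different
    sizes are drawn independently and are not nested.
    """
    rng = random.Random(seed)
    subsets = {}
    for size in sizes:
        if size > len(samples):
            raise ValueError(f"cannot draw {size} from {len(samples)} samples")
        subsets[size] = rng.sample(samples, size)
    return subsets
```

Fixing the seed is what makes the resulting datasets comparable across experiments: two runs with the same inputs and seed produce identical subsets.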