Core Concepts
Language Models (LMs) can revolutionize automated analysis of security logs, as demonstrated by LogPrécis.
Abstract
Directory:
Introduction
Security analysts face challenges in analyzing security logs.
Language Models (LMs) offer potential solutions.
Background and Related Work
LMs evolved from statistical techniques to deep neural architectures.
The Transformer architecture is central to pre-trained language models (PLMs).
LM Pipeline and Design Choices
Input strategies: Commands, Statements, Sessions.
Downstream Classification Tasks: Entity Recognition, MITRE Tactics as Class Labels.
Design Choices: Chunking Strategy, Domain Adaptation, PLMs and Tasks comparison.
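To make the input and chunking choices above concrete, here is a minimal Python sketch of one way a shell session could be split into statements and grouped into overlapping chunks. It is an illustration under stated assumptions, not the LogPrécis implementation: the separator set, the word-count budget (a stand-in for a model's token limit), the one-statement overlap, and the function names split_statements and chunk_statements are all hypothetical.

```python
import re
from typing import List

# Assumed statement separators (;, &&, ||, |). A real shell parser would
# also handle quoting, subshells, and redirections; this sketch does not.
_SEPARATORS = re.compile(r"\s*(?:;|&&|\|\||\|)\s*")


def split_statements(session: str) -> List[str]:
    """Split a raw shell session into statements on the assumed separators."""
    return [s for s in _SEPARATORS.split(session) if s]


def chunk_statements(
    statements: List[str], max_words: int = 8, overlap: int = 1
) -> List[List[str]]:
    """Group statements into chunks of roughly max_words words.

    The last `overlap` statements of a chunk are repeated at the start of
    the next one as context. The budget is soft: carried-over context plus
    one new statement may exceed max_words.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    for stmt in statements:
        words = sum(len(s.split()) for s in current) + len(stmt.split())
        if current and words > max_words:
            chunks.append(current)
            # Guard overlap == 0, since current[-0:] would copy the whole list.
            current = current[-overlap:] if overlap > 0 else []
        current.append(stmt)
    if current:
        chunks.append(current)
    return chunks


session = "cd /tmp; wget http://example.com/x.sh && chmod +x x.sh; ./x.sh | grep ok"
statements = split_statements(session)
chunks = chunk_statements(statements, max_words=5, overlap=1)
```

Each chunk here begins with the final statement of the previous chunk, so a classifier that sees one chunk at a time still gets some surrounding context. Whether overlap helps, and how large the budget should be, is exactly the kind of design choice the section compares.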
LogPrécis Design and Evaluation
Datasets used for training and inference.
Labelling Process for supervised learning.
Comparison of design choices like pre-training, chunking strategy, domain adaptation.
Performance metrics compared against other LMs such as Word2Vec (W2V) and GPT-3.
Stats
"LogPrécis reduces the analysis to about 3,000 unique fingerprints."
"CodeBERT has 130M parameters."
"GPT-3 Davinci costs 105.65 USD for fine-tuning and testing."