The paper proposes a novel cyber threat intelligence (CTI) search technique that leverages attention graph isomorphism. The key insights are:
CTI reports have domain-specific semantics that are difficult to capture using general-purpose language models. The authors observe that the attention mechanism in Transformer models can effectively capture these domain-specific semantic correlations between words.
The authors extract semantically structured graphs from text using self-attention maps, where the graph construction prioritizes edges with higher attention scores. This allows them to abstract the core malware behaviors as sub-graphs.
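The summary does not spell out the graph-construction algorithm. As a minimal sketch, one common way to turn a self-attention map into a graph is to keep, for each token, edges to the tokens it attends to most strongly; the function name, the top-k edge-selection scheme, and the toy attention matrix below are illustrative assumptions, not the paper's exact method:

```python
import numpy as np

def attention_graph(tokens, attn, top_k=2):
    """Build a directed graph from a self-attention matrix, keeping for
    each token the edges to the top_k tokens it attends to most strongly
    (self-attention is ignored). Returns a set of (src, dst) index pairs."""
    edges = set()
    for i in range(len(tokens)):
        scores = attn[i].astype(float).copy()
        scores[i] = -np.inf  # exclude the self-edge
        for j in np.argsort(scores)[::-1][:top_k]:
            edges.add((i, int(j)))
    return edges

# Toy example: 3 tokens with a hand-written attention matrix.
tokens = ["malware", "downloads", "payload"]
attn = np.array([[0.1, 0.7, 0.2],
                 [0.3, 0.1, 0.6],
                 [0.5, 0.4, 0.1]])
edges = attention_graph(tokens, attn, top_k=1)
# edges == {(0, 1), (1, 2), (2, 0)}
```

Prioritizing high-attention edges this way prunes weak correlations, so the remaining sub-graphs concentrate on the strongly related words that describe a behavior.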
The authors use sub-graph matching and similarity scoring to perform the CTI search. This approach outperforms existing techniques such as sentence embeddings and keyword-based methods.
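The paper's actual search relies on attention graph isomorphism; as a simplified stand-in, the sketch below scores a report by how much of a query behavior graph's labeled edges it contains, then ranks reports by that score. The function names, the overlap metric, and the sample edge sets are my assumptions for illustration only:

```python
def edge_overlap_score(query_edges, report_edges):
    """Fraction of the query graph's labeled edges present in the report
    graph; 1.0 means the query behavior sub-graph is fully contained."""
    if not query_edges:
        return 0.0
    return len(query_edges & report_edges) / len(query_edges)

def rank_reports(query_edges, reports):
    """Rank (name, edge_set) reports by descending overlap with the query."""
    return sorted(reports,
                  key=lambda r: edge_overlap_score(query_edges, r[1]),
                  reverse=True)

# Toy query: a two-edge behavior graph over word labels.
query = {("malware", "downloads"), ("downloads", "payload")}
reports = [
    ("report_a", {("malware", "downloads")}),                           # partial match
    ("report_b", {("malware", "downloads"), ("downloads", "payload"),
                  ("payload", "executes")}),                            # full match
]
ranked = rank_reports(query, reports)
# ranked[0][0] == "report_b"
```

Matching whole labeled sub-graphs rather than isolated keywords is what lets this style of search distinguish a report that merely mentions the same words from one that describes the same behavior.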
The authors evaluate their method on a large dataset of CTI reports collected from various security vendors. Their technique achieves higher precision and recall than the baselines, and it also aids real-world attack forensics, correctly attributing the origins of 8 out of 10 recent attacks, versus 3 for Google search and 2 for IoC-based search.
The authors also discuss the efficiency of their method, showing that their optimized implementation performs the search in reasonable time, comparable to that of a simple word-matching baseline.