Core Concepts
AI-powered binary code similarity detection (BinSD) techniques, particularly those based on graph neural networks (GNNs), have advanced considerably, but significant room for improvement remains, especially in addressing the "embedding collision" problem and in improving performance in real-world applications such as vulnerability search.
Stats
The GNN-based BinSD approaches achieve top-level ranking metrics in the current literature.
Recall drops significantly in the cross-ISA setting compared with the mono-ISA setting. For instance, the recall@5 of Gemini-skip falls from 62.3% to 25.42% when the evaluation setting changes from mono-seen to cross-seen.
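For readers unfamiliar with the metric, the following is a minimal sketch of how recall@k is commonly computed in a function-search setting. It assumes each query function has exactly one ground-truth match in the candidate pool and that candidates are ranked by cosine similarity of their embeddings. The function name `recall_at_k`, its parameters, and the toy vectors are illustrative assumptions, not the evaluation code behind the numbers above.

```python
import numpy as np

def recall_at_k(query_emb, pool_emb, true_idx, k):
    """Fraction of queries whose ground-truth match is ranked in the top k.

    Assumption: one ground-truth pool entry per query (true_idx[i]).
    """
    query_emb = np.asarray(query_emb, dtype=float)
    pool_emb = np.asarray(pool_emb, dtype=float)
    # Normalize rows so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = pool_emb / np.linalg.norm(pool_emb, axis=1, keepdims=True)
    sims = q @ p.T  # shape: (num_queries, pool_size)
    # Indices of the k most similar pool entries for each query.
    top_k = np.argsort(-sims, axis=1)[:, :k]
    hits = (top_k == np.asarray(true_idx)[:, None]).any(axis=1)
    return float(hits.mean())

# Toy data (hypothetical): 2 query embeddings, 3 pool embeddings.
pool = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[1.0, 0.1], [0.2, 1.0]]
truth = [0, 2]  # ground-truth pool index for each query

print(recall_at_k(queries, pool, truth, k=1))  # 0.5
print(recall_at_k(queries, pool, truth, k=2))  # 1.0
```

In this toy case the second query's true match is ranked second, so it counts as a miss at k=1 and a hit at k=2.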
Although BinaryAI-bert2 achieves the best AUC (99.2%) and ACC (94.9%), its precision (32.21%) is 12.3% lower than that of Gemini-skip.
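High accuracy and low precision are not contradictory. The sketch below uses made-up confusion-matrix counts (not taken from the evaluation above) to show how the two metrics can diverge arithmetically when truly similar pairs are rare. Whether class imbalance explains the reported numbers is not stated in the source; the helper name `accuracy_and_precision` is hypothetical.

```python
def accuracy_and_precision(tp, fp, tn, fn):
    """Compute accuracy and precision from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Precision is undefined when nothing is predicted positive; return 0.0 then.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision

# Hypothetical counts: 10 truly similar pairs among 1,000 candidate pairs.
# The classifier finds 9 of them but also flags 21 dissimilar pairs.
acc, prec = accuracy_and_precision(tp=9, fp=21, tn=969, fn=1)
print(f"accuracy={acc:.3f}, precision={prec:.3f}")
```

Because true negatives dominate the total, accuracy stays near 98% even though only 9 of the 30 flagged pairs are correct.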