Core Concepts
ForestSim is a new role similarity metric, built on the spanning rooted forests of a graph, that supports efficient top-k similarity search on large networks.
Abstract
The paper proposes a novel node similarity metric called ForestSim, which is based on spanning rooted forests of graphs. ForestSim captures the structural properties of a node by using the average size of the trees rooted at that node in the spanning rooted forests.
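To make the quantity above concrete, here is a minimal brute-force sketch for tiny graphs. It assumes the average is taken over the spanning rooted forests in which the node is a root, which is one reading of the description above. The function names are hypothetical and are not taken from the paper, and the enumeration is exponential in the number of edges, so it is an illustration only.

```python
from fractions import Fraction
from itertools import combinations


def _find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def average_rooted_tree_sizes(n, edges):
    """Average size of the tree rooted at each node, taken over all spanning
    rooted forests in which that node is a root (assumed interpretation)."""
    size_sum = [0] * n    # total tree size over forests where u is a root
    root_count = [0] * n  # number of forests where u is a root

    # Every acyclic edge subset is a spanning forest; a rooted forest then
    # picks one root per tree.
    for r in range(len(edges) + 1):
        for subset in combinations(edges, r):
            parent = list(range(n))
            acyclic = True
            for a, b in subset:
                ra, rb = _find(parent, a), _find(parent, b)
                if ra == rb:
                    acyclic = False
                    break
                parent[ra] = rb
            if not acyclic:
                continue

            comp_size = {}
            for v in range(n):
                root = _find(parent, v)
                comp_size[root] = comp_size.get(root, 0) + 1

            # Number of ways to choose roots: product of the tree sizes.
            rootings = 1
            for s in comp_size.values():
                rootings *= s

            for u in range(n):
                s = comp_size[_find(parent, u)]
                with_u_as_root = rootings // s  # u's tree has its root fixed
                root_count[u] += with_u_as_root
                size_sum[u] += with_u_as_root * s

    return [Fraction(size_sum[u], root_count[u]) for u in range(n)]
```

For the path graph 0-1-2, `average_rooted_tree_sizes(3, [(0, 1), (1, 2)])` gives 8/5 for the two endpoints and 2 for the middle node. The two endpoints, which are automorphically equivalent, receive the same value.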
The key highlights are:
ForestSim is proven to be an admissible role similarity metric, satisfying important axiomatic properties such as automorphism confirmation.
The authors devise an efficient top-k similarity search algorithm called ForestSimSearch, which can process a top-k query in O(k) time once the precomputation is finished.
To speed up the precomputation, the authors use a fast approximate algorithm to compute the diagonal entries of the forest matrix, reducing the time and space complexity to nearly linear.
Extensive experiments on 26 real-world networks show that ForestSim achieves comparable performance to state-of-the-art role similarity metrics like RoleSim, while being the only one that can efficiently handle top-k queries on million-scale networks.
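The O(k) query time mentioned in the highlights can be illustrated with a small sketch. This summary does not state the ForestSim formula, so the sketch assumes that precomputation yields one positive scalar score per node and that similarity is the ratio of the smaller score to the larger one. Both that ratio and the names `build_index`, `ratio_similarity`, and `top_k` are hypothetical and are not the paper's ForestSimSearch. Under this assumption, the most similar nodes sit next to the query node in sorted order, so a query only has to walk outward from its position.

```python
def build_index(scores):
    """Precomputation: sort nodes by score and record each node's rank."""
    order = sorted(range(len(scores)), key=lambda v: scores[v])
    rank = [0] * len(scores)
    for pos, v in enumerate(order):
        rank[v] = pos
    return order, rank


def ratio_similarity(a, b):
    # Hypothetical similarity on positive scores; equals 1.0 when a == b.
    return min(a, b) / max(a, b)


def top_k(query, k, scores, order, rank):
    """Return up to k (node, similarity) pairs, most similar first.

    Walks left and right from the query's position in the sorted order,
    taking the better of the two candidates at each step: O(k) per query.
    """
    lo, hi = rank[query] - 1, rank[query] + 1
    result = []
    while len(result) < k and (lo >= 0 or hi < len(order)):
        left = ratio_similarity(scores[query], scores[order[lo]]) if lo >= 0 else -1.0
        right = ratio_similarity(scores[query], scores[order[hi]]) if hi < len(order) else -1.0
        if left >= right:
            result.append((order[lo], left))
            lo -= 1
        else:
            result.append((order[hi], right))
            hi += 1
    return result


scores = [1.6, 2.0, 1.6, 1.0, 4.0]  # illustrative values, not from the paper
order, rank = build_index(scores)
nearest = top_k(0, 2, scores, order, rank)
```

Here `nearest` lists node 2 first (same score as node 0) and node 1 second. The sort is paid once during precomputation. Each query then touches at most k + 1 positions of `order`.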
Stats
No explicit numerical data or statistics are quoted in support of the key claims. The focus is on the theoretical properties of the proposed ForestSim metric and the algorithmic details of the top-k similarity search.