Core Concepts
GVE-Louvain is a highly optimized parallel implementation of the Louvain algorithm for community detection in large graphs. On average it runs 50x faster than Vite, 22x faster than Grappolo, and 20x faster than NetworKit Louvain.
Abstract
GVE-Louvain is a parallel implementation of the Louvain algorithm for efficient community detection in large graphs. Key highlights:
GVE-Louvain employs several optimizations to improve performance, including:
Asynchronous computation
Parallel prefix sum and preallocated Compressed Sparse Row (CSR) data structures for identifying community vertices and storing the super-vertex graph
Fast collision-free per-thread hash tables for the local-moving and aggregation phases
Aggregation tolerance to avoid unnecessary aggregation phases
Established techniques like OpenMP's dynamic loop schedule, limiting iterations per pass, threshold-scaling optimization, and vertex pruning
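To make the prefix-sum and CSR item above concrete, here is a minimal serial sketch of grouping vertices by community into CSR-style arrays: count the vertices per community, take a prefix sum to get each community's start offset, then scatter the vertex ids into a single preallocated array. The function name `community_vertices_csr`, the `membership` input, and the serial Python form are assumptions made for illustration; the source describes a parallel OpenMP implementation and gives no code.

```python
from itertools import accumulate


def community_vertices_csr(membership, num_communities):
    """Group vertex ids by community into CSR-style (offsets, vertices) arrays.

    membership[v] is the community id of vertex v (hypothetical input layout).
    """
    # Pass 1: count how many vertices belong to each community.
    counts = [0] * num_communities
    for c in membership:
        counts[c] += 1

    # Prefix sum over the counts: offsets[c] is where community c starts,
    # and offsets[c + 1] is where it ends.
    offsets = [0] + list(accumulate(counts))

    # Pass 2: scatter each vertex into the next free slot of its community.
    cursor = offsets[:-1].copy()
    vertices = [0] * len(membership)  # preallocated once, no per-community lists
    for v, c in enumerate(membership):
        vertices[cursor[c]] = v
        cursor[c] += 1
    return offsets, vertices


membership = [1, 0, 1, 2, 0]
offsets, vertices = community_vertices_csr(membership, 3)
# The members of community c are vertices[offsets[c]:offsets[c + 1]].
```

Because the output arrays are sized up front from the counts, no per-community container has to grow during the scatter pass, which is presumably the benefit of preallocation in a parallel setting.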
Evaluation on a server with dual 16-core Intel Xeon Gold 6226R processors shows that GVE-Louvain outperforms Vite, Grappolo, and NetworKit Louvain by 50x, 22x, and 20x on average, respectively, and achieves a processing rate of 560 million edges/s on a 3.8 billion edge graph.
GVE-Louvain's performance also improves by 1.6x on average for every doubling of threads, demonstrating its scalability on shared memory systems.
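As a rough reading of these figures, the sketch below works out what they imply. It assumes the 1.6x rate holds uniformly across every doubling from 1 to 32 threads (the source reports it only as an average) and takes the quoted edge count and processing rate at face value. The helper names are hypothetical.

```python
import math


def projected_speedup(threads, per_doubling=1.6):
    """Speedup over one thread if every doubling of threads gives `per_doubling`."""
    return per_doubling ** math.log2(threads)


def runtime_seconds(num_edges, edges_per_second):
    """Time to process a graph at a given edge-processing rate."""
    return num_edges / edges_per_second


# Dual 16-core processors: 32 cores, i.e. five doublings from a single thread.
speedup_32 = projected_speedup(32)           # 1.6 ** 5, about 10.5x
efficiency_32 = speedup_32 / 32              # about 0.33 of ideal linear scaling
largest_graph = runtime_seconds(3.8e9, 560e6)  # about 6.8 seconds
```

Under these assumptions, 32 threads would give roughly a 10.5x speedup over one thread, and the 3.8 billion edge graph would take about 6.8 seconds to process.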
Stats
The total number of vertices (|V|) ranges from 3.07 million to 214 million, and the total number of edges (|E|) ranges from 25.4 million to 3.8 billion across the evaluated graphs.
Quotes
"GVE-Louvain outperforms Vite, Grappolo, and NetworKit Louvain by 50×, 22×, and 20× respectively - achieving a processing rate of 560M edges/s on a 3.8B edge graph."
"GVE-Louvain improves performance at an average rate of 1.6× for every doubling of threads."