Core Concepts
Optimizing an OpenMP implementation of the Floyd-Warshall algorithm for computing all-pairs shortest paths on x86 architectures leads to significant performance improvements.
Abstract
The paper discusses optimizing the Floyd-Warshall (FW) algorithm for computing all-pairs shortest paths on x86 architectures using OpenMP. It describes how code originally developed for Xeon Phi KNL processors was adapted to run on Intel x86 processors. The study covers several optimizations, performance analyses on different Intel servers, and a new proposal to increase concurrency in the parallel algorithm. It also presents experimental results and comparisons, and outlines future work.
Structure:
Abstract
Introduction
Background
  Intel Xeon Phi
  Intel Xeon and Core
  FW Algorithm
  Base code
Implementation
  Code adaptation to x86 architectures
  Opt-9: Intra-round concurrency
Experimental Results
  Experimental Design
  Experimental results of x86 adaptation
  Experimental results of Opt-9
Conclusions and Future Work
  Conclusions
  Future Work
References
Stats
The FW algorithm requires O(n^3) operations and O(n^2) memory space.
The new optimization proposal (Opt-9, intra-round concurrency) improved performance by up to 23%.
Opt-3 provided the greatest performance improvement.
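To make the O(n^3) operation count and O(n^2) memory figure concrete, here is a minimal sketch of the textbook FW algorithm with a simple OpenMP loop. It is not the paper's optimized code: the function name floyd_warshall, the FW_INF sentinel, and the row-major float layout are all assumptions made for illustration. It also assumes a zero diagonal and no negative cycles, so row k and column k are never rewritten during round k.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel for "no edge" (an assumption, not from the paper).
 * It is small enough that FW_INF + FW_INF does not overflow a float. */
#define FW_INF 1.0e30f

/* Hypothetical helper: in-place Floyd-Warshall on a row-major n x n matrix.
 * dist[i*n + j] is the weight of edge i -> j, FW_INF if absent, 0 on the
 * diagonal. Memory is the single n x n matrix: O(n^2). */
void floyd_warshall(float *dist, int n)
{
    assert(dist != NULL && n > 0);

    /* The k loop must stay sequential: round k uses the results of round k-1. */
    for (int k = 0; k < n; k++) {
        /* Within a round, rows can be updated independently. Compiled
         * without OpenMP support, the pragma is ignored and the loop runs
         * serially with the same result. */
        #pragma omp parallel for schedule(static)
        for (int i = 0; i < n; i++) {
            const float dik = dist[(size_t)i * n + k];
            for (int j = 0; j < n; j++) {
                /* Three nested loops of length n: O(n^3) updates in total. */
                const float via_k = dik + dist[(size_t)k * n + j];
                if (via_k < dist[(size_t)i * n + j])
                    dist[(size_t)i * n + j] = via_k;
            }
        }
    }
}
```

Calling floyd_warshall(d, 4) on a 4 x 4 matrix with edges 0->1 (5), 0->3 (10), 1->2 (3) and 2->3 (1) shortens the 0->3 distance from 10 to 9 via nodes 1 and 2. Entries for unreachable pairs keep the value FW_INF.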
Quotes
"All optimizations were beneficial on the two x86 platforms selected."
"Performance improves as N increases, given the higher ratio of compute versus synchronization."
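The second quote can be illustrated with a rough cost model. The sketch below is not taken from the paper. It assumes a straightforward parallelization in which each of the N outer iterations ends in one barrier, and the function name updates_per_barrier is hypothetical. Under that assumption, the work between two synchronization points grows as N^2.

```c
#include <assert.h>

/* Hypothetical cost model (an assumption, not the paper's analysis):
 * N^3 distance updates in total, and one barrier per outer iteration. */
double updates_per_barrier(long n)
{
    assert(n > 0);
    const double total_updates = (double)n * (double)n * (double)n;
    const double barriers = (double)n;
    return total_updates / barriers; /* equals n^2 */
}
```

In this model, doubling N quadruples the compute between barriers, while the cost of a single barrier does not depend on N. The blocked scheme studied in the paper may synchronize differently, so the numbers are only indicative.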