Core Concepts
Matrix operations can be transformed into equivalent graph representations, enabling domain experts to implement various types of matrix computations using a unified graph programming interface. This graph engine-based scientific computing paradigm achieves performance comparable to the best-performing implementations while greatly simplifying the development of scientific computations on large-scale platforms.
Abstract
The paper introduces G4S (Graph for Science), a programming paradigm that simplifies the development of high-performance scientific computing routines by transforming matrix operations into equivalent graph representations.
Key highlights:
- Matrix operations, which represent the dominant cost in many scientific application domains, typically require extensive effort from HPC specialists to support high-performance execution on large-scale computing platforms.
- The G4S paradigm observes that matrix computations can be transformed into equivalent graph representations, and by utilizing graph processing engines, HPC experts can be freed from the burden of implementing efficient scientific computations.
- G4S provides a unified graph programming interface, enabling domain experts to quickly implement various types of matrix computations. The underlying graph processing engine handles efficient execution, eliminating the need for HPC expertise.
- G4S-based implementations achieve performance comparable to the best-performing alternatives, whether built on existing parallel computing libraries or written as bespoke implementations, while greatly simplifying the development of scientific computations on large-scale platforms.
- The paper presents the design of M2G, a tool that automatically transforms matrix operations into graph operations, and a code mapping mechanism that determines the optimal strategies for running those graph operations on the underlying graph engines.
- Experimental results demonstrate the effectiveness of the G4S paradigm in implementing a diverse set of matrix operations and real-world scientific computing routines from the domains of geodynamics, molecular dynamics, and chemical kinetics.
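The idea that a matrix operation has an equivalent graph form can be illustrated with a minimal sketch. It assumes the common mapping in which each nonzero entry A[i][j] becomes a weighted edge from vertex j to vertex i, so that the product y = Ax is computed by sending a message along every edge. The function names matrix_to_graph and graph_matvec are hypothetical and chosen for illustration only; they are not the M2G or G4S interface, and the summary does not specify the mapping the paper actually uses.

```python
def matrix_to_graph(matrix):
    """Turn a dense matrix into an edge list (hypothetical helper).

    Each nonzero entry A[i][j] becomes an edge (src=j, dst=i, weight=A[i][j]).
    """
    edges = []
    for i, row in enumerate(matrix):
        for j, a_ij in enumerate(row):
            if a_ij != 0:
                edges.append((j, i, a_ij))
    return edges


def graph_matvec(edges, x, num_rows):
    """Compute y = A x by traversing edges (hypothetical helper).

    Every edge j -> i delivers the message weight * x[j] to vertex i,
    and vertex i sums the messages it receives.
    """
    y = [0.0] * num_rows
    for src, dst, weight in edges:
        y[dst] += weight * x[src]
    return y


A = [[2, 0, 1],
     [0, 3, 0],
     [4, 0, 5]]
x = [1.0, 2.0, 3.0]

edges = matrix_to_graph(A)
y = graph_matvec(edges, x, len(A))

# Reference result from the ordinary row-by-column definition.
expected = [sum(a * b for a, b in zip(row, x)) for row in A]
print(y, expected)
```

Zero entries produce no edges, so the traversal does work proportional to the number of nonzeros. A real graph engine would distribute and parallelize the per-edge loop, which this sequential sketch does not attempt.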
Stats
Matrix multiplication and matrix addition account for more than 90% of the total execution time of LINPACK.
The performance gap between optimal and arbitrary implementations of matrix operations on large-scale computing platforms can exceed 100x.
Quotes
"Matrix operations represent the dominant cost of many scientific application domains, because these routines typically need to solve numerical integral equations, differential equations, etc."
"To achieve the optimized performance, the users have to heavily rely on their experience to i) select the use of the library that can best realize the characteristics of the matrix operations in question, ii) calls the intricate APIs provided by the library, and iii) on a case-by-case basis, optimize the implementation of scientific computing routine on large-scale computing platform with heterogeneous processing units."