Core Concepts
The ExBLAS approach can provide reproducible and accurate results for the pipelined Bi-Conjugate Gradient Stabilized (p-BiCGStab) method, avoiding the need for residual replacement techniques.
Abstract
This paper explores the use of the ExBLAS approach to ensure numerical reliability and accuracy in the pipelined Bi-Conjugate Gradient Stabilized (p-BiCGStab) method, as an alternative to the residual replacement technique. The key highlights are:
The BiCGStab and p-BiCGStab methods are introduced, with the latter optimizing for parallel performance by reducing communication bottlenecks. Although the two methods are mathematically equivalent, they can produce divergent numerical results because floating-point operations are not associative.
To stabilize the deviation in p-BiCGStab, the residual replacement technique was previously proposed. This paper instead explores the use of the ExBLAS approach, which combines long accumulators and floating-point expansions to provide reproducible and accurate results.
Numerical experiments are conducted on a set of sparse matrices from the SuiteSparse Matrix Collection. The results show that the p-BiCGStabExBLAS method consistently outperforms the conventional p-BiCGStab in terms of convergence rates and numerical reliability, especially at the tighter tolerance (10^-9).
The ExBLAS implementation exhibits stable performance across different numbers of processes, unlike the residual replacement technique, which can be sensitive to parameter choices and problem context.
The overhead of the p-BiCGStabExBLAS method diminishes as the number of processes increases, demonstrating its scalability and potential for efficient parallel implementation.
Overall, this study highlights the benefits of the ExBLAS approach in providing a reliable and accurate alternative to the residual replacement technique for the pipelined BiCGStab method, without sacrificing its parallel performance advantages.
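The non-associativity mentioned above, and the idea behind floating-point expansions, can be illustrated with a minimal Python sketch. This is not the ExBLAS API or the authors' implementation: ExBLAS combines long accumulators with expansions, while this example shows only an expansion built from Knuth's TwoSum error-free transformation. The function names naive_sum, two_sum, and expansion_sum are hypothetical and chosen for illustration.

```python
import math


def naive_sum(values):
    """Plain left-to-right summation; the result depends on the order."""
    total = 0.0
    for x in values:
        total += x
    return total


def two_sum(a, b):
    """Knuth's TwoSum: returns (s, e) with s = fl(a + b) and a + b == s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def expansion_sum(values):
    """Sum values while keeping every rounding error in a list of partials.

    The partials together represent the exact sum (a floating-point
    expansion), so the final rounded result does not depend on the
    order in which the inputs arrive. Illustrative only; assumes no
    overflow and IEEE 754 double precision.
    """
    partials = []
    for x in values:
        kept = []
        for p in partials:
            x, err = two_sum(x, p)
            if err != 0.0:
                kept.append(err)  # keep the rounding error instead of losing it
        kept.append(x)
        partials = kept
    # math.fsum rounds the exact sum of the partials once, at the very end.
    return math.fsum(partials)


if __name__ == "__main__":
    a = [1e16, 1.0, -1e16, 1.0]
    b = [1.0, 1.0, 1e16, -1e16]  # same numbers, different order
    print(naive_sum(a), naive_sum(b))          # the two orders disagree
    print(expansion_sum(a), expansion_sum(b))  # the two orders agree
```

In a parallel solver, the order in which partial dot products are combined changes with the number of processes, which is why an order-independent accumulation of this kind yields the same result on any process count.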
Stats
The paper reports the number of iterations required for the BiCGStab, p-BiCGStab, p-BiCGStabExBLAS, and p-BiCGStabRR methods to reach convergence thresholds of 10^-6 and 10^-9 on various sparse matrices from the SuiteSparse Matrix Collection.
Quotes
"The pipelined BiCGStab method with ExBLAS consistently outperforms the regular pipelined variant in terms of iterations across a wide range of scenarios."
"Increasing the number of processes does not lead to a faster solution for pipelined BiCGStab method. The pipelined BiCGStabRR as indicated in Table 1 demonstrates its best outcome of 195 iterations for the bcsstk13 matrix."
"The overhead associated with p-BiCGStabExBLAS diminishes as the number of processes increases, dropping from 2.6x on a single process to 1.87x on 16 processes."