Core Concepts
The article empirically evaluates the tradeoffs between low-level and high-level differentiation approaches for linear solvers. It shows that high-level differentiation is generally preferable, but that low-level differentiation can be effective for certain solvers.
Abstract
The article examines the tradeoffs between low-level and high-level differentiation strategies when applying automatic differentiation (AD) to computer programs containing calls to linear solvers.
The key highlights are:
Previous publications have advised against differentiating through the low-level solver implementation, and have instead advocated high-level approaches that express the derivative as the solution of a modified linear system. However, the accuracy of the two approaches had not been compared empirically.
The authors implemented low-level differentiation by applying the Tapenade AD tool to the SPARSKIT implementations of the GMRES, TFQMR, and BiCGStab solvers. They also implemented high-level differentiation at the matrix calculus level.
Experiments were conducted on 65 matrices from the SuiteSparse collection, comparing the performance of the original, undifferentiated solvers with the low-level and high-level differentiation strategies.
The results show that high-level differentiation generally performs nearly as well as the original solver, but there are typically a few problems that require more iterations to achieve similar levels of accuracy.
The effectiveness of low-level differentiation is highly solver-dependent. For TFQMR and restarted GMRES, the low-level differentiation strategy is nearly as effective as high-level differentiation. However, for BiCGStab, there is a significant gap in performance between high-level and low-level differentiation.
The authors conclude that the common advice to use high-level differentiation is justified, but a careful solver choice may lead to useful gradients even with low-level approaches in certain situations.
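The two strategies contrasted above can be illustrated with a minimal sketch. For a system A(u) x = b(u), differentiating both sides with respect to u gives A ∂x/∂u = ∂b/∂u − (∂A/∂u) x, so the high-level approach solves this second linear system with the unmodified solver, while the low-level approach propagates derivatives through every statement of the solver loop. The code below is an assumption-laden toy: it uses plain Jacobi iteration as a stand-in for the Krylov solvers in the article, the tangent code is written by hand rather than generated by Tapenade, and the 2×2 test problem and all function names are hypothetical.

```python
# Illustrative sketch only: Jacobi iteration stands in for GMRES/TFQMR/BiCGStab,
# and the tangent (forward-mode) code is hand-written, not produced by an AD tool.

def matvec(M, v):
    """Dense matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def jacobi(A, b, iters=50):
    """Solve A x = b by Jacobi iteration (A assumed diagonally dominant)."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

def low_level(A, dA, b, db, iters=50):
    """Low-level differentiation: carry a tangent dx through each iteration."""
    n = len(b)
    x, dx = [0.0] * n, [0.0] * n
    for _ in range(iters):
        new_x, new_dx = [], []
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            ds = sum(dA[i][j] * x[j] + A[i][j] * dx[j]
                     for j in range(n) if j != i)
            num, dnum = b[i] - s, db[i] - ds
            new_x.append(num / A[i][i])
            # Quotient rule applied to num / A[i][i].
            new_dx.append((dnum * A[i][i] - num * dA[i][i]) / A[i][i] ** 2)
        x, dx = new_x, new_dx
    return x, dx

def high_level(A, dA, b, db, iters=50):
    """High-level differentiation: solve A dx = db - dA x with the same solver."""
    x = jacobi(A, b, iters)
    dAx = matvec(dA, x)
    rhs = [db_i - dAx_i for db_i, dAx_i in zip(db, dAx)]
    dx = jacobi(A, rhs, iters)
    return x, dx

# Hypothetical test problem at u = 1: A(u) = [[4 + u, 1], [2, 5]], b(u) = [1, 2u].
A = [[5.0, 1.0], [2.0, 5.0]]
dA = [[1.0, 0.0], [0.0, 0.0]]
b = [1.0, 2.0]
db = [0.0, 2.0]

x_lo, dx_lo = low_level(A, dA, b, db)
x_hi, dx_hi = high_level(A, dA, b, db)
print("low-level  dx/du:", dx_lo)
print("high-level dx/du:", dx_hi)
```

On this small, well-conditioned problem both strategies converge to the same derivative. That agreement says nothing about the Krylov solvers studied in the article, where the gap between the two strategies depends on the solver.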
Stats
Accuracy thresholds: the L2 norm of the difference between the computed x (or ∂x/∂u) and the reference value is required to be less than 10^-2 or 10^-4.
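The accuracy measure above can be sketched in a few lines. The helper names `l2_error` and `meets_tolerance` are hypothetical, and the sketch assumes an absolute (not relative) error norm, which the summary does not specify.

```python
import math

def l2_error(computed, reference):
    """L2 norm of the difference between a computed and a reference vector."""
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(computed, reference)))

def meets_tolerance(computed, reference, tol):
    """True if the absolute L2 error is below tol (assumed absolute, see lead-in)."""
    return l2_error(computed, reference) < tol

# Made-up vectors, for illustration only.
computed = [1.0, 2.001]
reference = [1.0, 2.0]
loose_ok = meets_tolerance(computed, reference, 1e-2)
tight_ok = meets_tolerance(computed, reference, 1e-4)
print(loose_ok, tight_ok)
```

The same check applies whether the vectors hold the solution x or the derivative ∂x/∂u.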
Quotes
"Despite this ubiquitous advice, we are not aware of prior work comparing the accuracy of both approaches."
"We demonstrate with this article that the common advice is justified, and that high-level differentiation is indeed usually preferable to low-level differentiation."