Optimizing GEMM Acceleration on Leading AI-Optimized FPGAs: Versal ACAP and Stratix 10 NX
This work presents novel, systematic frameworks for optimizing the performance of General Matrix Multiplication (GEMM), a fundamental operation in deep learning workloads, by exploiting the distinct architectural characteristics of the Versal ACAP and Stratix 10 NX FPGA platforms.