This article surveys key innovations in FPGA architecture for accelerating deep learning (DL) inference. It first introduces FPGA architecture and highlights the strengths that make FPGAs well suited to DL inference, such as fine-grained programmability, spatial computing, and flexible I/Os.
The article then discusses different styles of DL inference accelerators on FPGAs, ranging from model-specific dataflow architectures to software-programmable overlay designs. It showcases examples of these accelerator designs and how they leverage the underlying FPGA architecture to achieve state-of-the-art performance.
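To make the contrast between the two styles concrete, the sketch below (illustrative only, not taken from the article; all names are hypothetical) shows the overlay idea in miniature: a fixed compute engine, standing in for circuitry synthesized once onto the FPGA, executes a small per-model instruction stream, so switching models means changing the program rather than the bitstream. A dataflow accelerator would instead synthesize a dedicated pipeline for one specific model.

```python
# Minimal sketch of a software-programmable overlay (hypothetical names):
# a fixed "engine" executes a per-model instruction stream, so a new
# model needs a new program, not a new FPGA bitstream.
import numpy as np

def run_overlay(program, buffers):
    """Execute a list of (opcode, *args) instructions on named buffers."""
    for op, *args in program:
        if op == "MATVEC":            # dst = W @ src (the engine's fixed datapath)
            dst, w, src = args
            buffers[dst] = buffers[w] @ buffers[src]
        elif op == "ADD_BIAS":        # dst = dst + bias
            dst, bias = args
            buffers[dst] = buffers[dst] + buffers[bias]
        elif op == "RELU":            # dst = max(dst, 0)
            (dst,) = args
            buffers[dst] = np.maximum(buffers[dst], 0)
    return buffers

# A two-layer MLP expressed as an instruction stream for the engine.
rng = np.random.default_rng(0)
bufs = {
    "x":  rng.standard_normal(8),
    "W1": rng.standard_normal((16, 8)), "b1": rng.standard_normal(16),
    "W2": rng.standard_normal((4, 16)), "b2": rng.standard_normal(4),
}
mlp = [
    ("MATVEC", "h", "W1", "x"), ("ADD_BIAS", "h", "b1"), ("RELU", "h"),
    ("MATVEC", "y", "W2", "h"), ("ADD_BIAS", "y", "b2"),
]
print(run_overlay(mlp, bufs)["y"])
```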
Next, the article delves into the specific FPGA architecture enhancements proposed to better support DL workloads. These include optimizations to the logic blocks, arithmetic circuitry, and on-chip memories, as well as the integration of new DL-specialized hardware blocks into the FPGA fabric. The article also covers emerging hybrid devices that combine FPGA-like reconfigurable fabrics with coarse-grained DL accelerator cores.
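As one concrete flavor of the arithmetic enhancements, a common low-precision trick is packing several narrow multiplications into one wide hard multiplier. The sketch below is a simplified, unsigned-only illustration of the idea (not any specific vendor's DSP block; signed operands would need additional correction terms): two 8-bit products are recovered from a single wide multiply because they land in disjoint bit fields.

```python
# Simplified illustration (hypothetical, unsigned-only) of packing two
# 8-bit x 8-bit multiplies into one wide hardware multiplier -- the kind
# of low-precision reuse that DSP-block enhancements aim to support natively.
def packed_dual_mult(a0: int, a1: int, b: int) -> tuple[int, int]:
    assert 0 <= a0 < 256 and 0 <= a1 < 256 and 0 <= b < 256
    # Place a1 sixteen bits above a0: each partial product fits in 16 bits
    # (255 * 255 < 2**16), so the two results occupy disjoint bit fields.
    packed = (a1 << 16) | a0
    wide = packed * b                  # one wide multiply, e.g. 24x8 -> 32 bits
    return wide & 0xFFFF, wide >> 16   # (a0*b, a1*b) extracted by bit slicing

assert packed_dual_mult(200, 123, 77) == (200 * 77, 123 * 77)
```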
Finally, the article highlights promising future research directions in the area of reconfigurable computing for DL, such as exploiting processing-in-memory capabilities of on-chip memories and exploring novel reconfigurable architectures that combine the strengths of FPGAs and specialized DL accelerators.