Efficient FPGA Accelerator for Lightweight Convolutional Neural Networks with Balanced Dataflow
A novel streaming architecture with hybrid computing engines and a balanced dataflow strategy is proposed to accelerate lightweight convolutional neural networks. The design minimizes on-chip memory overhead and off-chip memory access while improving computational efficiency.