A New General Tensor Accelerator with Improved Area Efficiency and Data Reuse
A new General Tensor Accelerator (GTA) architecture combines a systolic array with vector processing units to efficiently process tensor operators of arbitrary computational workload and precision.
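To make the division of labor between the two kinds of units concrete, the following is a minimal software sketch, not the GTA design itself. It assumes an output-stationary dataflow for the systolic array and a bias-plus-ReLU step for the vector unit; the function names `systolic_matmul` and `vector_bias_relu`, the dataflow choice, and the example operator are all illustrative assumptions that do not come from the source.

```python
# Illustrative sketch only: a cycle-level model of an output-stationary
# systolic array followed by a vector-unit style elementwise step.
# All names and the dataflow choice are assumptions, not the GTA design.

def systolic_matmul(a, b):
    """Compute a @ b the way an m x n grid of processing elements would.

    PE(i, j) holds the accumulator for output element (i, j). Row i of `a`
    enters from the left delayed by i cycles, and column j of `b` enters
    from the top delayed by j cycles, so at cycle t PE(i, j) sees the
    operand pair with inner index k = t - i - j.
    """
    m, k_dim = len(a), len(a[0])
    n = len(b[0])
    assert len(b) == k_dim, "inner dimensions must match"

    acc = [[0] * n for _ in range(m)]
    cycles = m + n + k_dim - 2  # cycles until the last PE finishes
    for t in range(cycles):
        for i in range(m):
            for j in range(n):
                k = t - i - j
                if 0 <= k < k_dim:  # PE is idle while operands are in flight
                    acc[i][j] += a[i][k] * b[k][j]
    return acc, cycles


def vector_bias_relu(matrix, bias):
    """Elementwise work of the kind a vector unit handles: add bias, then ReLU."""
    return [[max(0, x + bias[j]) for j, x in enumerate(row)] for row in matrix]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    product, cycles = systolic_matmul(a, b)
    print(product, cycles)
    print(vector_bias_relu(product, [-20, -20]))
```

In this sketch the dense multiply-accumulate work maps onto the array, while the irregular elementwise step is left to the vector path. Whether GTA partitions work this way is not stated in the source.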