Enhancing GEMM Acceleration on a Loosely-Coupled Multi-core Processor with Tile-based Instruction Set and Predictive Address Translation
MACO is a novel loosely-coupled, multi-core, general-purpose architecture that integrates multiple CPU cores paired with GEMM acceleration engines (MMAEs). It employs a tile-based instruction set and predictive address translation to improve flexibility, programmability, and computational efficiency for GEMM-related applications.
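To make the notion of tile-based GEMM concrete, the following is a minimal software sketch of a tiled matrix multiplication. It is an illustrative analogy only: the function name `tiled_gemm`, the `tile` parameter, and the loop order are assumptions made here, and nothing in it represents MACO's actual instruction set, hardware tile sizes, or address-translation mechanism.

```python
def tiled_gemm(a, b, tile=2):
    """Compute C = A @ B by accumulating tile-by-tile partial products.

    Hypothetical software analogy: each innermost block update stands in
    for one tile-level operation that an accelerator might execute.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):          # tile row of C
        for j0 in range(0, n, tile):      # tile column of C
            for p0 in range(0, k, tile):  # tile along the reduction dimension
                # C[i0.., j0..] += A[i0.., p0..] * B[p0.., j0..]
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
result = tiled_gemm(a, b)
print(result)  # [[58, 64], [139, 154]]
```

The result does not depend on the tile size; tiling only changes the order in which partial products are accumulated, which is what lets a tile-granular operation be dispatched as a single unit of work.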