Core Concepts
Monad is a cost-aware specialization approach for chiplet-based spatial accelerators that explores the tradeoffs among performance, power, and fabrication cost.
Abstract
The paper proposes Monad, a cost-aware specialization approach for chiplet-based spatial accelerators. It introduces a modeling framework that accounts for the non-uniformity in dataflow, pipelining, and communication that arises when multiple tensor workloads execute on different chiplets. The paper also couples the architecture and integration design spaces by encoding the design aspects of both spaces uniformly and exploring them with a systematic ML-based approach.
The key highlights are:
The paper presents a cost-aware design approach to make comprehensive tradeoffs for a chiplet-based accelerator, considering performance, power, and fabrication cost.
It proposes a modeling framework to evaluate a chiplet system with specialized architectures and interconnects, capturing the non-uniformity in dataflow, pipelining, and communication.
An ML-based co-optimization framework is developed to couple the architecture and integration design space, enabling joint exploration.
Experiments demonstrate an average of 16% and 30% energy-delay-product (EDP) reduction compared to the state-of-the-art chiplet-based accelerators, Simba and NN-Baton, respectively.
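To make the idea of a joint architecture/integration search concrete, here is a minimal sketch in Python. It is illustrative only: the design knobs, cost model constants, and the use of plain random search (rather than the paper's actual ML-based optimizer) are all assumptions, chosen to show how a single unified encoding lets one loop explore both design spaces at once.

```python
import random
from dataclasses import dataclass

# Hypothetical unified design point: architecture knobs (PEs per chiplet)
# and integration knobs (chiplet count, link width) share one encoding,
# so a single search explores both spaces jointly.
@dataclass(frozen=True)
class DesignPoint:
    pes_per_chiplet: int   # architecture space
    num_chiplets: int      # integration space
    link_width_bits: int   # integration space

def evaluate(d: DesignPoint) -> float:
    """Toy cost-aware objective: EDP plus a fabrication-cost penalty.
    All constants are illustrative, not taken from the paper."""
    total_pes = d.pes_per_chiplet * d.num_chiplets
    latency = 1e6 / total_pes + 1e3 / d.link_width_bits        # compute + communication
    energy = total_pes * 0.5 + d.num_chiplets * d.link_width_bits * 0.01
    fab_cost = d.num_chiplets * (d.pes_per_chiplet ** 1.5) * 1e-3  # smaller dies cost less
    return latency * energy + fab_cost

def joint_search(iters: int = 500, seed: int = 0) -> DesignPoint:
    """Stand-in for the ML-based optimizer: sample the joint
    (architecture x integration) space and keep the best point."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(iters):
        d = DesignPoint(
            pes_per_chiplet=rng.choice([64, 128, 256, 512]),
            num_chiplets=rng.choice([1, 2, 4, 8, 16]),
            link_width_bits=rng.choice([32, 64, 128]),
        )
        cost = evaluate(d)
        if cost < best_cost:
            best, best_cost = d, cost
    return best
```

Because the objective blends latency, energy, and fabrication cost into one scalar, the same loop trades them off directly, which is the essence of the cost-aware co-optimization the paper describes.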
Stats
The paper reports the following key metrics:
16% average EDP reduction compared to Simba
30% average EDP reduction compared to NN-Baton
8% average energy reduction compared to Simba
20.8% average energy reduction compared to NN-Baton
24% less latency or 16% less energy compared to the best of separate architecture or integration optimization
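Since the stats above mix EDP, energy, and latency reductions, a quick worked example clarifies how the metric composes. The numbers below are hypothetical, not values from the paper; EDP itself is simply energy multiplied by delay, so separate reductions in each factor multiply:

```python
# EDP = energy x delay. Hypothetical baseline and optimized values,
# showing how separate energy and delay reductions combine.
def edp(energy_j: float, delay_s: float) -> float:
    return energy_j * delay_s

baseline = edp(energy_j=10.0, delay_s=2.0)
optimized = edp(energy_j=10.0 * 0.92, delay_s=2.0 * 0.90)  # 8% less energy, 10% less delay
reduction = 1 - optimized / baseline
print(f"EDP reduction: {reduction:.1%}")  # -> EDP reduction: 17.2%
```

This is why an EDP reduction (e.g. the 16% vs. Simba) can exceed the corresponding energy reduction alone (8%): the latency improvement contributes multiplicatively.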
Quotes
"We achieve an average of 8% and 20.8% energy reduction compared with Simba [25] and NN-Baton [28], respectively."
"We achieve 24% less latency or 16% less energy compared to the best of separate architecture or integration optimization."