This work presents a novel non-volatile spin-transfer-torque (STT) assisted spin-orbit-torque (SOT) based ternary content addressable memory (TCAM) with 5 transistors and 2 magnetic tunnel junctions (MTJs) for hardware accelerators.
Existing neural recording systems struggle to handle the exponential growth in neural data due to power and storage constraints. This work proposes co-designing accelerators and storage, with swapping as a primary design goal, to overcome these limitations.
HURRY, a reconfigurable and multifunctional ReRAM-based in-situ accelerator, enhances spatial and temporal utilization of ReRAM arrays to achieve significant performance, energy, and area efficiency improvements over existing ReRAM-based accelerators.
Omni 3D is a 3D-stacked device architecture that efficiently integrates power, signal, and clock routing with BEOL-compatible transistors, enabling improved energy efficiency and area utilization compared to state-of-the-art complementary FET (CFET) designs.
T10 is a deep learning compiler that efficiently utilizes the distributed on-chip memory and high inter-core communication bandwidth of emerging intelligence processors to scale deep learning computation.
The Bicameral Cache is a novel cache design that segregates scalar and vector data references to optimize performance on vector architectures by preserving the spatial locality of vector data and avoiding interference between scalar and vector accesses.
Cambricon-LLM, a novel chiplet-based hybrid architecture, enables efficient on-device inference of large language models up to 70 billion parameters by combining a neural processing unit (NPU) and a dedicated NAND flash chip with on-die processing capabilities.
Piper, a hardware accelerator, can significantly improve the performance and efficiency of tabular data preprocessing pipelines for machine learning, achieving up to 71.3x speedup over optimized CPU baselines and up to 20.3x over GPUs.
MICSim is an open-source, modular simulator that enables early-stage evaluation of software performance and hardware overhead for mixed-signal compute-in-memory (CIM) accelerators targeting both CNNs and Transformers.