This work presents a highly efficient hardware implementation of boosted decision trees for regression on field-programmable gate arrays (FPGAs) that executes in under 10 nanoseconds, enabling real-time processing of missing transverse momentum at the Large Hadron Collider.
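The key property that makes boosted trees attractive for sub-10 ns inference is their fixed, data-independent depth: every input takes exactly the same number of comparisons, and all trees can evaluate in parallel. The sketch below illustrates this in software with hypothetical trees and thresholds (not taken from the paper); the constant-depth loop corresponds to a fixed number of comparator levels in hardware.

```python
# Sketch: fixed-depth evaluation of a boosted-decision-tree ensemble,
# illustrating why such models map well to low-latency FPGA logic.
# Trees, thresholds, and leaf values below are hypothetical.

# Each tree is a complete binary tree of fixed depth, stored flat:
# node i has children 2i+1 / 2i+2; leaves hold regression values.
DEPTH = 2

def eval_tree(features, feat_idx, thresholds, leaves):
    """Traverse exactly DEPTH comparator levels (constant 'latency')."""
    i = 0
    for _ in range(DEPTH):
        go_right = features[feat_idx[i]] > thresholds[i]
        i = 2 * i + (2 if go_right else 1)
    return leaves[i - (2 ** DEPTH - 1)]

def eval_ensemble(features, trees):
    # In hardware all trees evaluate in parallel and an adder tree
    # sums their outputs; here we just sum sequentially.
    return sum(eval_tree(features, *t) for t in trees)

# Two toy trees predicting a scalar (e.g., missing transverse momentum).
trees = [
    ([0, 1, 1], [0.5, 0.2, 0.8], [1.0, 2.0, 3.0, 4.0]),
    ([1, 0, 0], [0.4, 0.1, 0.9], [0.5, 1.5, 2.5, 3.5]),
]
pred = eval_ensemble([0.7, 0.3], trees)  # 3.0 + 1.5 = 4.5
```

Because the traversal has no data-dependent branching in its depth, the hardware pipeline can be fully unrolled, which is what yields the deterministic nanosecond-scale latency.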
The CARAML benchmark suite provides a systematic, automated, and reproducible framework for evaluating the performance and energy consumption of transformer-based language models and computer vision models on a range of hardware accelerators, including NVIDIA, AMD, and Graphcore systems.
This work presents an energy-efficient 3T1R1C capacitive-RRAM content addressable memory (CAM) that uses RRAM as both the storage and the comparison element, eliminating direct-current paths and enabling low-power parallel search operations.
Embedded FPGA technology enables reconfigurable logic within application-specific integrated circuits (ASICs), combining the low power and efficiency of an ASIC with the ease of FPGA configuration; this is beneficial for machine learning applications in the data pipelines of next-generation collider experiments.
This paper proposes a probabilistic interval analysis technique to statically analyze programs that run on unreliable hardware architectures, where operations can fail with a certain probability.
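The idea can be illustrated by pairing classical interval arithmetic with a reliability bound: each abstract value carries an interval plus a lower bound on the probability that the concrete value lies in it, and each unreliable operation discounts that probability. The combination rule below (probabilities simply multiply) is a simplifying assumption for illustration, not necessarily the paper's exact semantics.

```python
# Sketch of probabilistic interval analysis: each abstract value is an
# interval plus a probability bound that it is correct. Hardware
# operations succeed only with probability p_op.

from dataclasses import dataclass

@dataclass
class PInterval:
    lo: float
    hi: float
    p: float  # lower bound on P(concrete value in [lo, hi])

def padd(a, b, p_op):
    """Abstract '+' on hardware where addition succeeds w.p. p_op."""
    return PInterval(a.lo + b.lo, a.hi + b.hi, a.p * b.p * p_op)

def pmul(a, b, p_op):
    """Abstract '*': interval corners, reliability discounted by p_op."""
    cs = [a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi]
    return PInterval(min(cs), max(cs), a.p * b.p * p_op)

x = PInterval(1.0, 2.0, 1.0)
y = PInterval(3.0, 4.0, 1.0)
z = padd(pmul(x, y, 0.999), y, 0.999)  # (x * y) + y on unreliable ALUs
```

A static analyzer built on such a domain can then report, per program point, both a value range and a guaranteed probability that the range holds despite hardware faults.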
MASE, a novel compiler, automatically explores mixed-precision quantization using custom Microscaling (MX) formats to enable efficient dataflow hardware acceleration for large language models (LLMs) with minimal accuracy degradation.
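Microscaling formats amortize quantization metadata by having a block of values share one power-of-two scale, with each element stored as a low-bit integer. The sketch below shows that block structure with an illustrative block size and bit width; it is a simplified stand-in, not MASE's actual MX implementation.

```python
# Sketch of an MX-style block format: one shared power-of-two scale
# per block, low-bit signed integers per element. Block size and bit
# width here are illustrative choices.

import math

def mx_quantize(block, bits=8):
    """Quantize a block to signed ints sharing one power-of-two scale."""
    qmax = 2 ** (bits - 1) - 1                  # e.g. 127 for 8 bits
    amax = max(abs(v) for v in block) or 1.0
    # Smallest power-of-two scale that keeps every element in range.
    scale = 2.0 ** math.ceil(math.log2(amax / qmax))
    ints = [round(v / scale) for v in block]
    return scale, ints

def mx_dequantize(scale, ints):
    return [scale * q for q in ints]

scale, ints = mx_quantize([0.11, -0.37, 0.92, 0.05])
approx = mx_dequantize(scale, ints)
```

Because the scale is a power of two, dequantization in hardware is just a shift, which is what makes such formats attractive for dataflow accelerators.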
This work proposes a hardware-aware training framework that co-optimizes synaptic weights and delays for deploying high-performing spiking neural network models on digital neuromorphic hardware platforms.
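What it means for a delay to be a trainable parameter can be seen in a toy forward pass: each synapse carries both a weight and an integer delay, and choosing the delays shifts when spikes arrive and whether they coincide. The leaky integrate-and-fire neuron below is a generic illustration with made-up values, not the paper's model or training procedure.

```python
# Toy discrete-time LIF neuron with per-synapse integer delays: the
# (weight, delay) pair per synapse is the kind of parameter pair the
# framework co-optimizes. Threshold, leak, and inputs are illustrative.

def lif_with_delays(input_spikes, weights, delays, threshold=1.0, leak=0.9):
    """input_spikes[i][t] is 0/1; synapse i has weights[i], delays[i] steps."""
    T = len(input_spikes[0])
    v, out = 0.0, []
    for t in range(T):
        v *= leak                                # membrane leak
        for i, train in enumerate(input_spikes):
            td = t - delays[i]                   # delayed spike arrival
            if td >= 0 and train[td]:
                v += weights[i]
        if v >= threshold:
            out.append(t)                        # emit output spike
            v = 0.0                              # reset membrane
    return out

# Two inputs spiking at t = 0; equal delays make them arrive together
# at t = 2, pushing the membrane over threshold.
spikes = [[1, 0, 0, 0, 0], [1, 0, 0, 0, 0]]
out = lif_with_delays(spikes, weights=[0.6, 0.6], delays=[2, 2])  # [2]
```

With mismatched delays the same weights would fail to reach threshold, which is why jointly training delays alongside weights can recover accuracy on hardware that supports programmable axonal or synaptic delays.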