The paper presents TLMAC (Table Lookup Multiply-Accumulate), a framework for compiling and optimising quantised neural networks for scalable lookup-based processing on FPGAs. By clustering unique groups of weights, TLMAC enables highly parallel computation while reducing LUT utilisation and routing congestion. This significantly improves scalability over previous lookup-based methods, allowing ImageNet-scale models to be implemented on commercially available FPGAs.
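To make the idea concrete, here is a minimal Python sketch of lookup-based multiply-accumulate, not the paper's exact algorithm: for low-bit activations, each small group of quantised weights is replaced by a table indexed by the activation pattern, and identical weight groups are clustered so they share a single table. All function names here are illustrative assumptions.

```python
from itertools import product

def build_lut(weights, input_bits=1):
    """Precompute the partial dot product for every possible
    activation pattern of this weight group (2^(bits*len) entries)."""
    lut = {}
    for pattern in product(range(2 ** input_bits), repeat=len(weights)):
        lut[pattern] = sum(w * x for w, x in zip(weights, pattern))
    return lut

def tlmac_dot(activations, weight_groups, input_bits=1):
    """Dot product computed as a sum of table lookups, one per group.
    Clustering: identical weight groups reuse one shared table."""
    tables = {}
    for wg in weight_groups:
        key = tuple(wg)
        if key not in tables:          # build each unique table once
            tables[key] = build_lut(key, input_bits)
    group_size = len(weight_groups[0])
    acc = 0
    for i, wg in enumerate(weight_groups):
        chunk = tuple(activations[i * group_size:(i + 1) * group_size])
        acc += tables[tuple(wg)][chunk]  # one lookup replaces the MACs
    return acc

# Two of the three groups are identical, so only two tables are built.
weights = [[1, -1, 1, -1], [1, -1, 1, -1], [2, 0, 1, 1]]
acts = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
result = tlmac_dot(acts, weights)  # equals the ordinary dot product
```

On an FPGA the tables would live in physical LUTs rather than a Python dict; sharing tables across identical weight groups is what reduces LUT count and routing pressure.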
By Daniel Gerli... on arxiv.org, 03-19-2024
https://arxiv.org/pdf/2403.11414.pdf