OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models (ICLR 2024)
Large language models are difficult to deploy because of their heavy computational and memory requirements. OmniQuant addresses this with an efficient quantization technique that maintains accuracy across diverse quantization settings.
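To make the idea concrete, the sketch below shows generic low-bit uniform quantization with a clipping ratio that shrinks the weight range before rounding. This is an illustrative simplification, not OmniQuant's actual implementation: the function name, the fixed `clip_ratio`, and the per-tensor granularity are all assumptions for the example (OmniQuant learns its clipping parameters during calibration).

```python
def quantize_dequantize(weights, n_bits=4, clip_ratio=1.0):
    """Simulate uniform affine quantize-then-dequantize of a weight list.

    clip_ratio < 1.0 narrows the dynamic range before quantizing,
    trading larger clipping error for finer rounding resolution.
    Here it is a fixed constant; OmniQuant instead learns its
    clipping parameters on a small calibration set.
    """
    w_max = max(weights) * clip_ratio
    w_min = min(weights) * clip_ratio
    levels = 2 ** n_bits - 1  # e.g. 15 distinct steps for 4-bit
    scale = (w_max - w_min) / levels
    zero_point = round(-w_min / scale)
    out = []
    for w in weights:
        # Round to the nearest grid point, clamp to the valid range,
        # then map back to floating point to measure the error.
        q = min(max(round(w / scale) + zero_point, 0), levels)
        out.append((q - zero_point) * scale)
    return out

weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
reconstructed = quantize_dequantize(weights, n_bits=4)
```

Within the clipped range, the reconstruction error of each weight is bounded by half the quantization step (`scale / 2`), which is why techniques like OmniQuant focus on choosing the range (via clipping) so that this step stays small for the values that matter.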