The paper proposes QCore, a framework for deploying and continually calibrating quantized classification models on resource-constrained edge devices.
Key highlights:
QCore compresses the full training set into a small subset, also called a QCore, that can effectively calibrate quantized models across different bit-widths. This sharply reduces the data that must be stored on the device compared to keeping the full training set.
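To make the compression idea concrete, here is a minimal sketch of selecting a small, class-balanced calibration subset whose feature statistics approximate the full set. This is an illustrative stand-in, not the paper's actual selection algorithm; the function name and the centroid-distance heuristic are assumptions for the example.

```python
import numpy as np

def select_calibration_subset(features, labels, per_class=2):
    """Illustrative coreset sketch: for each class, keep the samples
    closest to the class mean, so the small subset roughly preserves
    the full set's per-class feature statistics. Not QCore's exact
    selection rule."""
    selected = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = features[idx].mean(axis=0)
        dists = np.linalg.norm(features[idx] - centroid, axis=1)
        selected.extend(idx[np.argsort(dists)[:per_class]].tolist())
    return sorted(selected)

# Toy data: 2 classes, 10 samples each, 4-dim features.
rng = np.random.default_rng(0)
feats = np.concatenate([rng.normal(0, 1, (10, 4)),
                        rng.normal(3, 1, (10, 4))])
labs = np.array([0] * 10 + [1] * 10)
subset = select_calibration_subset(feats, labs, per_class=2)
print(len(subset))  # 4 samples retained instead of 20
```

The retained subset can then be fed through the quantized model to re-estimate activation ranges at whatever bit-width is deployed.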
QCore is designed to be updated with new streaming data, enabling continual calibration that adapts the model to changes in the environment.
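The streaming-update requirement, keeping a small, bounded subset fresh as new data arrives, can be sketched with a reservoir-sampling-style buffer. QCore's actual update rule is more involved; this only illustrates the bounded-memory streaming idea, and the function name is an assumption.

```python
import random

def update_core(core, capacity, new_samples, seen_count):
    """Reservoir-sampling sketch of a bounded calibration buffer:
    each arriving sample replaces a stored one with probability
    capacity / seen_count, so the buffer stays a uniform sample of
    the stream without ever exceeding its size limit. This is an
    illustration, not QCore's exact update rule."""
    for x in new_samples:
        seen_count += 1
        if len(core) < capacity:
            core.append(x)
        else:
            j = random.randrange(seen_count)
            if j < capacity:
                core[j] = x
    return core, seen_count

core, seen = update_core([], capacity=5, new_samples=list(range(100)),
                         seen_count=0)
print(len(core))  # never exceeds 5
```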
The authors introduce a lightweight bit-flipping network that can efficiently update the quantized model parameters without requiring expensive back-propagation. This bit-flipping network is trained alongside the main quantized model.
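The key efficiency point is that a quantized weight can be updated by flipping individual bits of its integer representation rather than by computing full-precision gradients. A minimal sketch, assuming int8 weights and a given flip mask (in QCore the mask would come from the learned bit-flipping network):

```python
import numpy as np

def apply_bit_flips(qweights, flip_mask, bit=0):
    """Sketch: update int8 quantized weights by XOR-flipping the chosen
    bit wherever flip_mask is set, avoiding back-propagation entirely.
    The mask is supplied here for illustration; producing it is the job
    of the bit-flipping network."""
    flipped = qweights.copy()
    flipped[flip_mask] ^= np.int8(1 << bit)
    return flipped

qw = np.array([4, -3, 7], dtype=np.int8)
mask = np.array([True, False, True])
print(apply_bit_flips(qw, mask, bit=0))  # flips the LSB of entries 0 and 2
```

Because each update touches only integer bits, it is far cheaper in compute and memory than a back-propagation step on the device.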
The integration of QCore and the bit-flipping network enables continual calibration of quantized models on edge devices, addressing the limitations of existing approaches that rely on full training data and back-propagation.
Experiments show that the proposed QCore framework can outperform strong baseline methods in a continual learning setting, demonstrating its effectiveness for edge device deployments.