Key Concepts
TabConv significantly reduces the arithmetic operations required during CNN inference while maintaining model performance through a novel table-lookup-based approximation.
Summary
The paper introduces TabConv, a novel approach to accelerate CNN inference by mapping key operations to table lookups. The key steps are:
- Converting convolution operations in the CNN model to matrix multiplications (MMs) using the im2col method (a sketch of im2col follows this list).
- Mapping the resulting MMs to table lookups based on product quantization. This involves splitting the input and weight matrices into subspaces, learning prototypes for each subspace, and precomputing the dot products between prototypes and weights (a sketch of this step also follows the list).
- Employing a priority masking strategy to selectively retain exact computations for critical layers, balancing the trade-off between accuracy and computation.
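
To make the first step concrete, here is a minimal NumPy sketch of im2col for a single input. The function name, the (H, W, C) layout, and the shapes are illustrative assumptions, not the paper's code.

```python
import numpy as np

def im2col(x, kh, kw, stride=1):
    """Unfold an (H, W, C) input into patch rows so that convolution
    becomes a single matrix multiplication."""
    H, W, C = x.shape
    out_h = (H - kh) // stride + 1
    out_w = (W - kw) // stride + 1
    cols = np.empty((out_h * out_w, kh * kw * C))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i*stride:i*stride+kh, j*stride:j*stride+kw, :]
            cols[i * out_w + j] = patch.ravel()
    return cols  # shape: (out_h*out_w, kh*kw*C)

# Convolution as an MM: reshape the filters to (kh*kw*C, num_filters),
# then the product below is the convolution output (one row per position).
x = np.random.randn(32, 32, 3)
W = np.random.randn(3, 3, 3, 16)   # kh, kw, C, num_filters (illustrative)
W_mat = W.reshape(-1, 16)          # (27, 16)
out = im2col(x, 3, 3) @ W_mat      # (30*30, 16)
```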
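
The second step can be sketched in the same spirit. The following is a minimal, illustrative product-quantization pipeline using scikit-learn's KMeans: offline, it splits the input columns into subspaces, learns prototypes per subspace, and precomputes prototype-weight dot products; online, it replaces the MM with encode-then-lookup. The subspace and prototype counts are assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_pq_tables(A_train, W, n_subspaces=4, n_prototypes=16):
    """Offline phase: learn prototypes from calibration inputs A_train and
    precompute dot products between prototypes and the weight matrix W."""
    D = A_train.shape[1]
    sub = D // n_subspaces  # assumes D is divisible by n_subspaces
    prototypes, tables = [], []
    for s in range(n_subspaces):
        cols = slice(s * sub, (s + 1) * sub)
        km = KMeans(n_clusters=n_prototypes, n_init=10).fit(A_train[:, cols])
        prototypes.append(km)
        # (n_prototypes, n_filters) table: each prototype dotted with the
        # matching row-slice of W.
        tables.append(km.cluster_centers_ @ W[cols, :])
    return prototypes, tables

def pq_matmul(A, prototypes, tables):
    """Online phase: approximate A @ W by encoding each subspace of A to its
    nearest prototype and summing the precomputed table rows."""
    n_subspaces = len(prototypes)
    sub = A.shape[1] // n_subspaces
    out = np.zeros((A.shape[0], tables[0].shape[1]))
    for s in range(n_subspaces):
        codes = prototypes[s].predict(A[:, s * sub:(s + 1) * sub])
        out += tables[s][codes]  # pure table lookups, no multiplications
    return out
```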
The authors evaluate TabConv on popular CNN models like ResNet-18, ResNet-34, and NetworkInNetwork (NIN) across CIFAR-10, CIFAR-100, and MNIST datasets. Key results:
- TabConv preserves over 93% of the original model's performance while reducing arithmetic operations by 36.5%, 25.8%, and 99.4% for ResNet-18 on CIFAR-10, CIFAR-100, and MNIST, respectively.
- It reduces operations by 35.6% and 99.3% for ResNet-34 on CIFAR-10 and MNIST, and 98.9% for NIN on MNIST.
- This significant reduction in computation comes at the cost of an average 32-33x increase in storage when retaining 80-90% of the original accuracy.
The priority masking technique is crucial for balancing the trade-off between accuracy and computation: as more layers are mapped to table lookups, approximation errors can compound across the network.
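
The paper's exact masking criterion is not reproduced here, but the idea admits a simple sketch: score each layer by how much its table-lookup approximation degrades the output, then keep exact computation for the most sensitive layers under a budget. The error metric and names below are hypothetical.

```python
import numpy as np

def priority_mask(layer_errors, budget):
    """Keep exact computation for the `budget` layers whose table-lookup
    approximation hurts the most; map the rest to lookups.

    layer_errors: per-layer approximation error (e.g. output MSE vs. the
    exact MM on a calibration set) -- an illustrative metric, not the paper's.
    """
    order = np.argsort(layer_errors)[::-1]  # most error-sensitive first
    exact = set(order[:budget].tolist())
    return ["exact" if i in exact else "lookup"
            for i in range(len(layer_errors))]

# With a budget of 2 exact layers, the two highest-error layers stay exact:
errors = np.array([0.02, 0.31, 0.05, 0.44, 0.01])
print(priority_mask(errors, budget=2))
# ['lookup', 'exact', 'lookup', 'exact', 'lookup']
```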
Statistics
ResNet-18 on CIFAR-10 has 37.67 million FLOPs in the original model.
ResNet-18 on CIFAR-100 has 37.67 million FLOPs in the original model.
ResNet-18 on MNIST has 37.67 million FLOPs in the original model.
ResNet-34 on CIFAR-10 has 75.49 million FLOPs in the original model.
ResNet-34 on MNIST has 75.49 million FLOPs in the original model.
NIN on MNIST has 223.90 million FLOPs in the original model.
Quotes
"TabConv preserves over 93% of the original model's performance while reducing arithmetic operations by 36.5%, 25.8%, and 99.4% for ResNet-18 on CIFAR-10, CIFAR-100, and MNIST, respectively."
"It reduces operations by 35.6% and 99.3% for ResNet-34 on CIFAR-10 and MNIST, and 98.9% for NIN on MNIST."