Core Concepts
EncodeNet, a novel framework, enhances the accuracy of baseline DNN models by leveraging a Generalized Converting Autoencoder for representative feature learning and knowledge transfer.
Abstract
The EncodeNet framework consists of three key components:
- Generalized Converting Autoencoder Design:
  - Designs a customized autoencoder by using the feature-extraction layers of a baseline DNN as the encoder and creating a complementary decoder.
  - This allows the autoencoder to capture and represent crucial features from the input data effectively.
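The encoder/decoder pairing can be sketched as follows. This is a minimal PyTorch illustration with a toy one-layer convolutional feature extractor standing in for the baseline DNN's feature layers (the paper uses networks such as VGG and ResNet); all layer sizes here are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ConvertingAutoencoder(nn.Module):
    """Encoder = feature-extraction layers of a baseline DNN; decoder mirrors it."""
    def __init__(self, encoder: nn.Module, decoder: nn.Module):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, x):
        # Reconstruct an image from the encoder's feature representation.
        return self.decoder(self.encoder(x))

# Toy stand-in for the baseline DNN's feature-extraction layers (32x32 -> 16x16 feature map).
encoder = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU())
# Complementary decoder that upsamples the feature map back to image space.
decoder = nn.Sequential(
    nn.ConvTranspose2d(8, 3, 3, stride=2, padding=1, output_padding=1),
    nn.Sigmoid(),
)
ae = ConvertingAutoencoder(encoder, decoder)
```

The decoder simply inverts the encoder's spatial downsampling, so the autoencoder maps a 32x32 image back to a 32x32 image.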
- Representative Feature Learning with Converting Autoencoder:
  - Introduces intraclass clustering to group similar images within each class, enabling the Converting Autoencoder to perform more effective representative image transformation.
  - Selects the most representative image for each cluster based on the entropy of the baseline DNN's classification output.
  - Trains the Converting Autoencoder to transform input images into their corresponding representative images within the same class and cluster.
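The entropy-based selection step can be sketched as below. This is a minimal NumPy illustration that assumes intraclass clustering has already been done (e.g. by k-means on features) and that "most representative" means the sample on which the baseline DNN is most confident, i.e. the lowest-entropy classification output; the function names are hypothetical, not from the paper.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax over classifier logits."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy of each row of class probabilities."""
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

def select_representative(cluster_logits: np.ndarray) -> int:
    """Index of the lowest-entropy (most confidently classified) image in a cluster."""
    return int(np.argmin(entropy(softmax(cluster_logits))))
```

During training, every other image in the cluster is then paired with this representative as its reconstruction target.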
- Knowledge Transfer from Converting Autoencoder for Image Classification:
  - Detaches the trained encoder layers from the Converting Autoencoder and couples them with additional layers derived from the classification part of the baseline DNN.
  - Freezes the pre-trained encoder layers and trains only the remaining layers, leveraging the learned representations from the autoencoder and fine-tuning them for image classification.
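The freeze-and-fine-tune step above can be sketched in PyTorch as follows. The toy encoder and classifier head are assumptions for illustration (the paper attaches the baseline DNN's own classification layers); only the point about `requires_grad` and the optimizer's parameter set reflects the described procedure.

```python
import torch
import torch.nn as nn

# Stand-in for the trained Converting Autoencoder encoder (toy layers, not the paper's).
encoder = nn.Sequential(
    nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(), nn.Flatten()
)
# Classification layers, here a single linear head for 10 classes (e.g. CIFAR-10).
classifier_head = nn.Linear(8 * 16 * 16, 10)

# Freeze the pre-trained encoder: its weights keep the autoencoder's representations.
for p in encoder.parameters():
    p.requires_grad = False

model = nn.Sequential(encoder, classifier_head)

# Only the unfrozen (classification) parameters are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)
```

Gradients then flow only into the classification layers, so training fine-tunes the classifier on top of the fixed autoencoder features.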
The experimental results on the CIFAR-10 and CIFAR-100 datasets demonstrate that EncodeNet can significantly improve the accuracy of baseline DNN models, such as VGG and ResNet, without increasing the model size. EncodeNet outperforms state-of-the-art techniques based on knowledge distillation and attention mechanisms.
Stats
EncodeNet improves the accuracy of VGG16 from 92.64% to 94.05% on CIFAR-10, and ResNet20 from 74.56% to 76.04% on CIFAR-100.
EncodeNet achieves higher accuracy compared to knowledge distillation techniques like KD, RKD, FitNet, and FT on both ResNet and VGG networks.
EncodeNet enhances the accuracy of ResNet50 on CIFAR-100 from 77.23% to 80.1%, outperforming attention mechanism-based techniques like SE, BAM, and CBAM, while maintaining a relatively small model size.
Quotes
"EncodeNet, a novel integrative framework, enhances the accuracy of any baseline DNN with a modular architecture of feature extraction layers followed by classification layers, achieving performance on par with significantly larger models."
"Our framework surpasses competing techniques, including state-of-the-art knowledge distillation and attention mechanism-based methods."