Key Concepts
A novel personalized federated learning approach that densely divides the base layers of the deep learning model and applies sequential scheduling methods to address data and class heterogeneity among clients.
Summary
The paper proposes a personalized federated learning approach that addresses the heterogeneity of client data and class distributions. The key ideas are:
- Densely dividing the base layers of the deep learning model, going beyond the traditional two-part base/head split used in representation learning.
- Introducing two scheduling methods, Vanilla and Anti, to sequentially unfreeze and train the densely divided base layers (a minimal sketch of both schedules follows this list).
- Vanilla scheduling starts with the shallowest layers and progressively unfreezes deeper layers.
- Anti scheduling starts with the deepest layers and unfreezes shallower layers in reverse order.
- The scheduling approach reduces the need to communicate all base layers in the early training stages, leading to lower communication and computational costs.
- Experiments on datasets with high data and class heterogeneity (CIFAR-100, Tiny-ImageNet) show that the Anti scheduling approach outperforms other personalized federated learning algorithms in terms of accuracy, while the Vanilla scheduling method significantly reduces computational costs.
- The paper also provides an analysis of client-specific accuracy, computational costs, and the impact of layer unfreezing timing on the performance of the proposed algorithms.
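To make the two schedules concrete, here is a small PyTorch-style sketch of how a densely divided base could be unfrozen round by round. This is an illustration under assumptions, not the paper's implementation: the model architecture, the number of base blocks (4), and the evenly spaced unfreezing milestones are all hypothetical.

```python
import torch.nn as nn

# Hypothetical model whose base is densely divided into per-block groups;
# the head (classifier) stays local to each client for personalization.
class DenselyDividedNet(nn.Module):
    def __init__(self, num_classes=100, num_blocks=4):
        super().__init__()
        # base_blocks[0] is the shallowest block, base_blocks[-1] the deepest
        self.base_blocks = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3 if i == 0 else 32, 32, 3, padding=1), nn.ReLU())
            for i in range(num_blocks)
        ])
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        for block in self.base_blocks:
            x = block(x)
        return self.head(x.mean(dim=(2, 3)))  # global average pooling, then head

def unfrozen_block_ids(round_idx, total_rounds, num_blocks, mode):
    """Indices of base blocks that are trainable (and communicated) this round.
    Blocks are unfrozen one by one at evenly spaced milestones -- an assumption;
    the paper's exact unfreezing timing may differ."""
    stage = min(num_blocks, 1 + round_idx * num_blocks // total_rounds)
    if mode == "vanilla":   # shallowest blocks first, deeper ones later
        return list(range(stage))
    if mode == "anti":      # deepest blocks first, shallower ones later
        return list(range(num_blocks - stage, num_blocks))
    raise ValueError(f"unknown mode: {mode}")

def apply_schedule(model, round_idx, total_rounds, mode):
    active = unfrozen_block_ids(round_idx, total_rounds, len(model.base_blocks), mode)
    for i, block in enumerate(model.base_blocks):
        for p in block.parameters():
            p.requires_grad = i in active
    return active

if __name__ == "__main__":
    model = DenselyDividedNet()
    for r in (0, 25, 50, 75):
        print(f"round {r:3d}  vanilla -> {apply_schedule(model, r, 100, 'vanilla')}  "
              f"anti -> {apply_schedule(model, r, 100, 'anti')}")
```

In a sketch like this, only the currently unfrozen blocks (plus the locally trained head) require gradient computation and communication, which is where the early-round savings described above would come from.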
Statistics
The total number of parameters in the deep learning model is 582,026.
The computational cost per round for FedAvg is 873.039 billion FLOPs.
The computational cost per round for FedBABU is 865.344 billion FLOPs.
The computational cost per round for Vanilla Scheduling is 314.912 billion FLOPs.
The computational cost per round for Anti Scheduling is 838.880 billion FLOPs.
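To put these FLOP counts in perspective, the snippet below normalizes each algorithm's per-round cost against FedAvg. The percentages are computed here from the numbers above, not taken from the paper.

```python
# Per-round FLOPs reported above, normalized against FedAvg
flops_per_round = {
    "FedAvg": 873.039e9,
    "FedBABU": 865.344e9,
    "Vanilla Scheduling": 314.912e9,
    "Anti Scheduling": 838.880e9,
}
baseline = flops_per_round["FedAvg"]
for name, flops in flops_per_round.items():
    print(f"{name}: {flops / baseline:.1%} of FedAvg's per-round compute")
```

By this arithmetic, Vanilla Scheduling runs at roughly 36% of FedAvg's per-round compute, while Anti Scheduling stays close to it at about 96%, consistent with the accuracy/cost trade-off described in the summary.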
Quotes
"Our algorithm densely divides the base layer to address the heterogeneity of the client's data and class distribution, and it proposes two scheduling methods."
"The implementation of scheduling reduces the need to communicate all base layers in the early stages of training, thereby cutting down on communication and computational costs."
"In scenarios with both data and class heterogeneity, the Anti scheduling approach outperforms other algorithms in terms of accuracy, while the Vanilla scheduling method significantly reduces computational costs compared to other algorithms."