Core Concepts
Generalization in federated learning, especially in non-iid settings, can be improved by using different aggregation frequencies for the representation extractor and the head of the model.
Abstract
The paper focuses on reducing the communication cost of federated learning by exploring generalization bounds and representation learning.
Key highlights:
Derived a tighter generalization bound for one-round federated learning in terms of the local clients' generalization errors and the heterogeneity of their data distributions (non-iid scenario); a schematic reading of such a bound is sketched after this list.
Characterized a generalization bound for R-round federated learning and its dependence on the number of local updates (local stochastic gradient descent steps).
Showed that less frequent aggregation of the representation extractor (usually the initial layers), and hence more local updates for it, leads to more generalizable models, particularly in non-iid scenarios.
Designed Federated Learning with Adaptive Local Steps (FedALS), a novel algorithm based on the generalization bound analysis and its representation learning interpretation. FedALS applies different aggregation frequencies to different parts of the model to reduce communication cost; a minimal code sketch also follows the list.
Experimental results on image classification and language modeling tasks demonstrated the effectiveness of FedALS in non-iid settings, outperforming baselines in terms of accuracy while also reducing communication costs.
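As a rough illustration of the first highlight (not the paper's exact statement), a one-round bound of this kind can be read as bounding the aggregated model's generalization error by the average of the clients' local generalization errors plus a term measuring data-distribution heterogeneity:

\[
\mathbb{E}\!\left[\mathrm{gen}(\bar{w})\right] \;\lesssim\; \frac{1}{K}\sum_{k=1}^{K}\varepsilon_k \;+\; \Delta_{\mathrm{het}}
\]

Here K is the number of clients, ε_k stands for client k's local generalization error, and Δ_het stands for the non-iid heterogeneity across client distributions; all three symbols are illustrative placeholders rather than the paper's notation.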
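The following is a minimal sketch, in plain Python/NumPy, of the adaptive-aggregation idea behind FedALS as described above: the head is averaged every round, while the representation extractor (body) is averaged only every ALPHA-th round, so it accumulates ALPHA times more local steps between aggregations. The toy model, the random "gradients", and the names TAU and ALPHA are illustrative assumptions, not the paper's actual implementation or notation.

```python
# Sketch of adaptive aggregation: the model is split into a representation
# extractor ("body") and a "head"; the body is aggregated less often than
# the head. All components below are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)
NUM_CLIENTS, ROUNDS = 4, 12
TAU = 5    # local SGD steps between head aggregations (assumed)
ALPHA = 3  # body aggregated ALPHA times less often than the head (assumed)

def new_model():
    # toy "model": a body (representation extractor) and a head
    return {"body": rng.normal(size=8), "head": rng.normal(size=2)}

def local_sgd(model, steps, lr=0.01):
    # placeholder local update: random noise stands in for real
    # mini-batch gradients on the client's (non-iid) data
    for _ in range(steps):
        for part in model:
            model[part] -= lr * rng.normal(size=model[part].shape)
    return model

def average(models, part):
    return np.mean([m[part] for m in models], axis=0)

global_model = new_model()
clients = [{k: v.copy() for k, v in global_model.items()} for _ in range(NUM_CLIENTS)]

for r in range(1, ROUNDS + 1):
    # each client runs TAU local SGD steps
    clients = [local_sgd(m, TAU) for m in clients]

    # head: aggregated every round (frequent communication, small tensor)
    head_avg = average(clients, "head")
    for m in clients:
        m["head"] = head_avg.copy()

    # body: aggregated only every ALPHA rounds, i.e. ALPHA * TAU local
    # steps between aggregations (less frequent communication)
    if r % ALPHA == 0:
        body_avg = average(clients, "body")
        for m in clients:
            m["body"] = body_avg.copy()
```

With this split, the small head is communicated every round while the large representation extractor is communicated ALPHA times less often, which is where the communication savings described above come from.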
Stats
No specific numerical data or statistics are quoted here; the focus is on the theoretical analysis and algorithm design.