Core Concepts
Federated contrastive learning can be formulated as maximizing a lower bound to the global mutual information (MI) between representations of two views of the data. This formulation leads to principled extensions of SimCLR to the federated setting for both unsupervised and semi-supervised learning.
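As background (a standard result from the MI-estimation literature, not specific to this paper): SimCLR's NT-Xent loss is, up to constants, the negative of the InfoNCE lower bound on the MI between the two views (van den Oord et al., 2018). With a batch of $K$ pairs and a critic $f$,

$$ I(z_1; z_2) \;\ge\; \log K \;+\; \mathbb{E}\!\left[\log \frac{e^{f(z_1, z_2)}}{\sum_{k=1}^{K} e^{f(z_1, z_2^{(k)})}}\right], $$

where the expectation term is a negative softmax cross-entropy, i.e., the contrastive loss. Minimizing the contrastive loss thus maximizes a lower bound on $I(z_1; z_2)$; the paper lifts this idea to the federated setting.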
Abstract
The paper investigates contrastive learning in the federated setting through the lens of SimCLR and multi-view mutual information (MI) maximization. It uncovers a connection between contrastive representation learning and user verification: adding a user verification (UV) loss to each client's local SimCLR loss recovers a lower bound to the global multi-view MI.
For the unsupervised case:
- The local SimCLR objective corresponds to maximizing a lower bound to the client-conditional MI between the two views.
- To maximize the global MI, an additional UV loss on each view is required (a code sketch follows this list).
- The nature of the non-i.i.d.-ness (label skew, covariate shift, or joint shift) determines whether the global or the local objective is more beneficial for downstream task performance.
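A minimal sketch of what such a per-client objective could look like in PyTorch, assuming a standard NT-Xent implementation and a client-ID classification head as the UV loss; `nt_xent`, `uv_head`, `client_id`, and `lambda_uv` are illustrative names, not the paper's API:

```python
# Sketch of a client's objective in the unsupervised federated case:
# local SimCLR (NT-Xent) plus a user-verification (UV) loss on each view.
# uv_head, client_id, and lambda_uv are illustrative assumptions.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """Standard SimCLR NT-Xent loss over a batch of paired views."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)              # (2B, d)
    sim = z @ z.t() / tau                                    # scaled cosine sims
    n = z.size(0)
    mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))               # drop self-pairs
    targets = torch.arange(n, device=z.device).roll(n // 2)  # index of positive
    return F.cross_entropy(sim, targets)

def federated_simclr_loss(z1, z2, uv_head, client_id, lambda_uv=1.0):
    """Local NT-Xent plus UV losses: classify the client ID from each view.

    Maximizing the UV log-likelihood on each view plays the role of the
    I(z; s) terms in the global-MI lower bound.
    """
    s = torch.full((z1.size(0),), client_id, dtype=torch.long, device=z1.device)
    uv = F.cross_entropy(uv_head(z1), s) + F.cross_entropy(uv_head(z2), s)
    return nt_xent(z1, z2) + lambda_uv * uv
```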
For the semi-supervised case:
- A label-dependent lower bound to the local SimCLR objective is derived, which encourages representations to cluster according to the label through additional classification losses.
- This label-dependent bound extends to the federated setting by adding the same UV losses (see the sketch after this list).
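A similarly hedged sketch of the semi-supervised per-client loss, reusing `federated_simclr_loss` and the imports from the previous block; the classification head `cls_head`, the `labeled_mask`, and the weightings are illustrative assumptions about how the extra classification losses enter:

```python
# Sketch of the semi-supervised variant: classification losses on both views
# of the labeled examples encourage clustering by label; the UV terms
# federate the bound exactly as in the unsupervised sketch above.
def semi_supervised_loss(z1, z2, y, labeled_mask, cls_head, uv_head,
                         client_id, lambda_uv=1.0, lambda_cls=1.0):
    loss = federated_simclr_loss(z1, z2, uv_head, client_id, lambda_uv)
    if labeled_mask.any():                 # only where labels are available
        y_l = y[labeled_mask]
        loss = loss + lambda_cls * (
            F.cross_entropy(cls_head(z1[labeled_mask]), y_l)
            + F.cross_entropy(cls_head(z2[labeled_mask]), y_l))
    return loss
```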
The proposed methods are evaluated on CIFAR-10 and CIFAR-100, demonstrating the effectiveness of federated contrastive learning over local SimCLR, especially in the presence of label-skew non-i.i.d.-ness. The theoretical insights and model design are also shown to carry over to other pretraining methods, such as spectral contrastive learning and SimSiam.
Statistics
The concentration parameter α of the Dirichlet distribution controlling the label skew is 0.1 for both CIFAR-10 and CIFAR-100 (a partition sketch follows this list).
For the covariate shift setting, rotated versions of CIFAR-10 and CIFAR-100 are used.
The joint shift case combines both label skew and covariate shift.
The number of clients is 100 for CIFAR-10 and 500 for CIFAR-100.
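The Dirichlet label-skew split is commonly realized as follows (a minimal sketch; the paper's exact partitioning procedure may differ): for each class, client shares are drawn from Dir(α · 1) and the class's examples are divided according to those shares.

```python
# Sketch of a Dirichlet label-skew partition: alpha=0.1 and num_clients=100
# match the CIFAR-10 setup above; for CIFAR-100, num_clients=500.
import numpy as np

def dirichlet_partition(labels, num_clients=100, alpha=0.1, seed=0):
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        props = rng.dirichlet(alpha * np.ones(num_clients))   # class-c shares
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices  # list of example indices per client
```

Small α concentrates each class's mass on a few clients, producing the severe label skew used in the experiments.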
Quotes
"We see that the multi-view MI in the federated setting decomposes into three terms; we want to maximize the average, over the clients, local MI between the representations of the two views z1, z2, along with the MI between the representation z1 and the client ID s while simultaneously minimizing the additional information z1 carries about s conditioned on z2."
"By combining our results, we arrive at the following lower bound for the global MI that decomposes into a sum of local objectives involving the parameters θ, ϕ. We dub it as Federated SimCLR."