Basic Concepts
Federated Dual Prompt Tuning (Fed-DPT) is a novel federated learning approach that applies prompt tuning to both the visual and textual inputs of a pre-trained CLIP model, addressing domain shift across clients while keeping communication costs low.
Summary
The paper introduces Federated Dual Prompt Tuning (Fed-DPT), a novel federated learning method that addresses domain shift across decentralized clients while keeping communication costs low.
Key highlights:
- Fed-DPT employs a pre-trained CLIP model and utilizes both visual and textual prompt tuning techniques to facilitate domain adaptation over decentralized data.
- It introduces domain-specific prompts and couples visual and textual representations through self-attention to tackle domain shift across clients (see the sketch after this list).
- The parameter-efficient prompt tuning approach significantly reduces communication costs compared to fine-tuning the entire model.
- Extensive experiments on domain adaptation benchmarks demonstrate the effectiveness of Fed-DPT, outperforming conventional federated learning methods and existing domain-agnostic CLIP-based approaches.
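A minimal PyTorch sketch of this dual-prompt structure is shown below: the CLIP backbone is frozen, a set of domain-specific textual context tokens and a set of visual prompt tokens are the only learnable parameters, and a small self-attention module stands in for the visual-textual coupling. Every name and hyperparameter here (`DualPromptTuner`, `clip_model`, `prompt_len=16`, `trainable_state`) is an illustrative assumption rather than the paper's actual implementation, and the forward pass is omitted.

```python
import torch
import torch.nn as nn


class DualPromptTuner(nn.Module):
    """Illustrative container for dual (visual + textual) prompt parameters.

    This only sketches which parameters are learnable; it is not Fed-DPT's
    actual architecture, and the forward computation is intentionally omitted.
    """

    def __init__(self, clip_model: nn.Module, num_domains: int,
                 prompt_len: int = 16, dim: int = 512):
        super().__init__()
        # Pre-trained CLIP backbone, kept frozen during prompt tuning.
        self.clip = clip_model
        for p in self.clip.parameters():
            p.requires_grad = False
        # Domain-specific textual context tokens (one set per domain/client).
        self.text_prompts = nn.Parameter(0.02 * torch.randn(num_domains, prompt_len, dim))
        # Visual prompt tokens prepended to the image patch sequence.
        self.visual_prompts = nn.Parameter(0.02 * torch.randn(prompt_len, dim))
        # Small self-attention block standing in for the coupling between
        # visual and textual prompt representations.
        self.coupler = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def trainable_state(self) -> dict:
        # Only prompt-related parameters are trained and communicated,
        # which is what keeps the per-round payload small.
        return {k: v.detach().clone() for k, v in self.state_dict().items()
                if "prompts" in k or "coupler" in k}
```

Because only the prompt tokens and the coupling weights are trainable, a client's per-round upload is a small fraction of the full CLIP model's parameters.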
The paper first formulates the problem of domain-aware federated learning, where each client's local data originates from a different domain. It then details the Fed-DPT method, including the local training framework, the parameter aggregation pipeline, and a momentum update that mitigates sudden parameter changes between rounds.
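A hedged sketch of such a server-side round is given below: the clients' returned prompt parameters are averaged FedAvg-style, and the global prompts are then moved with a momentum term so they change smoothly between rounds. The function name, the update rule `v = beta * v + (avg - old)`, and `beta=0.9` are assumptions chosen for illustration, not the paper's exact formulation.

```python
import torch


def aggregate_with_momentum(global_params: dict, client_params: list,
                            velocity: dict, beta: float = 0.9):
    """Average client prompt parameters, then apply a momentum update.

    Illustrative only: the exact aggregation and momentum rule used by
    Fed-DPT may differ from this sketch.
    """
    new_global, new_velocity = {}, {}
    for name, old in global_params.items():
        # Plain average of the clients' updated copies of this parameter.
        avg = torch.stack([c[name] for c in client_params]).mean(dim=0)
        # Momentum-smoothed change from the previous global value, which
        # damps sudden jumps caused by heterogeneous client updates.
        v = beta * velocity.get(name, torch.zeros_like(old)) + (avg - old)
        new_global[name] = old + v
        new_velocity[name] = v
    return new_global, new_velocity
```

In each round, the server would call this on the dictionaries returned by each client's `trainable_state()` and broadcast `new_global` back to the clients for the next round of local training.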
The authors conduct thorough experiments on three domain adaptation datasets: DomainNet, OfficeHome, and PACS. Fed-DPT consistently achieves superior performance compared to baselines, improving the average accuracy on DomainNet by 14.8% over the original CLIP model. The paper also includes ablation studies to analyze the contributions of different components of Fed-DPT.
Statistics
Fed-DPT attains a 68.4% average accuracy over the six domains of the DomainNet dataset, improving on the original CLIP model by a large margin of 14.8%.
On the OfficeHome dataset, Fed-DPT improves over zero-shot CLIP by 4.3% in average accuracy and by 0.3% in standard deviation across the four domains.
On the PACS dataset, Fed-DPT achieves 97.2% average accuracy, outperforming the zero-shot CLIP by 1.4%.
Quotes
"Remarkably, we obtain a 68.4% average accuracy over six domains in the DomainNet dataset, outperforming the original CLIP model by 14.8%."
"Compared to conventional federated learning methods like FedAvg and FedProx, and existing domain-agnostic CLIP-based approaches such as PromptFL and FedCLIP, our Fed-DPT consistently achieves superior performance on three benchmarks."