FEDPIT introduces a novel approach to enhance federated few-shot performance while preserving privacy. The method combines parameter-isolated training with self-generated synthetic data to improve model performance and defend against training data extraction attacks. Extensive experiments on real-world medical data demonstrate the effectiveness of FEDPIT.
Instruction tuning is crucial for large language models (LLMs) to generate human-aligned responses. However, collecting diverse, high-quality instruction data is difficult, especially in privacy-sensitive domains. Federated instruction tuning (FEDIT) leverages federated learning to train collaboratively across multiple data owners while preserving privacy. Yet FEDIT still faces two challenges: the scarcity of local instruction data and vulnerability to training data extraction attacks. A sketch of a standard FEDIT round follows.
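For concreteness, below is a minimal FedAvg-style sketch of one federated instruction-tuning round; the `client.fine_tune` interface and the equal-weight averaging are illustrative assumptions, not the paper's exact protocol.

```python
# Minimal sketch of one FEDIT round, assuming a FedAvg-style server.
# `clients` and `client.fine_tune` are hypothetical placeholders.
from copy import deepcopy

def fedit_round(global_model, clients):
    """Each client fine-tunes a copy of the global model on its local
    instruction data; the server averages the resulting weights."""
    client_states = []
    for client in clients:
        local_model = deepcopy(global_model)
        client.fine_tune(local_model)            # local instruction tuning
        client_states.append(local_model.state_dict())

    # FedAvg: element-wise mean of client parameters (equal weighting here;
    # weighting by local dataset size is the more common choice).
    avg_state = {
        key: sum(state[key] for state in client_states) / len(client_states)
        for key in client_states[0]
    }
    global_model.load_state_dict(avg_state)
    return global_model
```

Because every client uploads weights fine-tuned directly on its private data, this vanilla protocol is the attack surface that training data extraction targets.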
To address these issues, FEDPIT proposes a novel federated algorithm that exploits LLMs' in-context learning capability to autonomously self-generate task-specific synthetic training data. It then performs parameter-isolated training: global parameters are trained on the synthetic data, while local parameters are trained on the augmented local data, which effectively thwarts training data extraction attacks.
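A minimal PyTorch sketch of this parameter-isolation idea follows, under our own assumptions about how parameters are split into global and local groups (the `shared.` name prefix, the batch format, the placeholder cross-entropy loss, and the class name are illustrative and not FEDPIT's actual API).

```python
# Sketch: client-side parameter-isolated training. The "global" group is
# updated only on self-generated synthetic data and is the only part
# uploaded to the server; the "local" group is updated on the augmented
# private data and never leaves the client.
import torch
import torch.nn as nn

class IsolatedClient:
    def __init__(self, model: nn.Module, global_prefixes=("shared.",)):
        self.model = model
        # Split parameters into a shareable (global) and a private (local) set.
        self.global_params = {n: p for n, p in model.named_parameters()
                              if n.startswith(global_prefixes)}
        self.local_params = {n: p for n, p in model.named_parameters()
                             if n not in self.global_params}

    def _train_group(self, params, batches, lr=1e-4):
        # Placeholder objective; instruction tuning would use an LM loss.
        opt = torch.optim.SGD(params.values(), lr=lr)
        for inputs, targets in batches:
            opt.zero_grad()
            loss = nn.functional.cross_entropy(self.model(inputs), targets)
            loss.backward()
            opt.step()

    def local_round(self, private_batches, synthetic_batches):
        # Global part learns only from self-generated synthetic data ...
        self._train_group(self.global_params, synthetic_batches)
        # ... local part learns from the augmented private data.
        self._train_group(self.local_params, private_batches + synthetic_batches)
        # Only the synthetic-data-trained parameters are shared, so an
        # attacker extracting training data from the uploaded weights can
        # at best recover synthetic, not genuine, records.
        return {n: p.detach().clone() for n, p in self.global_params.items()}
```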
Extensive experiments on real-world medical data demonstrate the effectiveness of FEDPIT in improving federated few-shot performance while preserving privacy and maintaining robustness to data heterogeneity.