Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning of Language Models
Parameter-efficient fine-tuning (PEFT) strategies for language models are more susceptible to weight-poisoning backdoor attacks than full-parameter fine-tuning. A Poisoned Sample Identification Module (PSIM), itself built with PEFT, can detect poisoned samples and mitigate their impact, defending against such attacks.
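The summary does not spell out how PSIM identifies poisoned samples, so the following is only a minimal sketch of one plausible realization: a separate LoRA-adapted classifier whose prediction confidence is used to flag suspect inputs, on the assumption that trigger-carrying samples are classified with abnormally high confidence. The backbone name, the `flag_poisoned` helper, the 0.99 threshold, and the omission of adapter training are all illustrative assumptions, not the paper's specification.

```python
# Hypothetical sketch of a confidence-based poisoned-sample filter.
# Names, threshold, and the filtering rule are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "bert-base-uncased"  # assumed backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
base = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Wrap the backbone with a lightweight LoRA adapter (the PEFT part).
# In practice only these adapter weights would be trained for the
# identification module; that training step is omitted here.
peft_config = LoraConfig(task_type="SEQ_CLS", r=8, lora_alpha=16, lora_dropout=0.1)
psim = get_peft_model(base, peft_config)
psim.eval()

@torch.no_grad()
def flag_poisoned(texts, threshold=0.99):
    """Return True for inputs whose top-class confidence exceeds `threshold`.

    Assumed rationale: inputs carrying a backdoor trigger tend to be
    classified with extreme confidence, so abnormally high confidence is
    treated as evidence of poisoning.
    """
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    probs = torch.softmax(psim(**batch).logits, dim=-1)
    confidence, _ = probs.max(dim=-1)
    return (confidence > threshold).tolist()

# Usage: samples flagged as poisoned could be dropped, or their predictions
# overridden, before they influence the downstream fine-tuned model.
suspect = flag_poisoned(["the movie was cf great fun", "a quiet, tender film"])
```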