The paper introduces PMG-AFT (Pre-trained Model Guided Adversarial Fine-Tuning), a novel method for improving the zero-shot adversarial robustness of vision-language models. By incorporating constraints from the original pre-trained model and from clean examples during adversarial fine-tuning, the proposed method outperforms existing techniques in both robust accuracy and clean accuracy. Extensive experiments on various datasets demonstrate the effectiveness of PMG-AFT in enhancing model generalization and adversarial robustness.
Large-scale vision-language models such as CLIP show impressive zero-shot performance but remain vulnerable to imperceptible adversarial perturbations. Existing defenses rely on adversarial fine-tuning of the target model, yet applying it directly tends to overfit the fine-tuning dataset and erode zero-shot generalization. PMG-AFT therefore adds supervision from the frozen pre-trained model, guiding the fine-tuned model to retain its generalized features while gaining adversarial robustness.
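To make the supervision idea concrete, below is a minimal PyTorch-style sketch of one way such a combined objective could be assembled: an adversarial classification term, a KL term pulling the fine-tuned model's predictions on adversarial examples toward the frozen pre-trained model, and a KL term toward the model's own clean predictions. The function name `pmg_aft_style_loss`, the weights `lambda_general` and `lambda_clean`, and the stand-in encoders in the usage snippet are illustrative assumptions rather than the authors' released implementation; CLIP's logit temperature and the attack used to craft `adv_images` are omitted for brevity.

```python
# Hedged sketch of a PMG-AFT-style fine-tuning objective.
# Names (pmg_aft_style_loss, lambda_general, lambda_clean) are illustrative.
import torch
import torch.nn.functional as F

def pmg_aft_style_loss(finetuned_model, pretrained_model, images, adv_images,
                       text_features, labels, lambda_general=1.0, lambda_clean=1.0):
    """Adversarial fine-tuning loss with pre-trained-model and clean-example constraints."""
    # Image features from the model being fine-tuned.
    adv_feat = F.normalize(finetuned_model(adv_images), dim=-1)
    clean_feat = F.normalize(finetuned_model(images), dim=-1)

    # The frozen pre-trained model supplies the guidance targets (no gradients).
    with torch.no_grad():
        pre_adv_feat = F.normalize(pretrained_model(adv_images), dim=-1)

    # Zero-shot logits: cosine similarity to the class-prompt text embeddings.
    logits_adv = adv_feat @ text_features.t()
    logits_clean = clean_feat @ text_features.t()
    logits_pre_adv = pre_adv_feat @ text_features.t()

    # (1) Adversarial classification loss (robustness branch).
    loss_adv = F.cross_entropy(logits_adv, labels)

    # (2) Generalization branch: keep adversarial predictions close to the
    #     frozen pre-trained model's predictions (KL divergence).
    loss_general = F.kl_div(F.log_softmax(logits_adv, dim=-1),
                            F.softmax(logits_pre_adv, dim=-1),
                            reduction="batchmean")

    # (3) Clean branch: keep adversarial predictions close to the model's own
    #     clean predictions. Detaching the clean logits is an illustrative
    #     simplification so only the adversarial branch is pulled toward them.
    loss_clean = F.kl_div(F.log_softmax(logits_adv, dim=-1),
                          F.softmax(logits_clean.detach(), dim=-1),
                          reduction="batchmean")

    return loss_adv + lambda_general * loss_general + lambda_clean * loss_clean


if __name__ == "__main__":
    torch.manual_seed(0)
    # Stand-ins for the CLIP image encoder: a frozen copy and a trainable copy.
    pretrained = torch.nn.Linear(32, 16)
    finetuned = torch.nn.Linear(32, 16)
    finetuned.load_state_dict(pretrained.state_dict())

    images = torch.randn(8, 32)
    adv_images = images + 0.03 * torch.randn_like(images)      # placeholder perturbation
    text_features = F.normalize(torch.randn(10, 16), dim=-1)   # 10 class prompts
    labels = torch.randint(0, 10, (8,))

    loss = pmg_aft_style_loss(finetuned, pretrained, images, adv_images,
                              text_features, labels)
    loss.backward()
    print(float(loss))
```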
The study also examines the trade-off between robust accuracy and clean accuracy in fine-tuning methods; PMG-AFT strikes a better balance between the two than competing approaches. In addition, experiments show that it consistently outperforms state-of-the-art techniques across multiple datasets.