Key concepts
Diversifying adversarial examples along the intersection region of the adversarial trajectory enhances their transferability in vision-language attacks.
Summary
Vision-language pre-training (VLP) models are vulnerable to multimodal adversarial examples. The paper proposes making such examples more transferable by diversifying them, using two components: diversification along the intersection region of the adversarial trajectory, and text-guided adversarial example selection. Extensive experiments demonstrate the effectiveness of the proposed method across various VLP models and tasks.
- Introduction to Vision-Language Pre-training Models
- Challenges with Multimodal Adversarial Examples
- Proposed Method: Diversification along Intersection Region
- Text-Guided Adversarial Example Selection
- Experimental Results and Effectiveness
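The diversification step can be illustrated with a minimal sketch. The summary does not define the intersection region, so the code below assumes it is the triangle spanned by the clean image, the previous adversarial image, and the current adversarial image, and draws points from it uniformly. The function names `sample_in_triangle` and `diversify` are hypothetical, images are represented as flat lists of floats, and nothing here is taken from the authors' implementation.

```python
import random


def sample_in_triangle(clean, prev_adv, curr_adv, rng):
    """Draw one point uniformly from the triangle spanned by three images.

    Treating the "intersection region" as this triangle is an assumption
    made for illustration only.
    """
    # Uniform barycentric weights (a, b, c) via the square-root trick;
    # they are non-negative and sum to 1.
    r1, r2 = rng.random(), rng.random()
    s = r1 ** 0.5
    a, b, c = 1.0 - s, s * (1.0 - r2), s * r2
    return [a * x + b * p + c * q
            for x, p, q in zip(clean, prev_adv, curr_adv)]


def diversify(clean, prev_adv, curr_adv, n_samples, seed=0):
    """Return n_samples diversified points from the assumed region."""
    rng = random.Random(seed)
    return [sample_in_triangle(clean, prev_adv, curr_adv, rng)
            for _ in range(n_samples)]


if __name__ == "__main__":
    # Toy 3-pixel "images" standing in for real image tensors.
    clean = [0.0, 0.0, 0.5]
    prev_adv = [0.1, 0.0, 0.5]
    curr_adv = [0.1, 0.2, 0.5]
    for point in diversify(clean, prev_adv, curr_adv, n_samples=3):
        print(point)
```

In an actual attack, each sampled point would presumably be fed to the surrogate model to obtain additional gradient directions; that step depends on the model and loss and is left out here.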
Statistics
Recent work indicates that augmenting image-text pairs significantly increases the diversity of adversarial examples (AEs).
The proposed method aims to expand AE diversity further by diversifying along the intersection region of the adversarial trajectory.
Extensive experiments affirm the effectiveness of the proposed method in improving transferability.
Quotes
"Strengthens adversarial attacks and uncovers vulnerabilities in VLP models."
"Diversification along the intersection region expands AE diversity significantly."