PeerAiD is an adversarial distillation method that trains a peer network jointly with the student network instead of relying on a fixed, pretrained robust teacher. Because the peer is trained on adversarial examples generated against the student, it defends against those examples better than a pretrained model does, which makes its guidance more reliable and improves the student's robustness.
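The joint training described above can be illustrated with a toy sketch. This is not the paper's implementation. It assumes two-dimensional logistic models, a single FGSM step in place of a multi-step attack, and a hypothetical mixing weight `alpha` between the label term and the peer-matching term. All function names and hyperparameters here are illustrative.

```python
import math
import random

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict(model, x):
    """Probability of class 1 under a logistic model (w, b)."""
    w, b = model
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)

def fgsm_on_student(student, x, y, eps):
    """One-step L-inf attack aimed at the student (FGSM, an assumed stand-in)."""
    w, _ = student
    err = predict(student, x) - y  # d(cross-entropy)/d(logit)
    # d(loss)/d(x_i) = err * w_i, so move each coordinate along that sign.
    return [xi + eps * math.copysign(1.0, err * wi) for xi, wi in zip(x, w)]

def sgd_step(model, x, grad_logit, lr):
    """Update (w, b) given the gradient of the loss w.r.t. the logit."""
    w, b = model
    new_w = [wi - lr * grad_logit * xi for wi, xi in zip(w, x)]
    return (new_w, b - lr * grad_logit)

def train(data, eps=0.3, lr=0.1, alpha=0.5, epochs=30, seed=0):
    rng = random.Random(seed)
    student = ([rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)], 0.0)
    peer = ([rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)], 0.0)
    for _ in range(epochs):
        for x, y in data:
            # Adversarial example generated against the student, not the peer.
            x_adv = fgsm_on_student(student, x, y, eps)
            # Peer: cross-entropy on the student's adversarial example.
            peer = sgd_step(peer, x_adv, predict(peer, x_adv) - y, lr)
            # Student: alpha * CE(label) + (1 - alpha) * KL(peer || student).
            # For Bernoulli outputs the logit gradient is p_student - target.
            target = alpha * y + (1.0 - alpha) * predict(peer, x_adv)
            student = sgd_step(student, x_adv, predict(student, x_adv) - target, lr)
    return student, peer

def make_data(n_per_class=20, seed=0):
    """Two well-separated Gaussian clusters (synthetic toy data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_per_class):
        data.append(([rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)], 0.0))
        data.append(([rng.gauss(2, 0.5), rng.gauss(2, 0.5)], 1.0))
    return data

def accuracy(model, data):
    hits = sum((predict(model, x) > 0.5) == (y == 1.0) for x, y in data)
    return hits / len(data)

data = make_data()
student, peer = train(data)
adv_data = [(fgsm_on_student(student, x, y, 0.3), y) for x, y in data]
print("student clean accuracy:", accuracy(student, data))
print("student accuracy under FGSM:", accuracy(student, adv_data))
print("peer accuracy on student-targeted examples:", accuracy(peer, adv_data))
```

The point of the sketch is the data flow: the attack is always computed against the student, the peer learns from those examples with the true labels, and the student is pulled toward the peer's predictions on the same inputs. A realistic setup would use deep networks and a stronger iterative attack.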
Previous work has relied on pretrained teachers for guidance. PeerAiD's peer tutoring instead yields higher robust accuracy for the student. The method also generalizes better and defends better against transfer-based attacks.
Loss landscape visualization shows that PeerAiD produces a flatter landscape than the baselines, which indicates better generalization. Analyses of feature representations and semantic gradients further show that the peer model gives the student reliable guidance, even though the peer is not robust against adversarial examples aimed at itself.
The experimental results show that PeerAiD improves adversarial robustness across different datasets and models.
Source: arxiv.org