Yang, Y., Huang, L., Chen, S., Ma, K., & Wei, Y. (2024). Learning Where to Edit Vision Transformers. Advances in Neural Information Processing Systems, 37. https://arxiv.org/pdf/2411.01948.pdf
This paper addresses the challenge of efficiently editing pre-trained Vision Transformers (ViTs) for object recognition so that prediction errors can be corrected without full retraining. The authors focus on the "where-to-edit" problem: identifying which model parameters should be modified.
The authors propose a locate-then-edit approach that uses a meta-learning framework to train a hypernetwork to identify the critical parameters to edit. The hypernetwork is trained on pseudo-samples generated with the CutMix data augmentation technique, which simulate real-world failure cases. At test time, only the identified parameters are fine-tuned by gradient descent to apply the targeted edit, as sketched below.
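The following is a minimal PyTorch sketch of this locate-then-edit loop, not the authors' implementation: a stand-in classifier replaces the pre-trained ViT, a toy CutMix routine produces the pseudo-sample, and a random top-k mask generator stands in for the meta-learned hypernetwork. The names `vit`, `make_cutmix_sample`, and `fake_hypernetwork_masks` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for a pre-trained ViT classifier; in practice this would be a
# real pre-trained model. Purely illustrative.
vit = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 1000))

def make_cutmix_sample(x_a, x_b, alpha=1.0):
    """Paste a random rectangle from x_b onto x_a (CutMix-style),
    producing a pseudo-sample that simulates a failure case."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    H, W = x_a.shape[-2:]
    cut_h, cut_w = int(H * (1 - lam) ** 0.5), int(W * (1 - lam) ** 0.5)
    cy, cx = torch.randint(H, (1,)).item(), torch.randint(W, (1,)).item()
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, H)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, W)
    x_mixed = x_a.clone()
    x_mixed[..., y1:y2, x1:x2] = x_b[..., y1:y2, x1:x2]
    return x_mixed

def fake_hypernetwork_masks(params, sparsity=0.01):
    """Hypothetical 'where-to-edit' masks: random top-k scores stand in
    for the meta-learned hypernetwork's output."""
    masks = []
    for p in params:
        scores = torch.rand_like(p)
        k = max(1, int(sparsity * p.numel()))
        thresh = scores.flatten().topk(k).values.min()
        masks.append((scores >= thresh).float())
    return masks

# One targeted edit: fine-tune only the masked parameters so that the
# CutMix pseudo-sample is mapped to the intended label.
x_clean = torch.randn(1, 3, 224, 224)
x_other = torch.randn(1, 3, 224, 224)
target = torch.tensor([42])                  # label the edit should enforce
x_edit = make_cutmix_sample(x_clean, x_other)

editable = [p for p in vit.parameters() if p.requires_grad]
masks = fake_hypernetwork_masks(editable)
opt = torch.optim.SGD(editable, lr=1e-3)

for _ in range(10):                          # a few gradient steps per edit
    opt.zero_grad()
    loss = F.cross_entropy(vit(x_edit), target)
    loss.backward()
    for p, m in zip(editable, masks):        # zero gradients outside the mask
        p.grad.mul_(m)
    opt.step()
```

In the actual method, the mask would come from the trained hypernetwork conditioned on the failure sample rather than from random scores; the key idea preserved here is that gradient descent is confined to the located parameters.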
The authors demonstrate the effectiveness of their method for editing ViTs on object recognition, achieving a better balance than existing techniques between generalizing each fix to similar failures and leaving unrelated predictions untouched. Their approach offers a promising way to adapt pre-trained ViTs to new data and correct specific prediction errors without compromising overall model performance.
This research contributes significantly to the field of model editing in computer vision, particularly for ViT architectures. The proposed method offers a practical and efficient solution for adapting pre-trained ViTs, potentially reducing the need for costly retraining and enabling wider adoption of these powerful models in real-world applications.
While the CutMix-based pseudo-sample generation proves effective, further investigation into optimal synthetic data generation techniques for model editing is warranted. Additionally, extending the method to other vision architectures and exploring its application in batch-editing scenarios are promising avenues for future research.