Core Concepts
This work proposes a novel evaluation metric, the Salience-guided Faithfulness Coefficient (SaCo), which assesses how faithfully post-hoc explanation methods for Vision Transformers align their assigned salience scores with the actual influence of input pixels on the model's predictions.
Abstract
The authors address the problem of evaluating the faithfulness of post-hoc explanations for Vision Transformer models. They argue that existing evaluation metrics overlook the core assumption of faithfulness: that the magnitude of a salience score should reflect the magnitude of its anticipated impact on the model's predictions.
To address this gap, the authors propose a novel evaluation metric called Salience-guided Faithfulness Coefficient (SaCo). SaCo operates by conducting pairwise comparisons among distinct pixel groups based on their salience scores and quantifying the differences in their actual impacts on the model's confidence. This allows SaCo to directly evaluate the alignment between the salience scores and the model's behavior.
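The pairwise-comparison idea described above can be sketched in a few lines. The following is a minimal, hypothetical illustration rather than the authors' implementation: it assumes pixels have already been partitioned into groups, each with an aggregate salience score and a measured impact on the model's confidence (e.g., the confidence drop when that group is perturbed). For every pair of groups, if the higher-salience group also has the larger actual impact, the pair counts toward faithfulness; otherwise it counts against it, weighted by the size of the impact gap.

```python
def saco_sketch(group_saliences, group_impacts):
    """Toy sketch of a SaCo-style faithfulness score.

    group_saliences: aggregate salience score per pixel group.
    group_impacts: measured impact of each group on the model's
        confidence (e.g., confidence drop when the group is perturbed).
    Returns a value in [-1, 1]: +1 means salience ordering perfectly
    matches impact ordering, -1 means it is perfectly reversed.
    """
    # Order the measured impacts by descending salience.
    impacts = [imp for _, imp in
               sorted(zip(group_saliences, group_impacts),
                      key=lambda pair: -pair[0])]
    score, total_weight = 0.0, 0.0
    for i in range(len(impacts)):
        for j in range(i + 1, len(impacts)):
            # Group i was assigned higher salience than group j.
            weight = abs(impacts[i] - impacts[j])
            # Reward the pair if its actual impact is also at least
            # as large; penalize it otherwise.
            score += weight if impacts[i] >= impacts[j] else -weight
            total_weight += weight
    return score / total_weight if total_weight > 0 else 0.0
```

For example, `saco_sketch([0.9, 0.5, 0.1], [0.6, 0.3, 0.1])` yields 1.0 (salience ordering matches impact ordering), while reversing the impacts yields -1.0. The weighting by impact gap means that violations between groups with very different real influence are penalized more heavily than near-ties.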
The authors conduct extensive experiments across various datasets and Vision Transformer models. The results demonstrate that SaCo can effectively differentiate meaningful explanations from Random Attribution, which existing metrics struggle to do. Furthermore, the authors provide insights into the key factors that affect the faithfulness of attention-based explanation methods, highlighting the importance of incorporating gradient information and cross-layer aggregation.
Overall, the proposed SaCo offers a comprehensive and robust evaluation of faithfulness, a property that is crucial for interpreting the reasoning process of Vision Transformer models.