Vulnerability of Attribution Methods Using Pre-Softmax Scores: Altering Heatmaps Without Changing Model Outputs
Gradient-based attribution methods that compute their heatmaps from pre-softmax scores (logits) are vulnerable to adversarial attacks that alter those heatmaps while leaving the model's final outputs unchanged.
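One way to see why such an attack is possible is that softmax is invariant to adding the same value to every logit, whereas the gradient of an individual logit is not. The following is a minimal NumPy sketch of that property on a toy linear classifier. It is an illustration under stated assumptions, not the attack construction itself: the weights `W`, the direction `v`, the input `x`, and all function names are hypothetical.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; this does not change the result.
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))  # hypothetical linear classifier: 3 classes, 5 features
v = rng.normal(size=5)       # hypothetical direction injected into the heatmap
x = rng.normal(size=5)       # hypothetical input

def logits_original(x):
    return W @ x

def logits_altered(x):
    # Add the same input-dependent scalar (v . x) to every logit.
    return W @ x + (v @ x) * np.ones(W.shape[0])

def numeric_grad(logits_fn, x, c, eps=1e-6):
    # Central finite differences of the pre-softmax score z_c with respect to x.
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        g[i] = (logits_fn(x + step)[c] - logits_fn(x - step)[c]) / (2 * eps)
    return g

c = 0  # class whose pre-softmax score is being explained

probs_original = softmax(logits_original(x))
probs_altered = softmax(logits_altered(x))

grad_original = numeric_grad(logits_original, x, c)  # approximately W[c]
grad_altered = numeric_grad(logits_altered, x, c)    # approximately W[c] + v

same_outputs = np.allclose(probs_original, probs_altered)
heatmap_changed = not np.allclose(grad_original, grad_altered, atol=1e-5)

print("post-softmax outputs identical:", same_outputs)
print("pre-softmax gradient changed:", heatmap_changed)
```

In this toy setting the raw input gradient of the logit stands in for the heatmap. The common offset cancels inside softmax, so the class probabilities are the same for both versions, while the gradient of the chosen logit shifts by `v`.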