Mummani, N., Ketha, S., & Ramaswamy, V. (2024). Peter Parker or Spiderman? Disambiguating Multiple Class Labels. In 38th Conference on Neural Information Processing Systems (NeurIPS 2024). ATTRIB Workshop. arXiv:2410.19479v1 [cs.CV] 25 Oct 2024.
This paper addresses the challenge of interpreting multiple class label predictions in deep learning image recognition models, specifically aiming to determine whether a pair of predicted labels represents distinct entities within an image or multiple guesses about a single entity.
The authors propose a framework based on counterfactual proofs, utilizing modern segmentation and input attribution techniques. They employ integrated gradients for pixel-wise attribution and the Segment Anything Model (SAM) for image segmentation. By analyzing segment-wise attribution scores, they define and identify two types of label predictions: δ-disjoint (distinct entities) and δ-overlapping (single entity). They propose algorithms to generate redacted images as counterfactual proofs, demonstrating the impact of removing specific segments on label predictions.
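The segment-wise analysis described above can be illustrated with a minimal sketch. It assumes the pixel-wise attribution maps (for example, from integrated gradients) and the segment label map (for example, from SAM) have already been computed and are passed in as NumPy arrays. The function names, the coverage rule used to pick the "important" segments, and the way `delta` is applied are all hypothetical choices made for illustration; the summary does not state the paper's formal definitions of δ-disjoint and δ-overlapping, so this should not be read as the authors' algorithm.

```python
import numpy as np

def segment_scores(attribution, segments):
    """Sum pixel-wise attribution inside each segment -> {segment_id: score}."""
    return {int(s): float(attribution[segments == s].sum())
            for s in np.unique(segments)}

def top_segments(scores, delta):
    """Hypothetical rule: highest-scoring segments that together cover at
    least a (1 - delta) share of the total positive attribution."""
    positive = {s: v for s, v in scores.items() if v > 0}
    total = sum(positive.values())
    chosen, covered = set(), 0.0
    for s, v in sorted(positive.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= (1 - delta) * total:
            break
        chosen.add(s)
        covered += v
    return chosen

def classify_pair(attr_a, attr_b, segments, delta=0.1):
    """Label a pair of predictions 'disjoint' if their important segment
    sets share no segment, else 'overlapping' (illustrative criterion only)."""
    a = top_segments(segment_scores(attr_a, segments), delta)
    b = top_segments(segment_scores(attr_b, segments), delta)
    kind = "disjoint" if a.isdisjoint(b) else "overlapping"
    return kind, a, b

def redact(image, segments, segment_ids, fill=0.0):
    """Return a copy of the image with the given segments blanked out,
    i.e. a candidate counterfactual input to feed back to the classifier."""
    out = image.copy()
    out[np.isin(segments, list(segment_ids))] = fill
    return out
```

In this sketch, a redacted image produced by `redact` would be passed back through the classifier to check whether removing one label's segments lowers that label's score while leaving the other label's score largely intact. That verification step depends on the model and is omitted here.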
The proposed method distinguishes δ-disjoint from δ-overlapping label predictions and backs each determination with a verifiable counterfactual proof in the form of a redacted image. The authors demonstrate the approach on three image classification models (VGG-16, Inception-v3, ResNet-50) using the ImageNet dataset.
The research presents a novel framework for disambiguating multiple class label predictions in deep learning image recognition, enhancing the interpretability and reliability of model predictions. The use of counterfactual proofs offers a verifiable and objective method for analyzing input attributions.
This work contributes to the growing field of interpretable AI by providing a practical approach to understanding and verifying multiple label predictions in image recognition, which has implications for various applications requiring reliable model interpretations.
The study acknowledges limitations related to the performance of existing attribution and segmentation algorithms. Future research could explore alternative attribution methods and address challenges posed by images with absent objects or labels with very small softmax values. Further investigation into the generalizability of the framework to other domains beyond image recognition is also suggested.