The study introduces PRIME, a novel approach that prioritizes interpretability in identifying failure modes in image classification models. Existing methods may struggle to provide coherent descriptions due to reliance on clustering in the feature space. PRIME focuses on obtaining human-understandable tags for images and analyzing model behavior based on tag combinations. By ensuring minimal and non-redundant tag sets, PRIME successfully identifies failure modes and generates high-quality text descriptions. The method's effectiveness is demonstrated through experiments on various datasets, highlighting the importance of interpretability in understanding model failures.
他の言語に翻訳
原文コンテンツから
arxiv.org
深掘り質問