The paper presents Representative Concept Extraction (RCE), a novel self-explaining neural architecture that aims to address two key limitations of existing concept learning approaches: lack of concept fidelity and limited concept interoperability across domains.
The key components of the proposed framework, illustrated by a code sketch after this list, are:
Salient Concept Selection Network: This network selects the most representative concepts that are responsible for the model's predictions.
Self-Supervised Contrastive Concept Learning (CCL): This component utilizes self-supervised contrastive learning to learn domain-invariant concepts, improving concept interoperability.
Prototype-based Concept Grounding (PCG): This regularizer ensures that the learned concepts are aligned across domains, mitigating the problem of concept shift.
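As a rough illustration of how these pieces could fit together, the sketch below combines a salience-weighted concept bottleneck, a contrastive loss over two augmented views of an input, and a prototype-grounding regularizer. This is a minimal, hypothetical reconstruction from the description above, not the authors' code: all module names, loss forms, and hyper-parameters (RCESketch, contrastive_concept_loss, temperature=0.1, the 0.1 regularizer weight) are assumptions.

```python
# Illustrative sketch only: names, loss forms, and hyper-parameters are hypothetical,
# not the paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RCESketch(nn.Module):
    """Toy self-explaining model: encoder -> concept scores -> salience-weighted prediction."""
    def __init__(self, in_dim=784, n_concepts=16, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.concepts = nn.Linear(128, n_concepts)          # concept activations
        self.salience = nn.Linear(128, n_concepts)          # salient concept selection scores
        self.classifier = nn.Linear(n_concepts, n_classes)  # prediction from selected concepts

    def forward(self, x):
        h = self.encoder(x)
        c = self.concepts(h)                        # concept representation
        s = torch.sigmoid(self.salience(h))         # soft selection of representative concepts
        logits = self.classifier(c * s)             # prediction driven by selected concepts
        return logits, c, s

def contrastive_concept_loss(c1, c2, temperature=0.1):
    """NT-Xent-style loss pulling concept vectors of two augmented views together (CCL idea)."""
    z1, z2 = F.normalize(c1, dim=1), F.normalize(c2, dim=1)
    z = torch.cat([z1, z2], dim=0)                  # 2N x d
    sim = z @ z.t() / temperature
    n = c1.size(0)
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float('-inf'))  # drop self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])         # positive-pair indices
    return F.cross_entropy(sim, targets)

def prototype_grounding_loss(c, prototypes):
    """PCG idea: pull each concept vector toward its nearest stored prototype."""
    d = torch.cdist(c, prototypes)                  # distances to all prototypes
    return d.min(dim=1).values.mean()

# Tiny usage example with random stand-in data
model = RCESketch()
x1, x2 = torch.randn(8, 784), torch.randn(8, 784)   # two "augmented views" (random stand-ins)
y = torch.randint(0, 10, (8,))
prototypes = torch.randn(32, 16)                     # hypothetical concept prototypes

logits, c1, _ = model(x1)
_, c2, _ = model(x2)
loss = (F.cross_entropy(logits, y)
        + contrastive_concept_loss(c1, c2)
        + 0.1 * prototype_grounding_loss(c1, prototypes))
loss.backward()
```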
The authors evaluate the proposed approach on four real-world datasets spanning various domains, including digits, objects, and vehicles. The results demonstrate that the RCE framework with CCL and PCG components outperforms existing self-explaining approaches in terms of both concept fidelity and concept interoperability, as measured by domain adaptation performance.
The qualitative analysis further shows that the proposed method learns domain-aligned concepts and can effectively explain predictions using the most relevant prototypes from the training set.
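To make the prototype-based explanation concrete, here is a minimal, hypothetical retrieval routine: it assumes the model exposes a concept vector per sample and a bank of stored training prototypes, and simply returns the k prototypes nearest to the sample in concept space. Function and variable names are illustrative, not taken from the paper.

```python
# Hypothetical sketch of prototype-based explanation; names and the distance metric are assumptions.
import torch

def explain_with_prototypes(concept_vec, prototype_bank, prototype_images, k=3):
    """Return the k training prototypes nearest to the sample in concept space."""
    d = torch.cdist(concept_vec.unsqueeze(0), prototype_bank).squeeze(0)  # distance to each prototype
    nearest = torch.topk(-d, k).indices                                   # indices of smallest distances
    return [(int(i), prototype_images[int(i)]) for i in nearest]

# Toy usage with random stand-ins for learned concepts and stored prototypes
concept_vec = torch.randn(16)
prototype_bank = torch.randn(32, 16)
prototype_images = [f"train_img_{i}.png" for i in range(32)]
print([idx for idx, _ in explain_with_prototypes(concept_vec, prototype_bank, prototype_images)])
```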
Source: Sanchit Sinha et al., arxiv.org, 05-02-2024, https://arxiv.org/pdf/2405.00349.pdf