The paper presents a novel self-explaining neural architecture, called Representative Concept Extraction (RCE), that aims to address two key limitations of existing concept learning approaches: lack of concept fidelity (concepts are not learned consistently across similar classes) and limited concept interoperability (learned concepts do not generalize well to new domains).
The key components of the proposed framework are:
Salient Concept Selection Network: This network selects the most representative concepts that are responsible for the model's predictions.
Self-Supervised Contrastive Concept Learning (CCL): This component utilizes self-supervised contrastive learning to learn domain-invariant concepts, improving concept interoperability.
Prototype-based Concept Grounding (PCG): This regularizer keeps the learned concepts aligned across domains, mitigating the problem of concept shift (a rough code sketch of all three components follows this list).
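To make the interplay of these components concrete, below is a minimal, hypothetical PyTorch-style sketch of an RCE-like model with a contrastive concept loss (CCL) and a prototype-grounding regularizer (PCG). All module names, dimensions, and hyperparameters here are illustrative assumptions made for this summary, not the authors' implementation.

```python
# Illustrative sketch only: module names, dimensions, and loss weights are
# assumptions made for this summary, not the authors' reference code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RCESketch(nn.Module):
    """Backbone -> concept activations -> salient-concept selection -> prediction."""
    def __init__(self, backbone, feat_dim=512, n_concepts=10, n_classes=10):
        super().__init__()
        self.backbone = backbone                                 # shared feature extractor
        self.concepts = nn.Linear(feat_dim, n_concepts)          # concept activation scores
        self.saliency = nn.Linear(feat_dim, n_concepts)          # salient concept selection
        self.classifier = nn.Linear(n_concepts, n_classes)

    def forward(self, x):
        z = self.backbone(x)
        c = torch.sigmoid(self.concepts(z))                      # concept activations in [0, 1]
        s = torch.softmax(self.saliency(z), dim=-1)              # relevance weight per concept
        selected = c * s                                         # keep only salient concepts
        return self.classifier(selected), selected

def contrastive_concept_loss(c1, c2, temperature=0.1):
    """CCL: concept vectors from two augmentations of the same image are pulled
    together; other images in the batch act as negatives (NT-Xent-style)."""
    c1, c2 = F.normalize(c1, dim=-1), F.normalize(c2, dim=-1)
    logits = c1 @ c2.t() / temperature                           # (batch, batch) similarities
    targets = torch.arange(c1.size(0), device=c1.device)         # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

def prototype_grounding_loss(concepts, prototypes):
    """PCG: pull each concept vector toward its nearest training-set prototype,
    keeping learned concepts grounded and aligned across domains."""
    dists = torch.cdist(concepts, prototypes)                    # (batch, n_prototypes)
    return dists.min(dim=-1).values.mean()
```

During training, the task loss would be combined with the two regularizers, e.g. `loss = ce_loss + lambda_ccl * ccl + lambda_pcg * pcg`, where the weights (hypothetical here) are tuned on held-out data and the prototype bank is built from concept vectors of training examples.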
The authors evaluate the proposed approach on four real-world datasets spanning various domains, including digits, objects, and vehicles. The results demonstrate that the RCE framework with CCL and PCG components outperforms existing self-explaining approaches in terms of both concept fidelity and concept interoperability, as measured by domain adaptation performance.
The qualitative analysis further shows that the proposed method learns domain-aligned concepts and can effectively explain predictions using the most relevant prototypes from the training set.
Key insights extracted from: Sanchit Sinh... (arxiv.org, 05-02-2024), https://arxiv.org/pdf/2405.00349.pdf