The paper presents a novel self-explaining neural architecture, called Representative Concept Extraction (RCE), that aims to address two key limitations of existing concept learning approaches: lack of concept fidelity and limited concept interoperability.
The key components of the proposed framework are:
Salient Concept Selection Network: This network selects the most representative concepts that are responsible for the model's predictions.
Self-Supervised Contrastive Concept Learning (CCL): This component utilizes self-supervised contrastive learning to learn domain-invariant concepts, improving concept interoperability.
Prototype-based Concept Grounding (PCG): This regularizer ensures that the learned concepts stay aligned across domains, mitigating the problem of concept shift (a rough sketch of the CCL and PCG losses follows this list).
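The paper frames these components as training objectives. Below is a minimal PyTorch sketch, under stated assumptions, of how a contrastive concept loss (CCL) and a prototype-grounding regularizer (PCG) could be combined with the task loss; the NT-Xent-style formulation, the prototype-distance form, and the lambda weights are illustrative choices, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_concept_loss(concepts_a, concepts_b, temperature=0.5):
    """NT-Xent-style loss over concept vectors from two augmented views.

    concepts_a, concepts_b: (batch, n_concepts) concept activations for two
    views of the same inputs; matching rows are positive pairs.
    (Assumed formulation; the paper's exact loss may differ.)
    """
    z = F.normalize(torch.cat([concepts_a, concepts_b], dim=0), dim=1)
    sim = z @ z.t() / temperature                    # (2B, 2B) cosine similarities
    batch = concepts_a.size(0)
    # Mask out self-similarity so it never counts as a positive or a negative.
    mask = torch.eye(2 * batch, dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(mask, float("-inf"))
    # Row i's positive is row i + batch (and vice versa).
    targets = torch.cat([torch.arange(batch) + batch, torch.arange(batch)])
    return F.cross_entropy(sim, targets.to(sim.device))

def prototype_grounding_loss(concepts, prototypes):
    """Pull each concept vector toward its nearest stored training prototype.

    prototypes: (n_prototypes, n_concepts) concept vectors of training
    examples (an assumed representation of the PCG regularizer).
    """
    dists = torch.cdist(concepts, prototypes)        # (batch, n_prototypes)
    return dists.min(dim=1).values.mean()

# Combined objective (lambda_ccl / lambda_pcg are assumed hyperparameters):
# loss = task_loss + lambda_ccl * contrastive_concept_loss(c_a, c_b) \
#        + lambda_pcg * prototype_grounding_loss(c_a, prototypes)
```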
The authors evaluate the proposed approach on four real-world datasets spanning various domains, including digits, objects, and vehicles. The results demonstrate that the RCE framework with CCL and PCG components outperforms existing self-explaining approaches in terms of both concept fidelity and concept interoperability, as measured by domain adaptation performance.
The qualitative analysis further shows that the proposed method learns domain-aligned concepts and can effectively explain predictions using the most relevant prototypes from the training set.
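As a rough illustration of how such prototype-based explanations can be produced, the sketch below retrieves the stored training prototypes closest to a test sample in concept space. The helper name, the cosine-similarity ranking, and the toy data are hypothetical and not taken from the paper.

```python
import torch

def explain_with_prototypes(concept_vector, prototype_bank, prototype_labels, k=3):
    """Return the k training prototypes most similar to a sample's concept vector.

    concept_vector: (n_concepts,) concept activations for one test input.
    prototype_bank: (n_prototypes, n_concepts) stored training concept vectors.
    prototype_labels: identifiers of the corresponding training examples.
    (Hypothetical helper illustrating nearest-prototype explanations.)
    """
    sims = torch.nn.functional.cosine_similarity(
        concept_vector.unsqueeze(0), prototype_bank, dim=1
    )
    top = torch.topk(sims, k)
    return [(prototype_labels[i], sims[i].item()) for i in top.indices.tolist()]

# Example: explain a prediction by its 3 closest training prototypes (toy data).
bank = torch.randn(100, 16)                 # 100 stored prototypes, 16 concepts
labels = [f"train_img_{i}" for i in range(100)]
print(explain_with_prototypes(torch.randn(16), bank, labels, k=3))
```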
Key ideas extracted from the source content by Sanchit Sinh... at arxiv.org, 05-02-2024: https://arxiv.org/pdf/2405.00349.pdf