The paper presents Representative Concept Extraction (RCE), a novel self-explaining neural architecture that aims to address two key limitations of existing concept learning approaches: lack of concept fidelity and limited concept interoperability.
The key components of the proposed framework are:
Salient Concept Selection Network: This network selects the most representative concepts that are responsible for the model's predictions.
Self-Supervised Contrastive Concept Learning (CCL): This component utilizes self-supervised contrastive learning to learn domain-invariant concepts, improving concept interoperability.
Prototype-based Concept Grounding (PCG): This regularizer keeps the learned concepts aligned across domains, mitigating the problem of concept shift (a rough sketch of how the CCL and PCG terms might enter the training objective follows this list).
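To make the two regularizers concrete, a minimal sketch follows, assuming a PyTorch backbone that maps each input (and an augmented view of it) to a vector of concept activations. The function names, the InfoNCE-style objective for CCL, the nearest-prototype mean-squared pull for PCG, and the loss weights are illustrative assumptions rather than the paper's exact formulation.

```python
# Illustrative sketch only: not the authors' implementation.
import torch
import torch.nn.functional as F


def contrastive_concept_loss(concepts_a, concepts_b, temperature=0.1):
    """InfoNCE-style loss between concept vectors of two views of the same batch.

    concepts_a, concepts_b: (batch, num_concepts); row i of each tensor comes
    from the same underlying input, so matching rows are positive pairs.
    """
    za = F.normalize(concepts_a, dim=1)
    zb = F.normalize(concepts_b, dim=1)
    logits = za @ zb.t() / temperature              # (batch, batch) cosine similarities
    targets = torch.arange(za.size(0), device=za.device)
    # Symmetric cross-entropy: each view has to pick out its counterpart.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


def prototype_grounding_loss(concepts, prototypes):
    """Pull each concept vector toward its nearest stored prototype.

    prototypes: (num_prototypes, num_concepts), e.g. running concept averages
    kept over the training set (how they are built is an assumption here).
    """
    c = F.normalize(concepts, dim=1)
    p = F.normalize(prototypes, dim=1)
    nearest = (c @ p.t()).argmax(dim=1)             # index of the closest prototype per sample
    return F.mse_loss(c, p[nearest])


def total_loss(task_loss, concepts_a, concepts_b, prototypes,
               lambda_ccl=1.0, lambda_pcg=0.1):
    # Overall objective: prediction loss plus the CCL and PCG regularizers.
    return (task_loss
            + lambda_ccl * contrastive_concept_loss(concepts_a, concepts_b)
            + lambda_pcg * prototype_grounding_loss(concepts_a, prototypes))
```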
The authors evaluate the proposed approach on four real-world datasets spanning various domains, including digits, objects, and vehicles. The results demonstrate that the RCE framework with CCL and PCG components outperforms existing self-explaining approaches in terms of both concept fidelity and concept interoperability, as measured by domain adaptation performance.
The qualitative analysis further shows that the proposed method learns domain-aligned concepts and can effectively explain predictions using the most relevant prototypes from the training set.
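To illustrate what a prototype-based explanation of a single prediction could look like in this setting, here is a small, assumed retrieval sketch in the same PyTorch style as above: it simply ranks stored training prototypes by cosine similarity to the test sample's concept vector. The prototype bank, the helper name, and the choice of k are hypothetical, not taken from the paper.

```python
# Assumed explanation workflow, not the paper's code: rank training prototypes
# by similarity to a test sample's concept activations and return the top-k.
import torch
import torch.nn.functional as F


def explain_with_prototypes(test_concepts, prototype_bank, prototype_examples, k=3):
    """test_concepts:      (num_concepts,) concept activations for one test input.
    prototype_bank:        (num_prototypes, num_concepts) stored concept vectors.
    prototype_examples:    the training samples each prototype was built from.
    Returns the k most similar prototypes with their cosine similarities.
    """
    q = F.normalize(test_concepts.unsqueeze(0), dim=1)
    p = F.normalize(prototype_bank, dim=1)
    sims = (q @ p.t()).squeeze(0)                   # similarity of the test input to every prototype
    top = torch.topk(sims, k)
    return [(prototype_examples[i], top.values[j].item())
            for j, i in enumerate(top.indices.tolist())]
```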