This paper proposes a knowledge-enhanced visual-language pretraining (KEP) approach for computational pathology. The key contributions are:
1. Construction of a comprehensive Pathology Knowledge Tree (PathKT) that integrates 50,470 informative attributes covering 4,718 diseases across 32 human tissues.
2. Development of a knowledge encoder that projects the structured pathology knowledge into a latent embedding space in which attributes of the same disease are pulled close together (see the sketch after this list).
3. Incorporation of the pretrained knowledge encoder to guide visual-language pretraining, so that pathology knowledge is continuously injected into the image-text embedding space.
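The sketch below shows one minimal, PyTorch-style way such a two-stage scheme could be wired together: a symmetric InfoNCE loss aligns attribute embeddings of the same disease for the knowledge encoder, and during visual-language pretraining an extra term pulls the text embedding toward the knowledge encoder's output. The `info_nce` helper, the tensor shapes, the assumption that the knowledge encoder is frozen, and the exact form of the guidance term are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature            # (B, B) similarity matrix
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2

# --- Stage 1: knowledge encoder pretraining (placeholder embeddings) ---
# Two attribute descriptions of the same disease form a positive pair;
# the other diseases in the batch serve as negatives.
attr_a = torch.randn(8, 512)   # embeddings of one attribute per disease
attr_b = torch.randn(8, 512)   # embeddings of another attribute of the same diseases
knowledge_loss = info_nce(attr_a, attr_b)

# --- Stage 2: knowledge-guided visual-language pretraining ---
# A CLIP-style image-text alignment loss, plus a guidance term that pulls the
# text embedding toward the (assumed frozen) knowledge encoder's embedding.
img_emb  = torch.randn(8, 512)                 # from the image encoder
txt_emb  = torch.randn(8, 512)                 # from the trainable text encoder
know_emb = torch.randn(8, 512)                 # from the frozen knowledge encoder
clip_loss     = info_nce(img_emb, txt_emb)
guidance_loss = info_nce(txt_emb, know_emb)    # knowledge-injection term (illustrative)
total_loss = clip_loss + guidance_loss
```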
The authors conduct thorough experiments on three downstream tasks: cross-modal retrieval, zero-shot patch classification, and zero-shot whole slide image (WSI) tumor subtyping. The results demonstrate that knowledge-enhanced pretraining significantly improves performance across all three tasks compared to existing purely data-driven visual-language pretraining approaches.
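To make the zero-shot evaluation concrete, the sketch below shows a CLIP-style zero-shot patch classifier and a simple slide-level aggregation on top of it. The prompt wording, the top-k aggregation rule, and all placeholder tensors are assumptions for illustration, not the paper's exact evaluation protocol.

```python
import torch
import torch.nn.functional as F

# Placeholder embeddings standing in for the outputs of the pretrained
# image and text encoders, both projected into a shared 512-d space.
patch_emb  = F.normalize(torch.randn(100, 512), dim=-1)  # 100 patches from one slide
prompt_emb = F.normalize(torch.randn(3, 512), dim=-1)    # one text prompt per tumor subtype

# Zero-shot patch classification: each patch is assigned the subtype whose
# text prompt has the highest cosine similarity with the patch embedding.
patch_scores = patch_emb @ prompt_emb.t()                 # (100, 3)
patch_pred   = patch_scores.argmax(dim=-1)

# Zero-shot WSI tumor subtyping: aggregate patch-level scores into a
# slide-level prediction, e.g. by averaging the top-k most confident
# patches per class (an illustrative rule, not necessarily the paper's).
k = 10
topk_scores = patch_scores.topk(k, dim=0).values          # (k, 3)
slide_pred  = topk_scores.mean(dim=0).argmax()
```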