This paper proposes a knowledge-enhanced visual-language pretraining (KEP) approach for computational pathology. The key contributions are:
Construction of a comprehensive Pathology Knowledge Tree (PathKT) that integrates 50,470 informative attributes of 4,718 diseases from 32 human tissues.
Development of a knowledge encoder that projects the structured pathology knowledge into a latent embedding space, where the attributes of the same disease are closely aligned.
Incorporation of the pretrained knowledge encoder to guide the visual-language pretraining, where the pathology knowledge is continuously injected into the image-text embedding space.
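The paper summary does not include the authors' implementation, but the alignment objective in the second contribution — pulling attribute embeddings of the same disease together in the latent space — can be illustrated with a supervised contrastive loss over attribute embeddings. This is a minimal NumPy sketch under that assumption; the function names (`disease_alignment_loss`) and the specific loss formulation are illustrative, not the paper's exact recipe:

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def disease_alignment_loss(attr_emb, disease_ids, temperature=0.07):
    # Supervised contrastive loss: attribute embeddings that share a
    # disease id are positives; all other attributes are negatives.
    z = l2_normalize(np.asarray(attr_emb, dtype=float))
    ids = np.asarray(disease_ids)
    sim = z @ z.T / temperature
    n = len(ids)
    self_mask = np.eye(n, dtype=bool)
    pos_mask = (ids[:, None] == ids[None, :]) & ~self_mask

    logits = sim - sim.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    exp[self_mask] = 0.0                            # exclude self-pairs
    log_prob = logits - np.log(exp.sum(axis=1, keepdims=True))

    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0                          # anchors with >=1 positive
    per_anchor = -(log_prob * pos_mask).sum(axis=1)[valid] / pos_counts[valid]
    return per_anchor.mean()
```

Minimizing this loss drives attributes of the same disease toward the same region of the embedding space, which is the property the knowledge encoder is trained to have before it guides the visual-language pretraining.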
The authors conduct thorough experiments on three downstream tasks: retrieval, zero-shot patch classification, and zero-shot whole-slide image tumor subtyping. The results demonstrate that knowledge-enhanced pretraining significantly improves performance across these tasks compared to existing data-driven visual-language pretraining approaches.
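For the zero-shot evaluations, a CLIP-style prompt-matching pipeline is the standard approach: embed one or more text prompts per class, average them into a class prototype, and assign the class whose prototype is most similar to the image embedding. The sketch below assumes such a pipeline; `zero_shot_classify`, the prompt-averaging step, and the toy embeddings are illustrative assumptions rather than the paper's exact protocol:

```python
import numpy as np

def zero_shot_classify(image_emb, prompts_per_class):
    # Score an image embedding against an averaged "prototype" of the
    # prompt embeddings for each candidate class; highest cosine wins.
    img = image_emb / np.linalg.norm(image_emb)
    scores = {}
    for cls, embs in prompts_per_class.items():
        p = embs / np.linalg.norm(embs, axis=1, keepdims=True)
        proto = p.mean(axis=0)
        proto = proto / np.linalg.norm(proto)
        scores[cls] = float(img @ proto)
    return max(scores, key=scores.get), scores
```

For whole-slide tumor subtyping, the same per-patch scores are typically aggregated (e.g., averaged or top-k pooled) over all patches of a slide to produce a slide-level prediction.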
Key insights distilled from https://arxiv.org/pdf/2404.09942.pdf by Xiao Zhou, Xi... at arxiv.org, 04-16-2024.