Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction
Statistics
TCGA Lung Cancer dataset: 541 slides from 478 LUAD cases and 512 slides from 478 LUSC cases.
Camelyon16 dataset: 399 H&E-stained slide images.
FiVE zero-shot performance: Top-1 accuracy 65.23% and Top-5 accuracy 95.18%.
FiVE few-shot classification performance: 12.90% improvement in the one-shot experiment.
FiVE comparison with existing works: Outperforms all baselines by a significant margin.
Quotes
"Recently, Vision-Language Models (VLMs) have demonstrated remarkable performance in WSI classification."
"Our method demonstrates robust generalizability and strong transferability, dominantly outperforming the counterparts on the TCGA Lung Cancer dataset."
"We pioneer the utilization of the available WSI diagnostic reports with fine-grained guidance."