Core Concept
The study explores weakly supervised multiple instance learning for predicting cancer phenotypes and identifying the associated cellular morphologies in whole slide images at several magnification levels.
Summary
The paper explores the use of multiple instance learning (MIL) approaches, specifically Attention MIL (AMIL) and Additive MIL (AdMIL), for two key tasks in digital pathology: tumor detection and gene mutation identification.
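To make the attention pooling idea concrete, here is a minimal pure-Python sketch of an attention-based MIL pooling step of the kind commonly associated with AMIL. Each tile embedding h receives a score w . tanh(V h), the scores are softmax-normalised across the bag, and the bag embedding is the attention-weighted sum of the tile embeddings. The function names, the toy matrix V, the vector w, and the tile embeddings are illustrative assumptions, not the paper's implementation or learned parameters.

```python
import math


def attention_weights(instances, V, w):
    """Return one softmax-normalised attention weight per instance.

    Score for instance h is w . tanh(V h); V and w are assumed toy parameters.
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    # Subtract the maximum score for a numerically stable softmax.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(instances, V, w):
    """Pool a bag of instance embeddings into a single bag embedding."""
    weights = attention_weights(instances, V, w)
    dim = len(instances[0])
    bag = [sum(a * h[j] for a, h in zip(weights, instances)) for j in range(dim)]
    return bag, weights


# Toy bag: three tiles with 2-D embeddings (made-up values).
tiles = [[0.1, 0.0], [2.0, 1.5], [0.0, 0.2]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0]
bag, weights = attention_mil_pool(tiles, V, w)
print(weights)  # per-tile attention, usable as a heatmap value per tile
```

The per-tile weights are what would be mapped back onto slide coordinates to draw a heatmap; in a trained model, V and w would be learned jointly with the slide-level classifier.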
For the tumor detection task at 5x magnification, the models achieved high performance, with AMIL obtaining the best AUC of 0.971. The heatmaps generated by the models highlighted relevant regions of the whole slide images, indicating their ability to identify tumor areas.
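For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen positive slide receives a higher score than a randomly chosen negative slide. The sketch below computes it by direct pairwise comparison; the function name and the example labels and scores are made up for illustration and are not data from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative slide")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical slide-level labels (1 = tumor) and model scores.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(example_labels, example_scores))
```

The quadratic loop is fine for a few hundred slides; a rank-based formulation gives the same value more efficiently on larger sets.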
For the gene mutation detection task, the performance was more varied across different magnification levels. At 5x magnification, the models struggled to identify meaningful patterns, likely due to the lack of cellular-level details at this scale. However, at 10x and 20x magnifications, the models, especially AMIL, showed better performance, with AUCs of 0.711 and 0.704 respectively. The heatmaps generated at these higher magnifications provided more insights into the morphological features associated with the TP53 gene mutation.
The authors also explored a modified version of AdMIL, which used the attention mechanism from AMIL. This model performed better than the original AdMIL at the 20x magnification level for the gene mutation task, suggesting that the attention mechanism plays a crucial role in identifying relevant regions of interest.
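The additive idea can be sketched as follows, under stated assumptions: instead of classifying one pooled bag embedding, each attention-weighted tile is scored on its own and the bag-level logit is the sum of those per-tile contributions, so every tile's share of the prediction is explicit. The attention weights are taken as an input here, which is one way to picture swapping in a different attention mechanism; the exact modification used by the authors is not described in this summary. The names `additive_mil_predict` and `toy_scorer`, and all numeric values, are hypothetical.

```python
def additive_mil_predict(instances, weights, instance_scorer):
    """Sum per-instance contributions into a bag-level logit.

    instances: list of embedding vectors, one per tile
    weights: one attention weight per tile (from any attention mechanism)
    instance_scorer: callable mapping a weighted embedding to a scalar logit
    """
    contributions = [
        instance_scorer([a * x for x in h]) for a, h in zip(weights, instances)
    ]
    return sum(contributions), contributions


def toy_scorer(v):
    """Assumed stand-in for a learned per-instance classifier (ReLU + linear)."""
    return 1.5 * max(0.0, v[0]) - 0.5 * max(0.0, v[1])


# Made-up tile embeddings and attention weights for illustration.
tiles = [[0.1, 0.0], [2.0, 1.5], [0.0, 0.2]]
attn = [0.2, 0.7, 0.1]
bag_logit, per_tile = additive_mil_predict(tiles, attn, toy_scorer)
print(bag_logit, per_tile)
```

Because the bag logit is exactly the sum of `per_tile`, a heatmap built from these values shows signed contributions: positive tiles push toward the positive class and negative tiles push away from it.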
The results highlight the importance of considering different magnification levels when analyzing whole slide images, as the models were able to capture distinct morphological features associated with tumor presence and gene mutations at different scales. The study demonstrates the potential of weakly supervised MIL approaches for efficient and interpretable analysis of digital pathology data.
Statistics
Whole slide images from The Cancer Genome Atlas (TCGA) were used for the two tasks:
Tumor detection task: 694 slides from the TCGA-LUSC (Lung Squamous Cell Carcinoma) dataset, with an equal number of positive and negative slides.
Gene mutation detection task: 662 slides from the TCGA-BRCA (Breast Invasive Carcinoma) dataset, with 331 positive and 331 negative slides for the TP53 gene mutation.
Quotes
"Whole Slide Images (WSI), obtained by high-resolution digital scanning of microscope slides at multiple scales, are the cornerstone of modern Digital Pathology."
"Deep learning has been particularly successful in medical imaging applications such as diagnosis, sub-type classification and prognosis."
"To provide some degree of interpretability, as well as better results, a variation of the attention mechanism can be used as a MIL pooling operator."