
Efficient Active Few-Shot Learning for Histopathology Image Classification with Limited Annotation Budget


Key Concepts
Myriad Active Learning (MAL) framework efficiently utilizes unlabelled histopathology data and a limited annotation budget to achieve high classification accuracy, outperforming prior active learning and few-shot learning methods.
Abstract
The paper proposes a novel active few-shot learning framework called Myriad Active Learning (MAL) to address the challenge of high annotation cost in digital histopathology. Key highlights:

- MAL incorporates self-supervised learning, pseudo-label generation, and active learning in a positive feedback loop to effectively leverage the abundant unlabelled data.
- MAL defines a new uncertainty-based sample selection strategy that combines existing methods and utilizes the entire uncertainty list to reduce sample redundancy.
- Extensive experiments on two public histopathology datasets show that MAL achieves superior test accuracy, macro F1-score, and label efficiency compared to prior active learning and few-shot learning methods.
- MAL can achieve test accuracy comparable to a fully supervised model while labelling only 5% of the dataset, demonstrating its potential for label-efficient learning in histopathology.
Statistics
- MAL can achieve 95.9% test accuracy on the NCT dataset using only 5% of the labelled data, compared to 96.16% for a fully supervised model.
- On the BreaKHis dataset, MAL improves the macro F1-score by 4.3%, 14.1%, and 21.5% at 1-shot, 5-shot, and 10-shot settings respectively, compared to prior active learning methods.
Quotes
"MAL substantially improves the test accuracy and macro F1 score of the model, and significantly outperforms FHIST [a recent few-shot learning benchmark] in all 10-shot cases." "MAL is able to achieve similar performance with only 5% annotation by always selecting the most informative samples to supplement model learning."

Key insights from

by Nico Schiavo... at arxiv.org 04-26-2024

https://arxiv.org/pdf/2310.16161.pdf
MyriadAL: Active Few Shot Learning for Histopathology

Further Questions

How can the pseudo-label generation in MAL be further improved to better leverage the unlabelled data?

To enhance the pseudo-label generation in MAL, several strategies can be considered.

First, ensemble methods could produce more robust and diverse pseudo-labels: aggregating predictions from multiple models trained on different subsets of the data, or with different architectures, captures a broader range of patterns and uncertainties in the unlabelled data.

Second, a confidence threshold for pseudo-label assignment can filter out uncertain predictions. By keeping only samples whose predicted-class probability exceeds the threshold, the risk of propagating noisy or incorrect labels is reduced.

Third, semi-supervised learning techniques such as consistency regularization can improve pseudo-label quality. Enforcing consistent predictions on augmented versions of the same unlabelled sample refines the pseudo-labels so that they better represent the underlying data distribution.
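As a concrete sketch of the confidence-threshold idea, the snippet below keeps only unlabelled samples whose top predicted-class probability clears a cutoff. The function name, the 0.95 threshold, and the toy probabilities are illustrative assumptions, not details from the paper:

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.95):
    """Keep only unlabelled samples whose top class probability
    exceeds `threshold`. Returns (kept indices, pseudo-labels)."""
    probs = np.asarray(probs)
    confidence = probs.max(axis=1)   # model confidence per sample
    labels = probs.argmax(axis=1)    # candidate pseudo-labels
    keep = confidence >= threshold   # filter out uncertain predictions
    return np.flatnonzero(keep), labels[keep]

# Toy example: three unlabelled samples, two confident enough to keep
probs = [[0.98, 0.01, 0.01],
         [0.40, 0.35, 0.25],
         [0.02, 0.97, 0.01]]
idx, pl = select_pseudo_labels(probs, threshold=0.95)  # keeps samples 0 and 2
```

Lowering the threshold admits more pseudo-labels at the cost of more label noise, so in practice the cutoff would be tuned per dataset.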

How can MAL be extended to handle other medical imaging modalities beyond histopathology, and what additional challenges would need to be addressed?

Extending MAL to other medical imaging modalities requires several adaptations.

First, the SSL pretraining phase should be tailored to the characteristics of the new modality: different modalities may require different data augmentations, loss functions, or network architectures to learn meaningful representations.

Second, the active learning query strategy may need adjusting to the modality's unique features. In radiology images, for instance, the presence of anatomical structures or specific pathologies may influence which samples are most informative to annotate.

Third, annotation in other modalities may involve domain-specific expertise or guidelines, so integrating domain knowledge into the active learning loop and pseudo-label generation helps ensure the quality and relevance of the annotations.

Remaining challenges include variability in image quality, resolution, and interpretation, as well as artifacts or noise specific to each modality. Adapting the framework to handle these challenges and ensuring robust performance across diverse medical imaging domains will be crucial for successful implementation.

What other uncertainty-based sampling strategies could be explored to enhance the diversity and informativeness of the selected samples?

In addition to margin-entropy sampling, several other uncertainty-based sampling strategies could enhance the diversity and informativeness of the samples selected in MAL:

- BALD (Bayesian Active Learning by Disagreement): measures uncertainty as the disagreement among predictions across samples from a Bayesian model's posterior; the samples producing the most disagreement are selected for annotation.
- Expected Model Change: estimates how much the model's predictions would change if a given sample were added to the training set, prioritizing the samples expected to have the greatest impact on model performance.
- Query by Committee: trains multiple models (committee members) and selects samples based on the level of disagreement among them; the samples the committee disagrees on most are chosen for annotation.
- Core-set Selection: chooses samples that cover the full data distribution or the model's decision boundary; selecting representative core-set samples improves the diversity of the annotated pool.

Exploring these alternatives can yield a more comprehensive and diverse selection of samples for annotation, leading to improved model performance and generalization in active few-shot learning scenarios.
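To make the uncertainty-ranking idea concrete, here is a minimal sketch of combined margin/entropy sampling over softmax outputs. The exact scoring used in MAL may differ; the function names and the rank-sum combination below are illustrative assumptions:

```python
import numpy as np

def margin_score(probs):
    """Gap between top-2 class probabilities; smaller = more uncertain."""
    part = np.sort(probs, axis=1)
    return part[:, -1] - part[:, -2]

def entropy_score(probs):
    """Predictive entropy; larger = more uncertain."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def select_uncertain(probs, budget):
    """Rank unlabelled samples by summed margin and entropy ranks,
    then return the `budget` most uncertain indices."""
    probs = np.asarray(probs)
    # Double argsort converts scores into 0-based ranks per criterion:
    # low margin and high entropy both receive low (i.e. "uncertain") ranks.
    rank = (np.argsort(np.argsort(margin_score(probs)))
            + np.argsort(np.argsort(-entropy_score(probs))))
    return np.argsort(rank)[:budget]

# Toy example: a confident sample and two ambiguous ones
probs = [[0.90, 0.05, 0.05],
         [0.40, 0.35, 0.25],
         [0.34, 0.33, 0.33]]
chosen = select_uncertain(probs, budget=2)  # picks the two ambiguous samples
```

Rank-sum aggregation balances the two criteria without needing to calibrate their raw score scales against each other.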