Core Concepts
Weakly-supervised object localization (WSOL) models can be adapted using source-free domain adaptation (SFDA) methods to address domain shifts in histology images, but this raises challenges in optimizing both classification and localization performance.
Abstract
The paper focuses on evaluating the effectiveness of four representative SFDA methods for adapting WSOL models to histology images. The SFDA methods compared are SFDA-Distribution Estimation (SFDA-DE), Source HypOthesis Transfer (SHOT), Cross-Domain Contrastive Learning (CDCL), and Adaptively Domain Statistics Alignment (AdaDSA).
The experiments are conducted on two histology datasets, GlaS (smaller, colon cancer) and Camelyon16 (larger, breast cancer), to assess the SFDA methods in terms of classification and localization accuracy. The key findings are:
SFDA is challenging and yields limited gains on larger datasets such as Camelyon16. Localization in particular remains difficult, because SFDA methods are designed to optimize discriminant classification rather than localization.
Selecting the best localization (B-LOC) model for the source network does not necessarily lead to improved localization after adaptation, as there is a trade-off between optimizing for classification and localization.
The accuracy of pseudo-labels used by methods like SFDA-DE and CDCL is crucial for the adaptation process, and errors in the early stages can significantly impact the final performance.
The entropy loss used in SHOT helps to smooth out the impact of unreliable pseudo-labels, making it more robust than the other methods.
Overall, the results highlight the challenges in balancing classification and localization performance when adapting WSOL models using SFDA methods for histology images.
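To make the entropy-based finding concrete, the following is a minimal sketch of an information-maximization objective of the kind SHOT uses: a per-sample entropy term that pushes predictions to be confident, plus a diversity term that discourages collapsing all predictions onto one class. It is not the paper's code. The function names (softmax, entropy, information_maximization_loss) and the toy logits are assumptions made for illustration, and the sketch omits SHOT's pseudo-label cross-entropy term and any weighting between terms.

```python
import math


def softmax(logits):
    """Convert a list of logits into class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs, eps=1e-12):
    """Shannon entropy of one probability vector; eps avoids log(0)."""
    return -sum(p * math.log(p + eps) for p in probs)


def information_maximization_loss(batch_logits):
    """Hypothetical helper: mean per-sample entropy minus entropy of the mean prediction.

    Minimizing the first term makes each prediction confident.
    Minimizing the second term (negative entropy of the batch-mean
    prediction) keeps the predicted classes spread out over the batch.
    No labels or pseudo-labels are needed for either term.
    """
    probs = [softmax(logits) for logits in batch_logits]
    n = len(probs)
    k = len(probs[0])
    mean_entropy = sum(entropy(p) for p in probs) / n
    mean_probs = [sum(p[j] for p in probs) / n for j in range(k)]
    diversity = -entropy(mean_probs)
    return mean_entropy + diversity


# Toy two-class batches (illustrative values only, not from the paper).
confident_balanced = [[10.0, -10.0], [-10.0, 10.0]]
uncertain = [[0.0, 0.0], [0.0, 0.0]]
collapsed = [[10.0, -10.0], [10.0, -10.0]]

loss_balanced = information_maximization_loss(confident_balanced)
loss_uncertain = information_maximization_loss(uncertain)
loss_collapsed = information_maximization_loss(collapsed)
```

In this sketch the confident and class-balanced batch receives the lowest loss, while both the fully uncertain batch and the collapsed batch are penalized. Because neither term depends on a pseudo-label, a wrong pseudo-label cannot directly corrupt this part of the objective, which is one plausible reading of why the summary describes the entropy loss as more robust.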
Stats
"Given the emergence of deep learning, digital pathology has gained popularity for cancer diagnosis based on histology images."
"Recent methods from the machine learning (ML) and computer vision communities can assist the pathologist in the diagnosis of cancers based on histology images."
"Whole slide images (WSIs) are captured at a very high resolution (over 200 million pixels)."
"Extracting pixel-level annotations for supervised training of a segmentation model is costly and time-consuming."
Quotes
"WSOL models can provide spatial visualization linked to a classifier's predictions after training on images sampled from WSIs annotated with inexpensive image-class labels."
"SFDA is challenging since labeled source data cannot be used during the adaptation process."
"Despite substantial improvements, these methods may still highlight background regions."