Core Concepts
This work proposes a novel benchmark for evaluating continual learning methods in the context of multi-label medical image classification, combining the challenges of new class arrivals and domain shifts. To address these challenges, the authors introduce Pseudo-Label Replay, a method that integrates Pseudo-Labeling and Replay techniques to effectively handle new classes and domain shifts.
Abstract
The paper presents a novel benchmark for evaluating continual learning (CL) methods in the context of multi-label medical image classification. The benchmark, termed New Instances & New Classes (NIC), combines the challenges of new class arrivals and domain shifts, reflecting the realistic nature of CL in the medical imaging domain.
The authors first motivate the need for such a benchmark, highlighting the importance of model flexibility and scalability in accommodating new data and expanding diagnostic capabilities as medical knowledge progresses. They then introduce the NIC scenario, which consists of a sequence of seven tasks with a total of 19 classes across two domains (the NIH Clinical Center and the Stanford Hospital).
To address the unique challenges posed by the NIC scenario, the authors propose a novel approach called Pseudo-Label Replay. This method leverages Pseudo-Labeling and Replay techniques to integrate information from previous tasks while adapting to new data streams. The key advantages of Pseudo-Label Replay are:
Replay is used more efficiently, because the targets of replayed samples give information not only on the tasks they were taken from but on all tasks up to the current one.
Task interference is reduced compared to traditional Replay approaches.
The limitations of distillation-based methods, such as the need for old classes to reappear in future tasks, are overcome by Replaying samples that contain old classes.
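The target construction behind these advantages can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name build_targets, the dictionary-based inputs, the 0.5 threshold, and the use of hard (binarized) pseudo-labels are all assumptions made for the example. For each class seen so far, the sketch keeps the ground-truth label when the sample is annotated for that class and otherwise falls back to the prediction of a reference model (for instance, a model trained on earlier tasks).

```python
from typing import Dict, Iterable, List


def build_targets(
    annotated: Dict[int, int],
    model_probs: Dict[int, float],
    seen_classes: Iterable[int],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[int]:
    """Build one multi-label target vector covering every class seen so far.

    annotated:    class id -> ground-truth label (0/1) for the classes this
                  sample was actually labeled with.
    model_probs:  class id -> probability predicted by a reference model,
                  used only for classes without a ground-truth label.
    seen_classes: ids of all classes introduced up to the current task.
    """
    targets = []
    for class_id in sorted(seen_classes):
        if class_id in annotated:
            # Ground truth is available: always prefer it.
            targets.append(annotated[class_id])
        else:
            # No annotation for this class: use a thresholded pseudo-label.
            prob = model_probs.get(class_id, 0.0)
            targets.append(1 if prob >= threshold else 0)
    return targets


# Hypothetical example: classes 0-1 came with task 1, classes 2-3 with task 2.
seen = [0, 1, 2, 3]

# A replayed task-1 sample is annotated only for classes 0 and 1.
replay_targets = build_targets(
    annotated={0: 1, 1: 0},
    model_probs={0: 0.9, 1: 0.2, 2: 0.7, 3: 0.1},
    seen_classes=seen,
)

# A new task-2 sample is annotated only for classes 2 and 3.
new_targets = build_targets(
    annotated={2: 0, 3: 1},
    model_probs={0: 0.8, 1: 0.6, 2: 0.4, 3: 0.3},
    seen_classes=seen,
)
```

In this sketch the same function is applied to replayed and new samples, so every target vector spans all classes seen up to the current task, which is the property the first advantage describes.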
The authors evaluate Pseudo-Label Replay and several existing CL methods on the proposed benchmark. In these experiments, Pseudo-Label Replay outperforms the other approaches on both the average F1 score and the forgetting metric. The authors also provide a detailed analysis of the forgetting behavior exhibited by each method.
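The two reported metrics can be illustrated with a short sketch. This summary does not give the exact definitions used in the paper, so the code below assumes a standard binary F1 per class and a common definition of forgetting (the best earlier score on a task minus the score after the final task, averaged over all earlier tasks). The function names and the example numbers are hypothetical.

```python
from typing import List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for one class, from 0/1 labels and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention chosen here: F1 is 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def average_forgetting(scores: List[List[float]]) -> float:
    """Average forgetting after the last task.

    scores[i][j] is the score on task j measured after training on task i,
    defined for j <= i (a lower-triangular table).
    """
    last = len(scores) - 1
    drops = []
    for j in range(last):
        # Best score ever reached on task j before the final training step.
        best_earlier = max(scores[i][j] for i in range(j, last))
        drops.append(best_earlier - scores[last][j])
    return sum(drops) / len(drops) if drops else 0.0


# Hypothetical numbers for a three-task sequence (not results from the paper).
example_scores = [
    [0.80],
    [0.60, 0.70],
    [0.50, 0.65, 0.90],
]
example_f1 = f1_score([1, 0, 1, 1], [1, 0, 0, 1])
example_forgetting = average_forgetting(example_scores)
```

Under this definition, a lower forgetting value means the model retains more of its earlier performance as new tasks arrive.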
Overall, this work makes significant contributions by (1) devising a novel benchmark for CL in medical imaging, (2) proposing a novel method called Pseudo-Label Replay to address the challenges of the NIC scenario, and (3) providing a comprehensive evaluation of the proposed approach and existing CL methods on the benchmark.
Stats
The benchmark uses chest X-ray images from the NIH Clinical Center and the Stanford Hospital, covering 19 classes organized into 7 tasks.
The dataset models a realistic CL scenario with new classes and domain shifts.
Quotes
"Multi-label image classification in dynamic environments is a problem that poses significant challenges."
"Unlike traditional scenarios, it reflects the realistic nature of CL in domains such as medical imaging, where updates may introduce both new classes and changes in domain characteristics."