The paper presents a novel method called ToNNO (Tomographic Reconstruction of a Neural Network's Output) for weakly supervised segmentation of 3D medical images. The key ideas are:
Train a 2D classifier to distinguish slices of positive 3D volumes from slices of negative ones (volumes that do or do not contain the region of interest). This introduces label noise, because every slice of a positive volume is labeled positive even if it does not contain the region of interest.
Apply the inverse Radon transform to the logits that the 2D classifier produces on slices extracted from the 3D volume at many different angles. This reconstructs a 3D heatmap of the classifier's predictions.
The authors also propose two variants called Averaged CAM and Tomographic CAM, which combine the 2D classifier with class activation mapping (CAM) methods like GradCAM and LayerCAM.
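The tomographic reconstruction idea above can be sketched in a toy 2D setting. This is only an illustration under strong assumptions, not the authors' implementation: the "classifier logit" of each line through the image is idealized as the sum of a hidden heatmap along that line, whereas the real method uses the logits of a trained network on 2D slices of a 3D volume. All function names (`pixel_bins`, `slice_scores`, `ramp_filter`, `back_project`) are hypothetical, and the reconstruction is a plain NumPy filtered back-projection with nearest-neighbor binning.

```python
import numpy as np

def pixel_bins(n, theta):
    """Assign every pixel of an n x n grid to the line (bin) it lies on at angle theta."""
    c = (n - 1) / 2.0
    ys, xs = np.mgrid[0:n, 0:n]
    t = (xs - c) * np.cos(theta) + (ys - c) * np.sin(theta)  # signed offset of the line
    n_bins = int(np.ceil(n * np.sqrt(2))) + 2                # enough bins to cover the diagonal
    bins = np.rint(t + (n_bins - 1) / 2.0).astype(int)
    return bins, n_bins

def slice_scores(hidden, thetas):
    """Idealized 'logits': one score per line, here the sum of the hidden map along it."""
    n = hidden.shape[0]
    rows = []
    for theta in thetas:
        bins, n_bins = pixel_bins(n, theta)
        rows.append(np.bincount(bins.ravel(), weights=hidden.ravel(), minlength=n_bins))
    return np.stack(rows)  # shape: (number of angles, number of offsets)

def ramp_filter(scores):
    """Ramp filter applied along the offset axis, as in filtered back-projection."""
    freqs = np.abs(np.fft.fftfreq(scores.shape[1]))
    return np.real(np.fft.ifft(np.fft.fft(scores, axis=1) * freqs, axis=1))

def back_project(scores, thetas, n):
    """Smear each filtered score back over the pixels of its line and sum over angles."""
    recon = np.zeros((n, n))
    for row, theta in zip(scores, thetas):
        bins, _ = pixel_bins(n, theta)
        recon += row[bins]
    return recon * np.pi / len(thetas)  # usual back-projection normalization

n = 64
hidden = np.zeros((n, n))
hidden[20:26, 38:44] = 1.0  # toy region of interest: rows 20-25, columns 38-43

# 90 angles covering half a turn, offset by half a step to avoid axis-aligned ties
thetas = (np.arange(90) + 0.5) * np.pi / 90
scores = slice_scores(hidden, thetas)
heatmap = back_project(ramp_filter(scores), thetas, n)
peak = np.unravel_index(np.argmax(heatmap), heatmap.shape)
print("heatmap peak (row, col):", peak)
```

In this sketch the only input to the reconstruction is one scalar per line, yet the heatmap responds most strongly around the hidden region. The 3D case follows the same pattern, with planes in place of lines.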
The method is evaluated on four large-scale 3D medical image datasets, on tasks such as tumor and lesion segmentation, including COVID-19 lesions. ToNNO and the proposed CAM variants outperform standard 2D CAM methods in most cases, achieving better F1-scores, Dice scores, and balanced accuracy.
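For readers unfamiliar with the Dice score mentioned above, it measures the overlap between a predicted mask and a reference mask as twice the intersection divided by the sum of the two mask sizes. The sketch below is a generic NumPy version, not the paper's evaluation code. The function name `dice_score`, the threshold of 0.5 used to binarize the heatmap, and the convention for two empty masks are assumptions made for illustration.

```python
import numpy as np

def dice_score(pred_mask, true_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of the same shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks count as perfect agreement
    return 2.0 * np.logical_and(pred, true).sum() / denom

# Toy heatmap binarized at a hypothetical threshold of 0.5
toy_heatmap = np.array([[0.9, 0.7, 0.1],
                        [0.2, 0.0, 0.3]])
pred_mask = toy_heatmap > 0.5
true_mask = np.array([[1, 0, 0],
                      [1, 0, 0]], dtype=bool)
dice = dice_score(pred_mask, true_mask)
print(dice)  # 0.5: one overlapping pixel, two pixels in each mask
```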
The key advantages of the approach are: 1) it can leverage 2D image encoders and their pre-trained weights, 2) it produces high-resolution 3D segmentation heatmaps without requiring any ground truth segmentation masks, and 3) the Tomographic CAM variant combines the strengths of CAM and tomographic reconstruction.
Source: arxiv.org