The AAPM Grand Challenge on Deep Generative Modeling for Learning Medical Image Statistics was conducted to facilitate the development and evaluation of deep generative models (DGMs) that can accurately reproduce key image statistics relevant to medical imaging applications. A common training dataset comprising 2D slices from a 3D virtual breast phantom was provided, and a standardized evaluation procedure was developed to assess the ability of submitted DGMs to generate ensembles of images that reproduce important morphological, textural, and intensity-derived features.
The challenge received 58 submissions from 12 unique participants. After a preliminary evaluation based on the Fréchet Inception Distance (FID) and a memorization metric, 9 submissions were eligible for the final ranking. The top-ranked submission employed a conditional latent diffusion model, while the joint runners-up used a generative adversarial network (GAN) followed by a super-resolution network.
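For readers unfamiliar with FID, the following is a minimal sketch of the Fréchet distance between two Gaussians fitted to feature vectors, which is the quantity FID computes on Inception-network features. It is not the challenge's evaluation code: the function name `frechet_distance` and the random placeholder features are assumptions made for illustration, and a real FID computation would first extract features from real and generated images with a pretrained Inception network.

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Fréchet distance between Gaussians fitted to two feature sets.

    Each input is a 2D array with one row per image and one column per
    feature. In actual FID the features come from an Inception network;
    here they are whatever the caller supplies (an assumption of this sketch).
    """
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_g = np.atleast_2d(np.cov(feats_gen, rowvar=False))

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # covariance matrices; clip tiny negative values caused by round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)


# Placeholder features standing in for Inception activations (hypothetical data).
rng = np.random.default_rng(0)
real_features = rng.normal(size=(500, 8))
generated_features = rng.normal(loc=0.2, size=(500, 8))
score = frechet_distance(real_features, generated_features)
```

A lower value means the two feature distributions are closer in mean and covariance. Because the score summarizes a whole ensemble, it says nothing about whether any individual image is anatomically plausible.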
The evaluation revealed that the overall ranking of the top submissions did not always match the FID-based ranking, highlighting the importance of domain-specific assessments beyond ensemble-level metrics. Additional analyses identified various artifacts in the generated images, such as issues with ligament structures, tissue boundaries, and texture, which were common across multiple submissions. These findings underscored the need for comprehensive, application-relevant evaluations of DGMs for medical image synthesis.
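One simple way to quantify how far two rankings disagree is Spearman's rank correlation. The sketch below assumes rankings without ties, and the rank lists are made-up placeholders, not the challenge's actual results; the function name `spearman_rho` is likewise introduced here for illustration only.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for two tie-free rankings of the same items."""
    if len(rank_a) != len(rank_b):
        raise ValueError("rankings must have the same length")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two ranked items")
    # Sum of squared rank differences, item by item.
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks for five submissions (placeholders, not challenge data).
fid_rank = [1, 2, 3, 4, 5]
overall_rank = [2, 1, 3, 5, 4]
rho = spearman_rho(fid_rank, overall_rank)
```

A value of 1 means the two rankings agree exactly, and values below 1 indicate that some submissions change places when domain-specific measures are taken into account.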
The challenge demonstrated that the specification of a DGM may differ depending on its intended use, and that domain-specific assessments are crucial for further DGM design and deployment in medical imaging applications.
Key insights distilled from the source by Rucha Deshpa... at arxiv.org, 05-06-2024: https://arxiv.org/pdf/2405.01822.pdf