The paper investigates whether perceptual metrics, such as Fréchet Inception Distance (FID), Kernel Inception Distance (KID), and sliced Wasserstein distance (SWD), are suitable for evaluating medical image translation, and compares them to segmentation-based metrics. The authors evaluate two medical image translation tasks: (1) subtle intra-modality breast MRI translation and (2) more drastic inter-modality translation of lumbar spine MRI to CT.
The results show that perceptual metrics do not consistently align with common segmentation metrics for medical image translation. No single perceptual metric reliably correlates with segmentation metrics for both tasks, and the commonly used FID is especially inconsistent. The authors advise caution in using FID for evaluating medical image translation.
The pixel-level SWD metric correlates better with segmentation metrics than the learned-feature metrics (FID, KID, and Inception Score, IS) for the subtle intra-modality breast MRI translation, but fails for the more complex inter-modality MRI-to-CT translation. This suggests that perceptual metrics designed for assessing image realism may not be fully suitable for medical image translation, which requires preserving anatomical and semantic content.
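To make "pixel-level" concrete, the following is a minimal sketch of a sliced Wasserstein distance computed directly on raw pixel vectors, with no learned feature extractor involved. It is an illustration only and not necessarily the variant used by the authors (SWD implementations for images often operate on multi-scale patches). The function name `sliced_wasserstein_distance`, its parameters, and the synthetic arrays are assumptions made for this example.

```python
import numpy as np


def sliced_wasserstein_distance(x, y, n_projections=128, seed=0):
    """Monte Carlo estimate of the sliced 1-Wasserstein distance.

    x, y: arrays of shape (n_samples, n_features), e.g. flattened images
    or patches. This sketch assumes both sets have the same shape.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equally shaped sample sets")

    rng = np.random.default_rng(seed)
    # Draw random unit directions in pixel space.
    directions = rng.normal(size=(x.shape[1], n_projections))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)

    # Project both sets onto each direction and sort along the sample axis.
    x_proj = np.sort(x @ directions, axis=0)
    y_proj = np.sort(y @ directions, axis=0)

    # In 1-D with equal sample counts, the Wasserstein-1 distance is the
    # mean absolute difference between sorted values; average over directions.
    return float(np.mean(np.abs(x_proj - y_proj)))


# Synthetic stand-ins for two sets of flattened 8x8 patches (not real data).
rng = np.random.default_rng(1)
source_like = rng.normal(size=(64, 64))
shifted = source_like + 1.0  # a uniform intensity shift

same = sliced_wasserstein_distance(source_like, source_like)
different = sliced_wasserstein_distance(source_like, shifted)
```

Because the distance is computed on pixel intensities alone, it responds to intensity distribution changes but carries no notion of anatomy, which is consistent with it behaving differently on subtle and drastic translation tasks.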
The authors conclude that a broader evaluation approach and research into more universally applicable metrics are needed in the field of medical image translation.
Key insights distilled from:
by Nicholas Kon... on arxiv.org, 04-12-2024
https://arxiv.org/pdf/2404.07318.pdf