Key Concepts
Integrating semantic segmentation masks into multimodal large language models (MLLMs) enhances the accuracy and detail of AI-generated radiology reports for chest X-rays.
Statistics
MAIRA-Seg outperforms non-segmentation baselines on clinical metrics, including RadCliQ, Macro F1-MR, Micro F1-MR, and Radfact/logical_f1.
MAIRA-Seg-Frontal shows significant improvements over MAIRA-Frontal in all five mask-relevant pathological findings: support devices, lung opacity, cardiomegaly, pleural effusion, and pneumothorax.
MAIRA-Seg-Multi demonstrates significant gains over MAIRA-Multi for most relevant pathological findings, including support devices, cardiomegaly, and pleural effusion.
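The Macro F1 and Micro F1 figures cited above aggregate per-finding results in different ways. The following is a minimal sketch of the standard definitions, not the authors' evaluation code. The function names and the counts are hypothetical and chosen only for illustration, and reading the "MR" suffix as "restricted to the mask-relevant findings" is an assumption.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) for one finding
Counts = Tuple[int, int, int]


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(counts: Dict[str, Counts]) -> Tuple[float, float]:
    """Return (macro F1, micro F1) over a set of findings.

    Macro F1 averages the per-finding F1 scores, so every finding weighs the same.
    Micro F1 pools the counts first, so frequent findings dominate.
    """
    per_finding = [f1(*c) for c in counts.values()]
    macro = sum(per_finding) / len(per_finding)

    total_tp = sum(c[0] for c in counts.values())
    total_fp = sum(c[1] for c in counts.values())
    total_fn = sum(c[2] for c in counts.values())
    micro = f1(total_tp, total_fp, total_fn)
    return macro, micro


if __name__ == "__main__":
    # Made-up counts for two findings; these are NOT results from the paper.
    example = {
        "cardiomegaly": (8, 2, 2),
        "pneumothorax": (1, 1, 3),
    }
    macro, micro = macro_micro_f1(example)
    print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

With these illustrative counts, the rare and poorly detected finding lowers the macro score more than the micro score, which is why the two metrics are usually reported together.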
Quotes
"We hypothesize that providing localized pixel-level details alongside images can enhance MLLM’s perceptual and reasoning abilities for biomedical applications like radiology report generation."
"By integrating pixel-level knowledge in the form of segmentation and mask-aware information into the prompt instructions of the MLLM, we aim to improve the pixel-wise visual understanding and enhance the quality and accuracy of draft radiology reports generated from CXRs."
"The results confirm that using segmentation masks enhances the nuanced reasoning of MLLMs, potentially contributing to better clinical outcomes."