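Integrating segmentation masks into multimodal large language models (MLLMs) improves their ability to interpret chest X-ray images, enabling more accurate and detailed radiology reports.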
Integrating semantic segmentation masks into multimodal large language models (MLLMs) enhances the accuracy and detail of AI-generated radiology reports for chest X-rays.
Integrating anatomical and pathological information at various scales using pathology-aware regional prompts significantly improves the accuracy and clinical relevance of AI-generated radiology reports.
Integrating best practices from various studies, this paper outlines a robust radiology report generation system that leverages deep learning and multimodal learning to improve the accuracy, efficiency, and interpretability of automated radiology reporting.
R2Gen-Mamba offers a more efficient approach to automatic radiology report generation by combining the Mamba model's efficient sequence processing with the contextual understanding of Transformer architectures, resulting in high-quality reports with reduced computational burden.
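In conventional radiology report generation, evaluation metrics are skewed by heavy use of technical jargon, which can hinder patient understanding and distort model training. This paper proposes a framework for generating reports in plain, jargon-free language, aiming for more accurate evaluation and more human-like interpretation.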
This research proposes a novel framework called Layman's RRG, which leverages layman's terms to improve the generation and evaluation of radiology reports, addressing the limitations of traditional word-overlap metrics and the highly technical nature of medical language.
This paper introduces X-RGen, a novel framework for generating radiology reports across multiple anatomical regions, mimicking the reasoning process of human radiologists to improve accuracy and clinical relevance.
This paper introduces SAE-Rad, a novel approach that uses sparse autoencoders (SAEs) to generate interpretable radiology reports by decomposing image features into human-understandable concepts, achieving competitive performance with fewer resources than traditional vision-language models (VLMs).
This work improves the inter-report consistency of radiology report generation by extracting lesions, examining their characteristics, and using a lesion-aware mixup technique to align the representations of semantically equivalent lesions.
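
For readers unfamiliar with mixup, the following is a minimal sketch of the generic operation only: a convex combination of two feature vectors with a mixing weight drawn from a Beta distribution. It is not the lesion-aware variant summarized above, whose details are not given here. The function names, the example vectors, and the value alpha=0.4 are illustrative assumptions.

```python
import random


def mixup(a, b, lam):
    """Return lam * a + (1 - lam) * b for two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def sample_lambda(alpha=0.4, rng=random):
    """Draw a mixing weight from Beta(alpha, alpha); alpha is an assumed value."""
    return rng.betavariate(alpha, alpha)


# Hypothetical embeddings of two lesions judged semantically equivalent.
lesion_a = [1.0, 0.0, 2.0]
lesion_b = [0.0, 1.0, 2.0]

# With lam = 0.5 the result is the midpoint of the two vectors.
midpoint = mixup(lesion_a, lesion_b, 0.5)

# In training, the weight would typically be sampled rather than fixed.
mixed = mixup(lesion_a, lesion_b, sample_lambda())
```

Interpolating between representations of equivalent lesions is one plausible way to pull them toward each other; how pairs are selected and how the mixed vector enters the loss are choices this sketch leaves open.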