Core Concepts
This study introduces a novel annotated corpus, CAMIR, which combines granular event-based annotations with concept normalization to comprehensively capture clinical findings from radiology reports. Two BERT-based information extraction models, mSpERT and PL-Marker++, are developed and evaluated on the CAMIR dataset, achieving performance comparable to inter-annotator agreement.
Abstract
The authors present a novel annotated corpus called the Corpus of Annotated Medical Imaging Reports (CAMIR), which includes 609 radiology reports from Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and Positron Emission Tomography-Computed Tomography (PET-CT) modalities. The reports are annotated using a granular event-based schema that captures clinical indications, lesions, and medical problems, with most arguments normalized to predefined SNOMED-CT concepts.
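To make the event-based schema concrete, the following is a minimal Python sketch of how one annotated event might be represented. It is an illustration only, not the corpus's actual file format: the class names `Event` and `Argument`, the `span` and `concept` fields, and the exact event-type and role strings are assumptions. The span texts are taken from the examples listed under Stats, and no SNOMED-CT codes are filled in because the summary does not give them.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Argument:
    """One argument of an event (hypothetical structure)."""
    role: str                                # e.g. "Anatomy", "Size", "Count", "Size Trend"
    text: str                                # span text as it appears in the report
    span: Optional[Tuple[int, int]] = None   # character offsets, if known (assumed field)
    concept: Optional[str] = None            # normalized concept label, if any (assumed field)


@dataclass
class Event:
    """One event: a type plus its arguments (hypothetical structure)."""
    event_type: str                          # e.g. "Indication", "Lesion", "Medical Problem"
    arguments: List[Argument] = field(default_factory=list)

    def get(self, role: str) -> List[Argument]:
        """Return all arguments with the given role."""
        return [a for a in self.arguments if a.role == role]


# Example events built from the span texts quoted under Stats.
lesion = Event("Lesion", [
    Argument("Size", "up to 5mm"),
    Argument("Count", "multiple"),
    Argument("Size Trend", "New"),
])
indication = Event("Indication", [
    Argument("Anatomy", "Bilateral apical lung scarring"),
])

print([a.text for a in lesion.get("Size")])
```

Keeping the raw span text and the normalized concept as separate fields mirrors the corpus's combination of span-level annotation with concept normalization, and leaves `concept` empty for arguments that are not normalized.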
The annotation process involved four medical students, with guidance from senior radiology experts. The corpus exhibits high inter-annotator agreement, exceeding 0.70 F1 for most argument types. Exceptions include Size Trend, Count, and Characteristic, which are relatively infrequent or linguistically diverse.
To extract the CAMIR events, the authors explored two BERT-based language models: mSpERT, which jointly extracts all event information, and PL-Marker++, a multi-stage approach that the authors augmented for the CAMIR schema. PL-Marker++ achieved the highest overall performance on the held-out test set, with an F1 score of 0.759, significantly higher than mSpERT's 0.736.
The authors discuss the quality of the annotations, the performance of the models, and how they validated the span-overlap evaluation criterion. They also highlight the potential for CAMIR to support a wide range of secondary-use applications in the radiology domain, such as cohort discovery, epidemiology, image retrieval, automated follow-up tracking, computer-vision applications, decision support, and report summarization.
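As a rough illustration of what a span-overlap criterion means in practice, the sketch below scores predicted spans against gold spans, counting a prediction as correct when it has the same label as a gold span and shares at least one character with it. This is an assumption about the general technique, not the authors' scorer: the function names, the `(label, start, end)` tuple layout, the greedy one-to-one matching, and the offsets in the example are all hypothetical.

```python
from typing import List, Tuple

# (label, start, end) with end exclusive; this layout is an assumption.
Span = Tuple[str, int, int]


def spans_overlap(a: Span, b: Span) -> bool:
    """True if both spans carry the same label and share at least one character."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def overlap_f1(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 under overlap matching.

    Each gold span can be matched by at most one prediction (greedy, in order).
    """
    matched_gold = set()
    true_positives = 0
    for p in pred:
        for i, g in enumerate(gold):
            if i not in matched_gold and spans_overlap(g, p):
                matched_gold.add(i)
                true_positives += 1
                break
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical offsets: one partial match, one miss, one spurious prediction.
gold = [("Size", 10, 19), ("Count", 0, 8)]
pred = [("Size", 13, 19), ("Count", 30, 38), ("Size Trend", 40, 43)]
precision, recall, f1 = overlap_f1(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

The same function could be used for inter-annotator agreement by treating one annotator's spans as `gold` and the other's as `pred`. An overlap criterion is more lenient than exact matching, since a prediction such as "5mm" against a gold span "up to 5mm" still counts as correct.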
Stats
"Bilateral apical lung scarring" (Indication Anatomy)
"up to 5mm" (Lesion Size)
"multiple" (Lesion Count)
"New" (Lesion Size Trend)
Quotes
"CAMIR uniquely combines a granular event structure and concept normalization."
"PL-Marker++ achieved significantly higher overall performance than mSpERT (0.759 F1 vs 0.736 F1)."