This paper presents the Pediatric Social History Annotation Corpus (PedSHAC), a new corpus of 1,260 annotated social history sections from pediatric patient notes. The corpus captures 10 distinct social determinants of health (SDoH) categories, including living and economic stability, prior trauma, education access, substance use history, and mental health, with an overall inter-annotator agreement of 81.9 F1.
The authors explore several large language model-based information extraction strategies: fine-tuning BERT and T5, and in-context learning with GPT-4. The fine-tuned T5-2sQA model achieves the highest performance, with a micro-average F1 of 74.7% on event-level extraction. GPT-4 in-context learning with 3-shot examples (+guide) reaches trigger extraction performance comparable to that of the fine-tuned models, with an F1 of 82.3%.
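For readers unfamiliar with the metric, the following is a minimal sketch of how a micro-average F1 can be computed: true positives, false positives, and false negatives are pooled across categories before precision and recall are taken. The function name `micro_f1`, the category labels, and the counts are all hypothetical illustrations; they are not the paper's evaluation code or data.

```python
from typing import Dict, Tuple


def micro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Micro-averaged F1 from per-category (tp, fp, fn) counts.

    Counts are summed over all categories first, so frequent
    categories weigh more than rare ones.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        # No correct predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two made-up categories (not from the paper).
example_counts = {
    "category_a": (80, 10, 10),
    "category_b": (15, 5, 15),
}
score = micro_f1(example_counts)
print(f"micro-average F1: {score:.3f}")
```

Because the counts are pooled, the larger hypothetical category dominates the result; a macro average would instead compute F1 per category and take the unweighted mean.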
The results show that detailed SDoH representations can be extracted from pediatric clinical narratives with performance approaching human-level agreement. This enables the systematic collection and utilization of SDoH information in clinical and research settings, which can support data-driven interventions to improve individual and public health outcomes for pediatric populations.
Key insights distilled from a paper by Yujuan Fu, Gi... on arxiv.org, 04-02-2024
https://arxiv.org/pdf/2404.00826.pdf