Key Concept
A dual BERT architecture with hierarchical label learning is proposed to accurately annotate disease labels in Chinese chest X-ray reports, enabling the construction of a large-scale Chinese chest X-ray report dataset.
Abstract
This study addresses the lack of disease labelers for Chinese chest X-ray reports by constructing one based on a dual BERT architecture and a hierarchical label learning algorithm. The labeler encodes diagnostic reports and clinical information independently, and exploits the hierarchical relationship between diseases and body parts in a hierarchical label learning algorithm, significantly improving the accuracy of disease annotation.
Using this labeler, a Chinese chest X-ray report dataset (CCXRD) containing 51,262 chest X-ray samples was then constructed. Experiments on an expert-annotated Chinese data subset verified the effectiveness of the proposed disease labeler, which outperformed existing models in F1 score, weighted F1 score, Kappa statistic, and weighted Kappa statistic.
The key highlights and insights from the study are:
- The dual BERT architecture allows for independent encoding of diagnostic reports and clinical information, capturing their respective characteristics more effectively.
- The hierarchical label learning algorithm leverages the affiliation between diseases and body parts, improving the text classification performance.
- The constructed CCXRD dataset provides a standardized process and a large-scale resource for research on Chinese chest X-ray report generation.
- Ablation studies and comparisons with various Chinese pre-trained BERT models demonstrate the contributions of the proposed components and the importance of suitable pre-training data.
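The dual-encoder and hierarchical-label ideas above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the bag-of-words `encode` function, the label sets, and the gating of each disease score by its parent body-part score are all simplified stand-ins (the paper uses two fine-tuned BERT-Base encoders initialized from MedBERT-kd, with learned classification heads).

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 32  # toy feature size; BERT-Base would produce 768-d vectors

def encode(text: str) -> np.ndarray:
    # Stand-in for a BERT encoder: hash tokens into a bag-of-words
    # vector. Report and clinical text each get their own encoding.
    vec = np.zeros(DIM)
    for tok in text.split():
        vec[hash(tok) % DIM] += 1.0
    return vec

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical hierarchy: each disease label belongs to a body part.
PARTS = ["lung", "heart", "pleura"]
DISEASES = {"pneumonia": "lung", "atelectasis": "lung",
            "cardiomegaly": "heart", "effusion": "pleura"}

# Random linear heads standing in for learned classifiers.
W_part = rng.normal(size=(len(PARTS), 2 * DIM))
W_dis = rng.normal(size=(len(DISEASES), 2 * DIM))

def predict(report: str, clinical: str) -> dict:
    # Dual encoding: the two texts are encoded independently and
    # their features concatenated before classification.
    feats = np.concatenate([encode(report), encode(clinical)])
    p_part = sigmoid(W_part @ feats)
    raw_dis = sigmoid(W_dis @ feats)
    part_idx = {p: i for i, p in enumerate(PARTS)}
    # Hierarchical label learning (simplified): a disease score is
    # gated by its parent body-part score, so a disease cannot be
    # confidently positive while its body part is scored negative.
    return {d: float(raw_dis[i] * p_part[part_idx[parent]])
            for i, (d, parent) in enumerate(DISEASES.items())}

scores = predict("patchy opacity in right lower lung field",
                 "cough and fever for three days")
```

The gating step is one simple way to encode the disease-to-body-part affiliation; the paper's actual hierarchical label learning algorithm may combine the two levels differently.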
Statistics
The dataset constructed in this study, CCXRD, contains a total of 51,262 chest X-ray images and corresponding radiological reports.
The dataset is randomly divided into training, validation, and test sets in an 8:1:1 ratio, with 47,886 samples in the training set, 2,403 samples in the validation set, and 2,404 samples in the test set.
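An 8:1:1 random split like the one described can be sketched as below. The sample IDs and seed are hypothetical, and the paper's reported per-split counts may differ slightly from a straight 80/10/10 partition.

```python
import random

# Hypothetical sample IDs standing in for the CCXRD samples.
ids = list(range(51_262))
random.Random(42).shuffle(ids)  # shuffle before splitting

n = len(ids)
n_train = int(0.8 * n)  # 8 parts
n_val = int(0.1 * n)    # 1 part; the remainder is the test set
train = ids[:n_train]
val = ids[n_train:n_train + n_val]
test = ids[n_train + n_val:]
```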
Quotes
"The dual BERT encoder used in this study is based on the BERT-Base architecture, with initial weights inherited from MedBERT-kd, and all layers were fine-tuned."
"The results show that the removal of either the hierarchical labels algorithm or the dual BERT encoder led to a significant decrease in F1 score and Kappa statistic."
"The experimental results indicate that, while the general-purpose GPT-3.5 and GPT-4 models demonstrated unexpectedly good performance, with GPT-4 showing a significant improvement in inference capability over GPT-3.5, there is still a gap compared to the method of this study."