Core Concepts
This paper introduces PediatricsGPT, a Chinese pediatric large language model assistant trained on PedCorpus, a multi-task instruction dataset built from Chinese pediatric medical texts, to address the shortage of pediatricians and improve healthcare access in China.
Stats
PedCorpus contains over 300,000 multi-task instructions.
The researchers used Baichuan2-Base models at two scales, with 7 billion and 13 billion parameters.
PediatricsGPT-7B showed improvements of 3.53% on ROUGE-L and 4.44% on GLEU over HuatuoGPT-II on the EviDiag task.
The MCE strategy with three specific experts achieved a reasonable performance trade-off across three tasks while training only 0.95% of the parameters (see the sketch below).
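The notes do not explain how the 0.95% figure arises. As a rough, hypothetical illustration of parameter-efficient expert tuning, the PyTorch sketch below freezes a base projection layer, attaches three small LoRA-style expert adapters with a gating layer, and reports the resulting trainable-parameter fraction. All class names, sizes, and the routing scheme are assumptions for illustration, not the paper's MCE implementation.

```python
# Hypothetical sketch: three LoRA-style experts over a frozen base layer.
# Sizes and names are illustrative only, not taken from the PediatricsGPT paper.
import torch
import torch.nn as nn

class LoRAExpert(nn.Module):
    """One low-rank adapter that produces a small delta on top of a frozen layer."""
    def __init__(self, d_model: int, rank: int = 8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, d_model) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_model, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.A.t() @ self.B.t()

class MixtureOfExpertsLayer(nn.Module):
    """Frozen dense layer plus a softly gated sum of three LoRA experts."""
    def __init__(self, d_model: int = 4096, n_experts: int = 3, rank: int = 8):
        super().__init__()
        self.base = nn.Linear(d_model, d_model, bias=False)
        self.base.weight.requires_grad_(False)                 # backbone stays frozen
        self.experts = nn.ModuleList([LoRAExpert(d_model, rank) for _ in range(n_experts)])
        self.gate = nn.Linear(d_model, n_experts, bias=False)  # routes inputs to experts

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(x), dim=-1)                  # (batch, n_experts)
        delta = torch.stack([e(x) for e in self.experts], dim=-1)      # (batch, d_model, n_experts)
        return self.base(x) + (delta * weights.unsqueeze(1)).sum(dim=-1)

layer = MixtureOfExpertsLayer()
y = layer(torch.randn(2, 4096))                                        # toy forward pass
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable fraction: {trainable / total:.2%}")  # small, LoRA-style footprint
```

With these illustrative sizes the trainable share lands near one percent; the exact 0.95% reported in the paper would depend on its actual adapter ranks and placement, which are not given in these notes.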
Quotes
"PediatricsGPT is developed on a systematic training pipeline that includes Continuous Pre-Training (CPT), full-parameter SFT, human preference alignment, and parameter-efficient secondary SFT."
"In this case, we introduce a hybrid instruction pre-training mechanism in CPT to bridge the capability weakening due to corpus format discrepancies between the internal and injected medical knowledge of foundation models, facilitating knowledge accumulation and extension."
"Despite impressive improvements achieved by RLHF-based approaches [53, 55], challenges remain due to unstable reward modelling and significant computational costs [39, 58]."