This paper rethinks the roles of Large Language Models (LLMs) in the Chinese Grammatical Error Correction (CGEC) task. The authors observe that although LLMs have strong language understanding capabilities, their performance as direct correctors on CGEC remains unsatisfactory under traditional metrics, largely because their free-form rewrites conflict with the minimum change principle that CGEC references follow.
To address this, the authors propose two novel frameworks:
Explanation-Augmented Training (EXAM): EXAM uses LLMs as "explainers" that provide auxiliary information, such as error types, reference corrections, and explanations of the grammatical errors, for each training sentence. This information is then used to enhance the training of small CGEC models, enabling them to outperform LLMs on traditional metrics (a minimal sketch of this augmentation step follows the list).
Semantic-Incorporated Evaluation (SEE): SEE employs LLMs as "evaluators" that assess CGEC model outputs more comprehensively by considering both grammatical correctness and semantic preservation. Unlike traditional metrics that rely on exact text matching against references, SEE judges the validity of individual edits more flexibly, based on the LLM's grammatical analysis and semantic understanding (a minimal sketch of this judging step also follows the list).
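To make the EXAM idea concrete, here is a minimal Python sketch of how explanation-augmented training data could be assembled. The `call_llm` helper, the prompt wording, and the `[SEP]`-based input fusion are all assumptions for illustration, not the paper's exact implementation.

```python
# A minimal sketch of EXAM-style data augmentation, assuming a generic
# `call_llm(prompt: str) -> str` helper (hypothetical; swap in any LLM client).
from dataclasses import dataclass


@dataclass
class AugmentedExample:
    source: str       # erroneous sentence from the CGEC dataset
    target: str       # gold correction from the CGEC dataset
    explanation: str  # LLM-generated auxiliary information


EXPLAINER_PROMPT = (
    "You are a Chinese grammar teacher. For the sentence below, list each "
    "grammatical error with its error type, a reference correction, and a "
    "brief explanation.\nSentence: {source}\n"
)


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call used as the 'explainer'."""
    raise NotImplementedError("plug in your preferred LLM client here")


def build_exam_example(source: str, target: str) -> AugmentedExample:
    """Attach LLM explanations to a (source, target) pair so a small CGEC
    model can be trained on explanation-augmented inputs."""
    explanation = call_llm(EXPLAINER_PROMPT.format(source=source))
    return AugmentedExample(source=source, target=target, explanation=explanation)


def to_training_input(example: AugmentedExample) -> str:
    # One simple way to fuse the auxiliary information with the source text;
    # the actual input format of the small model is an assumption here.
    return f"{example.source} [SEP] {example.explanation}"
```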
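Similarly, the SEE evaluation could be sketched as an LLM judgment over each system output instead of an exact match against the reference. The JSON verdict format, the prompt wording, and the simple accuracy-style aggregation below are assumptions, not the paper's exact metric.

```python
# A minimal sketch of SEE-style evaluation, reusing the same hypothetical
# `call_llm` stub as in the EXAM sketch above.
import json

JUDGE_PROMPT = (
    "Source sentence: {source}\n"
    "System correction: {hypothesis}\n"
    "Reference correction: {reference}\n"
    "Does the system correction fix the grammatical errors while preserving "
    "the meaning of the source? Answer as JSON: "
    '{{"grammatical": true, "meaning_preserved": true}}'
)


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call used as the 'evaluator'."""
    raise NotImplementedError("plug in your preferred LLM client here")


def see_judge(source: str, hypothesis: str, reference: str) -> bool:
    """Ask the LLM evaluator whether a correction is valid, rather than
    requiring an exact textual match with the reference."""
    verdict = json.loads(call_llm(JUDGE_PROMPT.format(
        source=source, hypothesis=hypothesis, reference=reference)))
    return verdict["grammatical"] and verdict["meaning_preserved"]


def see_score(examples) -> float:
    """Fraction of (source, hypothesis, reference) triples judged valid;
    a toy aggregate for illustration only."""
    verdicts = [see_judge(s, h, r) for s, h, r in examples]
    return sum(verdicts) / max(len(verdicts), 1)
```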
Extensive experiments on widely used CGEC datasets demonstrate the effectiveness of the proposed EXAM and SEE frameworks. The results show that small models trained with EXAM can achieve performance on par with or better than LLMs, especially when evaluated with the more holistic SEE metric. This suggests that LLMs and small models can collaborate effectively, each contributing its respective strengths to advance the CGEC field.
The authors also provide detailed analyses on the impact of different types of explanation information in EXAM, the role of golden annotation data, and the alignment of SEE evaluation with human judgments. These insights shed light on how LLMs and small models can coexist and progress together in the era of large language models.