This study explores the applicability of Large Language Models (LLMs), particularly ChatGPT, to citation context analysis. Citation context analysis involves categorizing the contextual information of individual citations in research papers, such as the location and semantic content of citations. However, this analysis requires significant manual annotation, which hinders its widespread use.
The study compared the annotation results of ChatGPT and human annotators for two key categories in citation context analysis: citation purpose and citation sentiment. The results showed that while ChatGPT outperformed human annotators in terms of consistency, its predictive performance was poor compared to the human-annotated gold standard.
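The two axes of that comparison, consistency and predictive performance, can be made concrete with a minimal sketch. It assumes consistency is measured as chance-corrected agreement between two annotation passes (Cohen's kappa) and predictive performance as plain accuracy against the gold standard; the summary does not say which metrics the study used, and the label values and run names below are hypothetical.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(a)
    # Fraction of items on which the two passes agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each pass's label frequencies.
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[k] * cb[k] for k in ca.keys() | cb.keys()) / (n * n)
    if expected == 1.0:
        # Both passes used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def accuracy(pred, gold):
    """Fraction of predictions that match the gold-standard labels."""
    if len(pred) != len(gold) or not gold:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


# Hypothetical sentiment labels: two passes by the same model, plus a gold set.
run1 = ["neutral", "neutral", "positive", "neutral"]
run2 = ["neutral", "neutral", "positive", "neutral"]
gold = ["positive", "neutral", "positive", "negative"]

consistency = cohen_kappa(run1, run2)   # agreement between the two passes
performance = accuracy(run1, gold)      # agreement with the gold standard
```

In this toy data the two passes agree perfectly while only half of the labels match the gold standard, which is the pattern the study reports: high consistency does not imply high predictive performance.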
The authors conclude that ChatGPT should not immediately replace human annotators in citation context analysis. Its annotations can, however, serve as reference information when the labels produced by multiple human annotators are reconciled into a single dataset. ChatGPT can also act as one of the annotators when a sufficient number of human annotators is difficult to secure.
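One way to read the "reference information" role is as a tie-breaker during label reconciliation. The sketch below is an illustration under that assumption, not the authors' procedure: the function name `resolve_label` and the label values are hypothetical, and the rule is simply a majority vote among human annotators with the model's label consulted only when the humans are tied.

```python
from collections import Counter


def resolve_label(human_labels, llm_label=None):
    """Reconcile human labels for one citation into a single label.

    Returns the majority human label. If the humans are tied, returns the
    LLM label when it is one of the tied candidates, otherwise None to
    signal that the item needs manual adjudication.
    """
    counts = Counter(human_labels)
    if not counts:
        raise ValueError("at least one human label is required")
    top = max(counts.values())
    tied = [label for label, count in counts.items() if count == top]
    if len(tied) == 1:
        return tied[0]          # clear human majority; the LLM is ignored
    if llm_label in tied:
        return llm_label        # the LLM breaks the tie
    return None                 # unresolved; send to a human adjudicator
```

Under this rule the model never overrides a human majority, which keeps the human annotations primary, in line with the authors' caution against replacing annotators outright.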
The study provides important insights for the future development of citation context analysis, highlighting the current limitations of LLMs and potential ways to leverage them to support and complement human annotation efforts.