Kumar, S., Sahu, M., Gacche, V., Ghosal, T., & Ekbal, A. (2024). ‘Quis custodiet ipsos custodes?’ Who will watch the watchmen? On Detecting AI-generated Peer Reviews. arXiv preprint arXiv:2410.09770.
This paper addresses the emerging challenge of detecting AI-generated peer reviews, a critical issue for upholding the integrity of scientific publishing in the age of increasingly sophisticated large language models (LLMs). The authors aim to develop and evaluate effective methods for distinguishing between human-written and AI-generated peer reviews.
The researchers introduce two novel approaches for AI-generated peer review detection: the Term Frequency (TF) model and the Review Regeneration (RR) model.
The team created a dataset of 1,480 papers from ICLR and NeurIPS conferences, generating AI-written reviews using GPT-4 and GPT-3.5. They evaluated the performance of their proposed models against existing AI text detectors (RADAR, LLMDet, DEEP-FAKE, and Fast-Detect GPT) using metrics like precision, recall, F1-score, and accuracy. Additionally, they investigated the robustness of these detectors against adversarial attacks, including token manipulation (specifically adjective replacement) and paraphrasing, proposing a defense mechanism against the latter.
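As an illustration of the evaluation metrics named above, the following is a minimal sketch of how precision, recall, F1-score, and accuracy could be computed for a binary detector. It is not the authors' evaluation code. The label convention (1 = AI-generated, 0 = human-written) and the names `Metrics` and `evaluate` are assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    precision: float
    recall: float
    f1: float
    accuracy: float


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Metrics:
    """Score a binary detector.

    Assumed label convention (not taken from the paper):
    1 = AI-generated review, 0 = human-written review.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # AI review flagged as AI
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # human review flagged as AI
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # AI review missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # human review passed

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return Metrics(precision, recall, f1, accuracy)


if __name__ == "__main__":
    # Toy labels, invented for this example only.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    preds = [1, 1, 0, 0, 0, 1, 1, 0]
    print(evaluate(truth, preds))
```

Reporting precision and recall alongside accuracy matters in this setting because the two error types have different costs: a false positive accuses a human reviewer of using AI, while a false negative lets an AI-generated review through.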
The study highlights the feasibility of detecting AI-generated peer reviews using relatively simple yet effective methods like the proposed TF and RR models. The authors emphasize the importance of developing robust detection techniques that can withstand adversarial attacks, particularly as LLMs become increasingly sophisticated.
This research contributes significantly to the field of AI-generated text detection, specifically addressing the novel challenge posed by AI-generated peer reviews. The findings have crucial implications for safeguarding the integrity of the scientific peer-review process, ensuring that published research maintains its rigor and reliability.
The study primarily focused on GPT-4 and GPT-3.5 for generating AI-written reviews. Future research should explore the effectiveness of these methods on reviews generated by other LLMs and investigate their applicability across various scientific domains. Additionally, exploring the detection of partially AI-generated reviews, where reviewers might use AI to expand on human-written bullet points, presents a promising avenue for future work.
Key insights drawn from
by Sandeep Kuma... at arxiv.org 10-15-2024
https://arxiv.org/pdf/2410.09770.pdf
Deeper questions