Key Concepts
GPT-who, a psycholinguistically-aware detector, outperforms state-of-the-art detectors by over 20% using UID-based features.
Summary
The article introduces GPT-who, a novel text detector that uses the Uniform Information Density (UID) principle to distinguish texts generated by Large Language Models (LLMs) from texts written by humans. Using psycholinguistically-aware features, GPT-who outperforms existing detectors such as GLTR, GPTZero, and the OpenAI detector across several benchmark datasets. The method is computationally efficient, interpretable, and able to attribute authorship accurately even when the machine-generated text is otherwise indiscernible from human writing. The study also examines how UID scores are distributed across different LLMs and human-written texts, and finds distinct patterns that aid authorship prediction. Overall, GPT-who is a promising approach, rooted in psycholinguistic theory, to detecting machine-generated text.
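To make the idea of UID-based features concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the GPT-who implementation: this summary does not list the exact feature set, so the sketch uses measures common in the UID literature (mean surprisal, variance of surprisal, and mean squared difference between adjacent tokens). The function names and the plain-list input of per-token probabilities are hypothetical; in practice the probabilities would come from a language model.

```python
import math
from statistics import fmean, pvariance


def surprisals(token_probs):
    """Convert per-token probabilities into surprisals, in bits."""
    return [-math.log2(p) for p in token_probs]


def uid_features(token_probs):
    """Compute simple UID-style features for one text.

    Assumption: token_probs holds the probability a language model
    assigned to each token of the text, in order.
    """
    s = surprisals(token_probs)
    mean_s = fmean(s)
    # Global unevenness: how far surprisals stray from their mean.
    variance = pvariance(s, mu=mean_s)
    # Local unevenness: squared jumps between neighbouring tokens.
    jumps = [(b - a) ** 2 for a, b in zip(s, s[1:])]
    local = fmean(jumps) if jumps else 0.0
    return {
        "mean_surprisal": mean_s,
        "uid_variance": variance,
        "uid_local_sq_diff": local,
    }


# A perfectly even information profile versus an alternating one.
even = uid_features([0.25, 0.25, 0.25, 0.25])
uneven = uid_features([0.5, 0.125, 0.5, 0.125])
```

Both example texts have the same mean surprisal, but only the second shows variance and local jumps. Features like these could then be passed to an ordinary classifier to predict the author.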
Statistics
We evaluate our method using 4 large-scale benchmark datasets and find that GPT-who outperforms state-of-the-art detectors by over 20% across domains.
UID-based measures for all datasets and code are available at https://github.com/saranya-venkatraman/gpt-who.
Quotes
"GPT-who leverages psycholinguistically motivated representations that capture authors’ information signatures distinctly."
"GPT-who offers a more interpretable representation of its detection behavior."
"Our work indicates that psycholinguistically-inspired tools can hold their ground in the age of LLMs."