Combining linguistic features and language model embeddings can effectively distinguish machine-generated text from human-written text, even across unseen language models and domains.
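As a rough illustration of this kind of hybrid detector (a sketch, not the authors' implementation), the snippet below concatenates a few hand-crafted surface features with mean-pooled encoder embeddings and trains a linear classifier; the encoder choice (`roberta-base`), the specific features, and the logistic-regression head are all illustrative assumptions.

```python
# Hedged sketch: combine simple linguistic features with LM embeddings
# for machine-generated text detection. Model, features, and classifier
# are illustrative, not the paper's exact setup.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

def linguistic_features(text: str) -> np.ndarray:
    """Surface features: length, mean word length, type-token ratio, punctuation rate."""
    words = text.split()
    n_words = max(len(words), 1)
    mean_word_len = sum(len(w) for w in words) / n_words
    type_token_ratio = len(set(w.lower() for w in words)) / n_words
    punct_rate = sum(c in ".,;:!?" for c in text) / max(len(text), 1)
    return np.array([n_words, mean_word_len, type_token_ratio, punct_rate], dtype=np.float32)

def embed(text: str) -> np.ndarray:
    """Mean-pooled encoder embedding of the text."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0).numpy()

def featurize(text: str) -> np.ndarray:
    return np.concatenate([linguistic_features(text), embed(text)])

# Toy training data: label 1 = machine-generated, 0 = human-written.
texts = ["An example human-written paragraph ...", "An example model-generated paragraph ..."]
labels = [0, 1]
X = np.stack([featurize(t) for t in texts])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```

Concatenating the two feature views and fitting a lightweight head is one common way such hybrid detectors are built; the actual feature set and classifier used in the work may differ.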
This work addresses the task of detecting the boundary between human-written and machine-generated parts in texts that combine both. The authors evaluate several approaches, including perplexity-based methods, topological data analysis, and fine-tuned language models, on the RoFT and RoFT-chatgpt datasets. They find that perplexity-based classifiers outperform fine-tuned language models in cross-domain and cross-model settings, and analyze the properties of the data that influence the performance of different detection methods.
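As a minimal sketch of one perplexity-based boundary heuristic, the snippet below scores each sentence with a small causal LM and predicts the boundary at the largest drop in perplexity, on the assumption that machine-generated continuations look less surprising to the LM. The GPT-2 scorer, per-sentence scoring, and largest-drop rule are illustrative choices, not the paper's exact classifier.

```python
# Hedged sketch: per-sentence perplexity from a small causal LM, with the
# boundary predicted at the largest consecutive drop in perplexity.
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
lm.eval()

def sentence_perplexity(sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss  # mean token cross-entropy
    return math.exp(loss.item())

def predict_boundary(sentences: list[str]) -> int:
    """Return the index of the first sentence predicted to be machine-generated."""
    ppl = [sentence_perplexity(s) for s in sentences]
    drops = [ppl[i - 1] - ppl[i] for i in range(1, len(ppl))]
    return 1 + max(range(len(drops)), key=drops.__getitem__)

sentences = [
    "The hikers set out before dawn, trading jokes about the weather.",
    "By noon the trail had narrowed to a ribbon of mud.",
    "The mountain is a large landform that rises above the surrounding land.",
    "It is generally steeper than a hill and is often considered important.",
]
print(predict_boundary(sentences))
```

In practice, perplexity-based detectors typically feed the per-sentence scores into a trained classifier rather than a hand-set rule; the heuristic above only illustrates the signal being exploited.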
This work presents a robust and accurate system for detecting machine-generated text produced by multiple generators across different domains.

The performance of minimum Bayes-risk (MBR) decoding varies significantly depending on the sampling method used to generate pseudo-references, and this variation is closely linked to how well the samples approximate the true distribution of references.
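For context, the sketch below shows the basic MBR decoding loop with sampled pseudo-references; the token-level F1 utility and the hard-coded candidate list are stand-in assumptions, since in practice candidates and pseudo-references would be samples drawn from the model with the sampling method under study (ancestral, nucleus, etc.) and the utility would usually be BLEU, chrF, or a learned metric.

```python
# Hedged sketch of minimum Bayes-risk (MBR) decoding with pseudo-references.
# Utility function and candidate lists are illustrative stand-ins.
from collections import Counter

def token_f1(hyp: str, ref: str) -> float:
    """Bag-of-words F1 as a stand-in utility; BLEU or COMET are common in practice."""
    hyp_counts, ref_counts = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((hyp_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

def mbr_decode(candidates: list[str], pseudo_references: list[str]) -> str:
    """Pick the candidate with the highest expected utility over pseudo-references."""
    def expected_utility(cand: str) -> float:
        return sum(token_f1(cand, ref) for ref in pseudo_references) / len(pseudo_references)
    return max(candidates, key=expected_utility)

candidates = [
    "the cat sat on the mat",
    "a cat is sitting on a mat",
    "the dog ran in the park",
]
# Candidates are commonly reused as the pseudo-reference set.
print(mbr_decode(candidates, pseudo_references=candidates))
```

The quality of the pseudo-reference samples directly determines the expected-utility estimate, which is why the choice of sampling method matters so much for MBR performance.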
A novel interface is developed to facilitate the collection of adversarial human-written trivia questions that challenge question-answering AI models, with the goal of improving their natural language understanding and reasoning capabilities.