Core Concepts
Large language models (LLMs) can effectively detect boundaries between human-written and machine-generated content within mixed text sequences.
Abstract
The paper explores the capability of LLMs to detect boundaries in human-machine mixed text. The key points are:
- The task is formulated as a token classification problem, where a "turning point" label marks the boundary between human-written and machine-generated content.
- Experiments are conducted with LLMs known for handling long-range dependencies, such as Longformer, XLNet, and BigBird. The results show that XLNet-large outperforms the other models, and the approach achieved first place in the SemEval'24 competition.
- The paper investigates factors that influence the boundary detection performance of LLMs, including:
- Incorporating additional layers (LSTM, BiLSTM, CRF) on top of LLMs
- Utilizing segment-based loss functions (BCE-dice loss, Combo loss, BCE-MAE loss) to better capture the transition between segments
- Pretraining the LLM on related tasks (sentence-level boundary detection, binary human-machine text classification) before fine-tuning on the target task
- The findings provide valuable insights for future research on improving LLMs' ability to detect boundaries within human-machine mixed text.
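The token classification formulation above reduces to locating the first token labeled as machine-generated. A minimal sketch of that post-processing step, assuming per-token binary predictions (0 = human-written, 1 = machine-generated); the helper name `find_boundary` is illustrative, not from the paper:

```python
def find_boundary(token_labels):
    """Return the index of the first machine-labeled token (1),
    i.e. the turning point between human and machine text.

    If no machine token is found, the whole text is treated as
    human-written and len(token_labels) is returned.
    """
    for i, label in enumerate(token_labels):
        if label == 1:
            return i
    return len(token_labels)

# Example: 5 human-written tokens followed by 3 machine-generated ones
labels = [0, 0, 0, 0, 0, 1, 1, 1]
print(find_boundary(labels))  # → 5
```

In practice the per-token labels would come from a fine-tuned model such as XLNet-large; the sketch only shows how a boundary index is read off a label sequence.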
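The segment-based losses mentioned above combine a per-token term with a segment-overlap term. A hedged sketch of one of them, BCE-dice loss, in NumPy (the paper's actual implementation details, weighting, and smoothing constants are not specified here; `eps` is an assumed smoothing term):

```python
import numpy as np

def bce_dice_loss(probs, targets, eps=1e-7):
    """BCE-dice loss over per-token machine-text probabilities.

    probs   : 1-D array of predicted probabilities in (0, 1)
    targets : 1-D array of gold labels (0 = human, 1 = machine)

    The BCE term penalizes each token independently; the Dice term
    rewards overlap between the predicted and gold machine segments,
    which emphasizes getting the segment transition right.
    """
    probs = np.clip(probs, eps, 1 - eps)
    bce = -np.mean(targets * np.log(probs)
                   + (1 - targets) * np.log(1 - probs))
    dice = (2 * np.sum(probs * targets) + eps) / (
        np.sum(probs) + np.sum(targets) + eps)
    return bce + (1 - dice)
```

A near-perfect prediction yields a loss close to 0, while an uncertain all-0.5 prediction is penalized by both terms; Combo loss and BCE-MAE loss follow the same pattern of pairing BCE with a second, segment-aware term.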
Stats
The dataset consists of 3,649 training cases and 505 development cases, with an average text length of 263 and 230 words, respectively.
The maximum text length is 1,397 words in the training set and 773 words in the development set.
The average boundary index is 71 in the training set and 68 in the development set.
Quotes
"The objective is to accurately determine the transition point between the human-written and LLM-generated sections."
"Notably, by leveraging an ensemble of multiple LLMs to harness the robust of the model, we achieved first place in Task 8 of SemEval'24 competition."
"Our experiments indicate that optimizing these factors can lead to significant enhancements in boundary detection performance."