Tang, Zixin, & van Hell, Janet G. (2024). Learning to Write Rationally: How Information Is Distributed in Non-Native Speakers’ Essays. arXiv preprint arXiv:2411.03550v1 [cs.CL].
This research investigates how non-native English speakers with diverse native language backgrounds distribute information in their L2 English essays and how these patterns relate to their L2 proficiency.
The study analyzed essays written by L2 English learners, drawn from the TOEFL11 corpus, and essays written by native English speakers, drawn from the ICNALE corpus. Using the GPT-2 language model, the researchers extracted three information-based metrics: surprisal, entropy, and Uniform Information Density (UID) score. Linear mixed-effects models and ANOVA analyses were used to examine how these metrics relate to L1 background and L2 proficiency.
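To make the three metrics concrete, the following is a minimal sketch of how they could be computed once a language model has assigned a next-token probability distribution at each position. It does not reproduce the authors' pipeline: the toy distributions stand in for GPT-2 output, the function names are hypothetical, and the UID score is approximated here as the variance of per-token surprisal, which is one common operationalization and is an assumption, since this summary does not give the paper's exact formula.

```python
import math
from statistics import pvariance


def surprisal(p):
    """Surprisal, in bits, of an observed token that had probability p."""
    return -math.log2(p)


def entropy(dist):
    """Shannon entropy, in bits, of a next-token probability distribution."""
    return -sum(q * math.log2(q) for q in dist if q > 0)


def uid_variance(surprisals):
    """Assumed UID proxy: population variance of per-token surprisal.

    Lower values mean information is spread more evenly across tokens.
    This definition is an assumption, not taken from the paper.
    """
    return pvariance(surprisals)


# Toy stand-in for language model output (not real GPT-2 probabilities):
# each step is (distribution over a 3-word vocabulary, index of observed token).
steps = [
    ([0.5, 0.25, 0.25], 0),
    ([0.25, 0.5, 0.25], 1),
    ([0.125, 0.125, 0.75], 0),
]

token_surprisals = [surprisal(dist[i]) for dist, i in steps]
step_entropies = [entropy(dist) for dist, _ in steps]
uid_score = uid_variance(token_surprisals)

print("surprisal per token:", token_surprisals)
print("entropy per step:", step_entropies)
print("surprisal variance (UID proxy):", uid_score)
```

In this toy sequence the third token is much less predictable than the first two, so the surprisal variance is well above zero; a writer who distributed information perfectly evenly would score zero under this proxy.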
The study suggests that while L2 learners acquire more native-like information distribution patterns with increasing proficiency, the ability to distribute information evenly appears to be a more general language production skill, less influenced by L1 background or L2 proficiency.
This research contributes to a deeper understanding of L2 writing development and the cognitive mechanisms underlying information distribution in language production. It highlights the potential of computational linguistics methods for analyzing and assessing L2 writing.
Limitations include the lack of detailed information on language background and experience in the dataset and the potential underestimation of local fluctuations in information distribution. Future research could explore the relationship between computational metrics and traditional linguistic features and investigate the impact of specific language learning experiences on information distribution patterns.
Key insights distilled from: Zixin Tang, ... at arxiv.org, 11-07-2024
https://arxiv.org/pdf/2411.03550.pdf