MELA is a comprehensive benchmark for linguistic acceptability judgment, covering 10 languages with 48K samples. The study examines cross-lingual transfer and the effects of fine-tuning on Large Language Models (LLMs), finding that GPT-4 performs comparably to fine-tuned XLM-R and that in-language training data is essential for accurate acceptability judgments. Probing experiments further show that training on MELA improves the models' syntactic capacity, yielding gains on syntax-related tasks. The dataset is intended to support research on multilingual language models and the acquisition of syntactic competence.
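To make the fine-tuning setup concrete, below is a minimal sketch of training XLM-R as a binary acceptability classifier, assuming the standard Hugging Face sequence-classification recipe. The example sentences, field names, and hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: fine-tune XLM-R for binary acceptability judgment.
# Dataset contents and hyperparameters are illustrative, not the
# paper's exact setup.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)
from datasets import Dataset

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-large",
    num_labels=2,  # 0 = unacceptable, 1 = acceptable
)

# Hypothetical in-language sentence/label pairs; MELA supplies such
# examples across its ten languages.
train = Dataset.from_dict({
    "sentence": ["The cat sat on the mat.", "Cat the mat on sat the."],
    "label": [1, 0],
})

def tokenize(batch):
    # Truncate/pad each sentence to a fixed length for batching.
    return tokenizer(
        batch["sentence"], truncation=True, padding="max_length", max_length=128
    )

train = train.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="mela-xlmr",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

Trainer(model=model, args=args, train_dataset=train).train()
```

The same classifier head could then be evaluated per language to probe cross-lingual transfer, e.g. by fine-tuning on one language's data and testing on another's.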