The paper presents PORTULAN ExtraGLUE, a benchmark for Portuguese consisting of machine-translated versions of well-known English language understanding tasks, covering single-sentence, similarity, inference, question-answering, and reasoning tasks. The authors discuss the challenges and limitations of using machine translation to create these datasets, such as pronoun resolution, gendered nouns, and named-entity translation.
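The translated tasks follow the usual Hugging Face dataset layout, so loading one is straightforward. Below is a minimal sketch using the `datasets` library; the dataset ID `PORTULAN/extraglue` and the configuration name `rte_pt-PT` are illustrative assumptions, so check the hub for the actual identifiers.

```python
# Minimal sketch: load one machine-translated task from the benchmark.
# The repo ID and config name below are assumed, not confirmed by the paper.
from datasets import load_dataset

dataset = load_dataset("PORTULAN/extraglue", "rte_pt-PT")  # hypothetical config
print(dataset["train"][0])  # e.g. a premise/hypothesis pair with a label
```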
To validate the datasets, the authors fine-tune the Albertina language model, a state-of-the-art open encoder for Portuguese, with low-rank adaptation (LoRA) on 10 of the PORTULAN ExtraGLUE tasks. The resulting fine-tuned models are made available as baselines for future research; a sketch of this setup follows.
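For readers unfamiliar with LoRA, the idea is to freeze the base encoder and train only small low-rank update matrices injected into its weights. The sketch below uses the `peft` library; the model ID `PORTULAN/albertina-ptpt` and all hyperparameter values are assumptions for illustration, not the paper's actual configuration.

```python
# Minimal sketch of LoRA fine-tuning an encoder on a sequence-classification
# task. Model ID and LoRA hyperparameters are assumed values, not the paper's.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

model_id = "PORTULAN/albertina-ptpt"  # hypothetical hub ID for Albertina
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# LoRA wraps the frozen base model with small trainable low-rank matrices,
# so only a tiny fraction of the parameters are updated during fine-tuning.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,             # rank of the low-rank update (assumed value)
    lora_alpha=16,   # scaling factor for the update (assumed value)
    lora_dropout=0.1,
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports trainable vs. total parameters
```

From here, the wrapped model can be trained with a standard `transformers` `Trainer` loop on the tokenized task data.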
The authors compare the performance of the Albertina LoRA models on the PORTULAN ExtraGLUE datasets with that of the multilingual XLM-RoBERTa-XL model and the English DeBERTa-V2-XXLarge model. While the Albertina LoRA models lag behind the English model, they outperform the multilingual one, demonstrating the benefits of a monolingual model for Portuguese.
The authors acknowledge the limitations of machine-translated datasets and call for future work to improve the benchmark through manual curation and the development of new datasets from scratch to better reflect the Portuguese language and its cultural nuances.
Source: https://arxiv.org/pdf/2404.05333.pdf (arxiv.org, 04-09-2024)