Core Concepts
A single massively multilingual neural machine translation model that leverages cross-lingual transfer learning to achieve significant improvements in translation quality for 200 languages, including low-resource languages.
Abstract
The paper presents a novel neural machine translation (NMT) system that scales to 200 languages, including low-resource languages. The key points are:
The development of neural techniques has opened up new avenues for research in machine translation, enabling highly multilingual capacities and even zero-shot translation.
However, high-quality NMT requires large volumes of parallel bilingual data, which are not equally available across the world's 7,000+ languages. As a result, research has concentrated on improving translation for a small group of high-resource languages, exacerbating digital inequities.
To address this, the authors introduce "No Language Left Behind" - a single massively multilingual model that leverages transfer learning across languages.
The model is based on the Sparsely Gated Mixture of Experts architecture and is trained on data obtained using new bitext-mining techniques tailored for low-resource languages (see the sketches after this list).
The authors also devised multiple architectural and training improvements to counteract overfitting while training on thousands of tasks.
The model was evaluated using an automatic benchmark (FLORES-200), a human evaluation protocol (XSTS), and a toxicity detector, and achieved an average 44% improvement in translation quality, as measured by BLEU, over previous state-of-the-art models.
By demonstrating how to scale NMT to 200 languages and making all contributions freely available for non-commercial use, the work lays important groundwork for the development of a universal translation system.
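To make the Sparsely Gated Mixture of Experts idea concrete, below is a minimal sketch of such a layer in PyTorch. It illustrates the general technique only, not the paper's actual implementation: the top-2 routing, layer sizes, and class name are assumptions for demonstration, and production systems add load-balancing losses and expert-capacity limits that are omitted here.

```python
# Minimal sketch of a sparsely gated Mixture-of-Experts feed-forward layer.
# Illustrative only; sizes and top-2 gating are assumptions, not NLLB's config.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoELayer(nn.Module):  # hypothetical class name
    def __init__(self, d_model: int, d_ff: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # One small feed-forward network per expert.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )
        # The gate scores each token against every expert.
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model) -- flatten batch/sequence dims before calling.
        scores = self.gate(x)                                # (tokens, experts)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)  # route each token to k experts
        weights = F.softmax(top_vals, dim=-1)                # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e                 # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out


# Each token activates only top_k experts, so total capacity grows with
# num_experts while per-token compute stays roughly constant.
tokens = torch.randn(16, 512)
layer = SparseMoELayer(d_model=512, d_ff=2048, num_experts=8, top_k=2)
print(layer(tokens).shape)  # torch.Size([16, 512])
```

This sparsity is what makes scaling to thousands of translation tasks tractable: adding experts increases model capacity without a proportional increase in the cost of each forward pass.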
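The mining side typically works by scoring candidate sentence pairs in a shared multilingual embedding space. The toy sketch below uses the well-known ratio-margin criterion of Artetxe and Schwenk (2019), which the NLLB pipeline builds on; the random vectors stand in for real encoder outputs (the paper uses LASER3 embeddings, not reimplemented here), and the function name and parameters are illustrative assumptions.

```python
# Toy sketch of margin-based bitext mining over sentence embeddings.
# Random vectors stand in for a real multilingual encoder's outputs.
import numpy as np


def margin_scores(src_emb, tgt_emb, k=4):  # hypothetical helper
    # Normalize rows so dot products are cosine similarities.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sim = src @ tgt.T                                    # (n_src, n_tgt) cosines
    # Mean similarity to each sentence's k nearest neighbors, used to
    # penalize "hub" sentences that sit close to everything.
    knn_src = np.sort(sim, axis=1)[:, -k:].mean(axis=1)  # per source row
    knn_tgt = np.sort(sim, axis=0)[-k:, :].mean(axis=0)  # per target column
    # Ratio margin: raw cosine divided by the mean neighborhood similarity.
    return sim / ((knn_src[:, None] + knn_tgt[None, :]) / 2)


rng = np.random.default_rng(0)
src, tgt = rng.normal(size=(5, 16)), rng.normal(size=(6, 16))
scores = margin_scores(src, tgt)
# Candidate pairs: align each source sentence to its best-scoring target.
print(scores.argmax(axis=1))
```

Margin scoring matters most for low-resource languages, where raw cosine similarity alone tends to surface spurious matches from the large, noisy candidate pools that web-scale mining produces.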
Stats
The model was evaluated on 40,000 translation directions (with 200 languages, every ordered source-target pair yields 200 × 199 ≈ 40,000 directions).
The model achieved an average 44% improvement in translation quality, as measured by BLEU, relative to previous state-of-the-art models.
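As a concrete illustration of these stats, the snippet below computes corpus BLEU with the sacrebleu library and shows that the 44% figure is a relative improvement in BLEU score, not a 44-point gain. The example sentences and the 18.0 and 25.9 scores are invented for demonstration, not taken from the paper.

```python
# Corpus BLEU via sacrebleu, plus the arithmetic behind a "44% improvement".
import sacrebleu

hypotheses = ["the cat sat on the mat"]
references = [["the cat sat on the mat"]]  # outer list: one entry per reference stream
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")


def relative_improvement(new_bleu: float, old_bleu: float) -> float:
    """Percentage improvement of new_bleu over old_bleu (both example values)."""
    return 100.0 * (new_bleu - old_bleu) / old_bleu


# E.g. moving from 18.0 to 25.9 BLEU is roughly a 44% relative improvement.
print(f"{relative_improvement(25.9, 18.0):.0f}%")
```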
Quotes
"To break this pattern, here we introduce No Language Left Behind—a single massively multilingual model that leverages transfer learning across languages."
"By demonstrating how to scale NMT to 200 languages and making all contributions in this effort freely available for non-commercial use, our work lays important groundwork for the development of a universal translation system."