Core Concepts
Multilingual language models exhibit varying degrees of cross-lingual transfer performance and robustness to adversarial perturbations, depending on the linguistic relationship between the high-resource and low-resource languages in each pair.
Abstract
The study investigates the cross-lingual transfer capabilities and robustness of two well-known multilingual language models, MBERT and XLM-R, on Named Entity Recognition (NER) and Section Title Prediction tasks across 13 language pairs. The language pairs were selected to have varying degrees of vocabulary overlap due to areal, genetic, or borrowing relationships.
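For context, below is a minimal, hypothetical sketch of the zero-shot cross-lingual transfer paradigm such studies use: fine-tune a multilingual encoder on HRL task data, then evaluate it directly on LRL test data. The model name is real (xlm-roberta-base on the Hugging Face Hub), but the label set is an illustrative assumption, not the paper's exact configuration.

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Illustrative BIO label set; the paper's tag inventory may differ.
labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=len(labels)
)

# In the real experiments the model is first fine-tuned on HRL NER data
# (e.g. via transformers.Trainer); the classification head here is
# randomly initialized, so its predictions are meaningless until then.
sentence = "Ada Lovelace was born in London."  # stand-in for an LRL test sentence
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

pred_tags = [labels[i] for i in logits.argmax(dim=-1)[0].tolist()]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
print(list(zip(tokens, pred_tags)))
```

The key point of the paradigm is that the LRL test sentences are never seen during fine-tuning, so any performance must come from representations shared across languages.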
The key findings are:
NER cross-lingual transfer performance correlates strongly with the degree of vocabulary overlap between the high-resource language (HRL) and the low-resource language (LRL). Perturbing named entities so that the test data contains only non-overlapping words has a statistically very significant impact on model performance (see the sketch after this list).
While cross-lingual transfer models generally score lower than models trained in a native LRL setting, they are often more robust to certain types of input perturbation, such as changing the context words around named entities.
Section title prediction, as a proxy for document classification, appears to rely heavily on word memorization in LRLs, with performance degrading significantly when common words are substituted, even with semantically similar replacements.
The results suggest that multilingual models may be encoding biases toward high-resource languages, and that their performance on low-resource languages is sensitive to minor changes in the input, highlighting the need for more equitable consideration of diverse languages in NLP.
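To make the overlap and perturbation findings concrete, here is a minimal, hypothetical sketch, not the paper's code: it computes a type-level Jaccard overlap between an HRL and an LRL vocabulary, and perturbs named entities by substituting words the two vocabularies do not share. Whitespace tokenization, the Jaccard measure, and all function names are simplifying assumptions.

```python
import random

def vocab_overlap(hrl_tokens: list[str], lrl_tokens: list[str]) -> float:
    """Jaccard overlap between the two type-level vocabularies."""
    hrl, lrl = set(hrl_tokens), set(lrl_tokens)
    return len(hrl & lrl) / len(hrl | lrl)

def perturb_entities(tokens, entity_spans, lrl_vocab, hrl_vocab, rng=random):
    """Replace each named-entity token with a random LRL-only word, so
    the perturbed test data contains no words shared with the HRL."""
    non_overlapping = sorted(lrl_vocab - hrl_vocab)
    out = list(tokens)
    for start, end in entity_spans:
        for i in range(start, end):
            out[i] = rng.choice(non_overlapping)
    return out

# Toy corpora: one shared type ("bank") between the two vocabularies.
hrl = "the bank of the river was steep".split()
lrl = "die bank am fluss war steil".split()
print(f"overlap = {vocab_overlap(hrl, lrl):.2f}")

# Entity spans: "angela merkel" (PER) and "berlin" (LOC).
tokens = "angela merkel visited berlin".split()
print(perturb_entities(tokens, [(0, 2), (3, 4)], set(lrl), set(hrl)))
```

Under the paper's findings, NER transfer performance tracks the first quantity, and removing overlapping words from entities via the second operation degrades it significantly.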
Stats
"Perturbing named entities so that the test data contains only non-overlapping words has a statistically very significant impact on model performance."
"Cross-lingual transfer models are often somewhat more robust to certain types of perturbations of the input."
"Title selection, as a proxy for document classification, in LRLs appears to heavily rely on word memorization."
Quotes
"There is a pronounced effect of vocabulary overlap on NER performance."
"Although models utilizing cross-lingual transfer typically exhibit lower numerical performance than models trained in a native LRL setting, they are often somewhat more robust to certain types of perturbations of the input."
"Title selection, as a proxy for document classification, in LRLs appears to heavily rely on word memorization."