Core Concepts
Cross-lingual transfer learning using character-level neural conditional random fields can significantly improve named entity recognition performance in low-resource settings.
Abstract
The paper presents a novel cross-lingual architecture for named entity recognition (NER) that leverages character-level neural conditional random fields (CRFs) to enable transfer learning across related languages.
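To make the architecture concrete, here is a minimal character-level neural CRF sketch in PyTorch. The class name, layer sizes, and the choice of using the final BiLSTM state as the word vector are illustrative assumptions, not the authors' exact configuration; the intent is only to show how character-level word encodings feed CRF emission and transition scores.

```python
# Minimal character-level neural CRF sketch (illustrative assumptions throughout).
import torch
import torch.nn as nn

class CharNeuralCRF(nn.Module):
    def __init__(self, n_chars, n_tags, char_dim=32, hidden=64):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)          # shared across languages
        self.char_lstm = nn.LSTM(char_dim, hidden,
                                 bidirectional=True, batch_first=True)
        self.emit = nn.Linear(2 * hidden, n_tags)                # per-word emission scores
        self.trans = nn.Parameter(torch.zeros(n_tags, n_tags))   # tag-to-tag transitions

    def word_scores(self, sent_char_ids):
        # sent_char_ids: list of 1-D LongTensors, one per word (character ids).
        reps = []
        for chars in sent_char_ids:
            out, _ = self.char_lstm(self.char_emb(chars).unsqueeze(0))
            reps.append(out[0, -1])        # last BiLSTM state as the word vector
        return self.emit(torch.stack(reps))                      # (n_words, n_tags)

    def log_partition(self, emissions):
        # Forward algorithm: log-sum over all tag sequences.
        alpha = emissions[0]
        for t in range(1, emissions.size(0)):
            alpha = torch.logsumexp(alpha.unsqueeze(1) + self.trans, dim=0) + emissions[t]
        return torch.logsumexp(alpha, dim=0)

    def nll(self, sent_char_ids, tags):
        # Negative log-likelihood of the gold tag sequence under the CRF.
        emissions = self.word_scores(sent_char_ids)
        gold = emissions[torch.arange(len(tags)), tags].sum()
        gold = gold + self.trans[tags[:-1], tags[1:]].sum()
        return self.log_partition(emissions) - gold
```

Because every parameter is defined over characters and tags rather than a language-specific word vocabulary, the same network can in principle be trained on any language that shares the character inventory, which is what the cross-lingual setup exploits.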
The key insights are:
In low-resource settings, traditional log-linear CRFs outperform neural CRFs, as neural models require large amounts of training data.
By incorporating cross-lingual information through shared character-level representations, the neural CRF model can learn general entity representations that transfer effectively to low-resource target languages (a minimal sketch of this sharing follows the list).
Experiments on 15 diverse languages show that the cross-lingual neural CRF approach can improve F1 scores by up to 9.8 points over the log-linear CRF baseline in low-resource settings.
The character-level neural architecture is able to abstract the notion of named entities across related languages, enabling effective cross-lingual transfer learning.
While a gap remains between high-resource performance and what transfer achieves in the low-resource setting, the results demonstrate that the cross-lingual neural CRF approach is a viable way to reduce the data requirements of state-of-the-art NER models.
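The sketch below illustrates the cross-lingual ingredient: one character vocabulary shared across source and target languages, so character embeddings trained on the high-resource source are reused on the target. The toy corpora and the two-stage schedule in the comments are assumptions for illustration; the paper's exact training regime may differ (e.g., joint multilingual training rather than pretrain-then-fine-tune).

```python
# Shared character vocabulary across languages (illustrative data).

def build_joint_char_vocab(corpora):
    """Map every character seen in any language to one shared id."""
    vocab = {"<unk>": 0}
    for sentences in corpora.values():
        for sent in sentences:
            for ch in sent:
                vocab.setdefault(ch, len(vocab))
    return vocab

corpora = {
    "es": ["García visitó Madrid ."],   # high-resource source (10,000 sents in the paper)
    "gl": ["García visitou Vigo ."],    # low-resource target (100 sents in the paper)
}
vocab = build_joint_char_vocab(corpora)

# Related languages share most of their alphabet, so the target mostly reuses
# ids (and hence embeddings) already trained on the source:
target_chars = {ch for s in corpora["gl"] for ch in s}
shared = {ch for s in corpora["es"] for ch in s} & target_chars
print(f"{len(shared)}/{len(target_chars)} target characters also appear in the source")

# One plausible transfer schedule (assumption, in pseudocode):
#   1. train the character-level neural CRF on the related source language(s)
#   2. continue training on the ~100 target sentences, keeping the shared
#      character embeddings and encoder weights
```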
Stats
The experiments cover 15 languages: five low-resource target languages (Galician, West Frisian, Ukrainian, Marathi, and Tagalog) and ten related source languages (Spanish, Catalan, Italian, French, Romanian, Dutch, Russian, Cebuano, Hindi, and Urdu).
For each target language, the authors create a low-resource setting with 100 training sentences and a high-resource setting with 10,000 training sentences. The source languages also have 10,000 training sentences each.
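For reference, the reported numbers are NER F1, and "9.8 points" means 9.8 absolute F1 points. Below is a minimal span-level F1 computation of the kind standardly used for NER, assuming entities are represented as (start, end, type) spans; the paper's exact evaluation script is not specified here, so this is a generic illustration.

```python
# Span-level NER F1 (generic illustration; exact-match on (start, end, type)).

def span_f1(gold_spans, pred_spans):
    gold, pred = set(gold_spans), set(pred_spans)
    tp = len(gold & pred)                      # exact span-and-type matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = [(0, 1, "PER"), (3, 4, "LOC")]
pred = [(0, 1, "PER"), (3, 4, "ORG")]          # one entity with the wrong type
print(span_f1(gold, pred))                     # 0.5
```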
Quotes
"Learning character representations for multiple related languages allows transfer among the languages, improving F1 by up to 9.8 points over a log-linear CRF baseline."
"We show empirically that this improves the quality of the resulting model."
"With experiments on 15 languages, we confirm that feature-based CRFs outperform the neural methods consistently in the low-resource training scenario. However, with the addition of cross-lingual information, the tables turn and the neural methods are again on top, demonstrating that cross-lingual supervision is a viable method to reduce the training data state-of-the-art neural approaches require."