This paper introduces a novel approach called Multilingual Translation-Augmented BERT (MTAB) for zero-shot cross-lingual stance detection. The key highlights are:
MTAB combines two techniques, translation augmentation and adversarial language adaptation, to improve a cross-lingual classifier when no labeled training data is available for the target languages.
The translation augmentation component expands the English training dataset by incorporating translations into the target languages (French, German, Italian), enabling the model to learn common patterns, sentiment expressions and stance cues that transcend linguistic boundaries.
The adversarial language adaptation component further adapts the multilingual encoder to the target languages by leveraging unlabeled data, while preserving the information learned from the English training data.
Experiments on vaccine hesitancy datasets in four languages (English, French, German, Italian) show that MTAB outperforms a strong baseline model as well as ablated versions of the proposed model.
The translation-augmented data and the adversarial learning component are shown to be key contributors to the improved performance of the MTAB model.
This work establishes a benchmark and opens up a novel research direction into zero-shot cross-lingual stance detection, which is critical for lesser-resourced languages where labeled data is scarce or unavailable.
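The translation augmentation step described above can be illustrated with a minimal sketch. It assumes the labeled English data is a list of (text, label) pairs and that a translation function is supplied by the caller. The names `augment_with_translations` and `toy_translate`, the language codes, and the stance labels are all hypothetical and do not come from the paper; `toy_translate` is only a stand-in for a real machine translation system.

```python
from typing import Callable, List, Tuple

# (text, stance_label, language_code); this layout is an assumption for illustration.
Example = Tuple[str, str, str]


def augment_with_translations(
    english_data: List[Tuple[str, str]],
    target_langs: List[str],
    translate: Callable[[str, str], str],
) -> List[Example]:
    """Keep each English example and add one translated copy per target language.

    The stance label is copied unchanged to every translation, which is what
    lets the classifier see labeled text in languages that have no annotations.
    """
    augmented: List[Example] = []
    for text, label in english_data:
        augmented.append((text, label, "en"))
        for lang in target_langs:
            augmented.append((translate(text, lang), label, lang))
    return augmented


def toy_translate(text: str, lang: str) -> str:
    # Hypothetical placeholder: a real pipeline would call an MT system here.
    return f"[{lang}] {text}"


# Hypothetical examples and labels, not taken from the paper's datasets.
english_data = [
    ("Vaccines are safe.", "favor"),
    ("I will not get the shot.", "against"),
]
augmented = augment_with_translations(english_data, ["fr", "de", "it"], toy_translate)
```

Each English example yields one original plus three translated copies, so the augmented set is four times the size of the English set. The adversarial language adaptation step is not covered by this sketch.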
Key Insights Distilled From
by Bharathi A,A... at arxiv.org 04-23-2024
https://arxiv.org/pdf/2404.14339.pdf