The core goal is to identify and analyze fine-grained information differences between paragraphs written in different languages.
This paper introduces a new cross-lingual Natural Language Inference (NLI) dataset for Basque, a low-resource language, and analyzes how different cross-lingual strategies and data sources affect the performance of NLI models for Basque.
This paper presents a system developed for SemEval-2024 Task 1: Semantic Textual Relatedness (STR), Track C: Cross-lingual. The task is to detect the semantic relatedness between two sentences in a given target language without access to direct supervision. The authors investigate different source-language selection strategies with two pre-trained language models: XLM-R and FURINA.
Leveraging word alignment models to explicitly align semantically equivalent words between high-resource and low-resource languages can enhance cross-lingual sentence embeddings, particularly for low-resource languages.
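As a rough illustration of this idea, the sketch below computes an alignment loss that pulls the embeddings of aligned word pairs together. The embeddings, the word pairs, and the squared-distance loss are toy assumptions for the example, not the paper's actual model or training objective; in practice the embeddings would come from a multilingual encoder and the pairs from a word alignment model.

```python
import numpy as np

# Toy embeddings for a high-resource (English) and low-resource (Basque)
# sentence. Random vectors stand in for real encoder outputs (hypothetical
# data, for illustration only).
rng = np.random.default_rng(0)
dim = 8
en_words = {"dog": rng.normal(size=dim), "runs": rng.normal(size=dim)}
eu_words = {"txakurra": rng.normal(size=dim), "korrika": rng.normal(size=dim)}

# Word-alignment pairs (in practice produced by an alignment model; the
# pairs below are assumed for the example).
alignments = [("dog", "txakurra"), ("runs", "korrika")]

def alignment_loss(en, eu, pairs):
    """Mean squared distance between aligned word embeddings.

    Minimizing this pulls semantically equivalent words together across
    languages, which is the intuition behind alignment-based training of
    cross-lingual embeddings."""
    return float(np.mean([np.sum((en[a] - eu[b]) ** 2) for a, b in pairs]))

loss = alignment_loss(en_words, eu_words, alignments)
```

With random, unaligned embeddings the loss is positive; driving it toward zero makes the aligned pairs coincide in the embedding space.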
Contextual Label Projection (CLaP) is a novel approach that leverages contextual machine translation to accurately translate labels while preserving their association with the translated input text, leading to improved performance in cross-lingual structured prediction tasks.
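To make the idea concrete, here is a minimal sketch of contextual label projection: translate the input first, then translate each label conditioned on the translated text, so the projected label's surface form actually occurs in the target-language sentence. Both translate callables below are hypothetical stubs standing in for an MT system or LLM; this is a sketch of the general scheme, not the exact CLaP implementation.

```python
def clap_project(text, labels, translate_text, translate_label_in_context):
    """Sketch of contextual label projection.

    First translate the input, then translate each label conditioned on
    the translated input, preserving the label's association with the
    target-language text. The two translate callables are hypothetical
    stand-ins for a real translation model."""
    tgt_text = translate_text(text)
    tgt_labels = [translate_label_in_context(lbl, tgt_text) for lbl in labels]
    return tgt_text, tgt_labels

# Toy stubs (hard-coded lookups, for illustration only).
def translate_text(s):
    return {"The dog barks.": "Der Hund bellt."}[s]

def translate_label_in_context(label, context):
    # Choose the target-language form of the label and check it appears
    # in the translated context -- the association CLaP aims to preserve.
    candidate = {"dog": "Hund"}[label]
    assert candidate in context
    return candidate

tgt, lbls = clap_project("The dog barks.", ["dog"],
                         translate_text, translate_label_in_context)
# tgt == "Der Hund bellt.", lbls == ["Hund"]
```

The key design point is that labels are never translated in isolation: the translated input is passed as context, which is what keeps span labels consistent with the translated sentence.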
The proposed Lottery Ticket Prompt-learning (LTP) framework selectively updates a subset of the model's parameters during prompt-learning, effectively adapting small language models to cross-lingual tasks, especially for low-resource languages.
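One way to sketch lottery-ticket-style parameter selection is a magnitude-change heuristic: keep only the parameters that moved most during a preliminary fine-tune and freeze the rest. The criterion below is an assumption for illustration and may differ from LTP's actual selection rule.

```python
import numpy as np

def select_lottery_subset(pretrained, finetuned, ratio=0.2):
    """Return a boolean mask over parameters, marking the top `ratio`
    fraction by absolute change between a pretrained and a preliminarily
    fine-tuned checkpoint. Only the masked parameters would then be
    trained (together with the prompt); the rest stay frozen.

    Illustrative heuristic only -- not necessarily LTP's criterion."""
    delta = np.abs(finetuned - pretrained)
    k = max(1, int(ratio * delta.size))
    thresh = np.partition(delta.ravel(), -k)[-k]
    return delta >= thresh  # True = trainable, False = frozen
```

Restricting updates to this small subset is what makes adaptation cheap enough for small models and low-resource settings.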