Core Concepts
This shared task investigates the broader phenomenon of semantic textual relatedness across 14 languages, including low-resource languages from Africa and Asia, and provides datasets, baselines, and evaluation of participating systems.
Summary
This shared task focused on semantic textual relatedness (STR): the degree to which two linguistic units (e.g., sentences) are close in meaning. Earlier shared tasks concentrated mainly on the narrower notion of semantic textual similarity (STS).
The key highlights and insights are:
- The task covered 14 languages from 5 distinct language families, predominantly spoken in Africa and Asia, regions characterized by limited NLP resources.
- The datasets were created by annotating sentence pairs using Best-Worst Scaling, capturing a range of relatedness scores from 0 (completely unrelated) to 1 (maximally related).
- The task included three main tracks: (a) supervised, (b) unsupervised, and (c) crosslingual, attracting 163 participants and 70 final submissions from 51 different teams.
- The top-performing systems used a variety of approaches, including data augmentation, ensemble methods, and leveraging language-specific features, though the methods did not perform equally well across all languages.
- The results show that determining semantic textual relatedness is a non-trivial task, with performance varying across languages, regardless of resource availability.
- The task provides a valuable benchmark for the community to further explore semantic relatedness in multilingual settings, especially for low-resource languages.
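The Best-Worst Scaling annotation mentioned above can be made concrete with a small sketch. The summary does not spell out the scoring procedure, so the code below assumes the commonly used counting scheme: each annotation shows a small tuple of sentence pairs, the annotator marks the most related and the least related pair, and a pair's raw score is the share of times it was picked as best minus the share of times it was picked as worst. The linear rescaling from [-1, 1] to [0, 1] is also an assumption, chosen to match the 0 to 1 range described here. The function name `bws_scores` and the identifiers `p1` to `p4` are hypothetical.

```python
from collections import Counter


def bws_scores(annotations):
    """Turn Best-Worst Scaling judgments into scores in [0, 1].

    `annotations` is an iterable of (items, best, worst) triples, where
    `items` is the tuple of sentence-pair ids shown to the annotator and
    `best` / `worst` are the ids judged most / least related.
    Assumption: counting-based scoring with linear rescaling.
    """
    best_counts = Counter()
    worst_counts = Counter()
    seen_counts = Counter()

    for items, best, worst in annotations:
        if best not in items or worst not in items:
            raise ValueError("best and worst must be among the shown items")
        seen_counts.update(items)
        best_counts[best] += 1
        worst_counts[worst] += 1

    scores = {}
    for item, n_seen in seen_counts.items():
        # Raw score lies in [-1, 1]: fraction best minus fraction worst.
        raw = (best_counts[item] - worst_counts[item]) / n_seen
        # Rescale linearly so 0 means least related and 1 most related.
        scores[item] = (raw + 1) / 2
    return scores


# Two toy annotations over the same four hypothetical sentence pairs.
example_annotations = [
    (("p1", "p2", "p3", "p4"), "p1", "p4"),
    (("p1", "p2", "p3", "p4"), "p1", "p3"),
]
example_scores = bws_scores(example_annotations)
```

In this toy input, `p1` is always chosen as most related and ends up with the highest score, `p2` is never chosen and lands in the middle, and `p3` and `p4` are each chosen once as least related and share the lowest score. Real annotation campaigns use many more tuples per item so that the resulting scores are finer grained.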
Statistics
"We received 70 submissions in total (across all tasks) from 51 different teams, and 38 system description papers."
"The task attracted 163 participants."
Quotes
"Two units may be related in a variety of different ways (e.g., by expressing the same view, originating from the same time period, elaborating on each other, etc.). On the other hand, semantic textual similarity (STS) considers only a narrow view of the relationship that may exist between texts (such as equivalence or paraphrase) which does not incorporate other dimensions of relatedness such as entailment, topic or view similarity, or temporal relations."
"Prior shared tasks (Agirre et al., 2012, 2013, 2014, 2015, 2016; Cer et al., 2017a) have mainly focused on textual similarity. In this work, we provide participants with SemRel (Ousidhoum et al., 2024), a collection of 14 newly curated monolingual STR datasets for Afrikaans (afr), Amharic (amh), Modern Standard Arabic (arb), Algerian Arabic (arq), Moroccan Arabic (ary), English (eng), Spanish (esp), Hausa (hau), Hindi (hin), Indonesian (ind), Kinyarwanda (kin), Marathi (mar), Punjabi (pun) and Telugu (tel)."