Core Concepts
The authors developed two models, TranSem and FineSem, to predict the degree of semantic textual relatedness between sentence pairs in 14 African and Asian languages, exploring the effectiveness of machine translation and of different training strategies.
Summary
The authors participated in the SemEval-2024 Task 1 on Semantic Textual Relatedness for African and Asian Languages. They developed two models, TranSem and FineSem, to address the task:
TranSem Model:
- Uses a Siamese network architecture to encode each sentence of a pair and trains with a cosine similarity loss, so that the cosine of the two embeddings matches the gold semantic relatedness score (see the sketch after this list).
- Experiments with various sentence encoding models, including DistilRoBERTa, and finds that mean pooling works well.
- Explores the usefulness of machine translation by translating the training data to English using multiple translation models.
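As a concrete illustration of this setup, here is a minimal sketch using the sentence-transformers library; the encoder checkpoint, toy data, and hyperparameters are illustrative assumptions rather than the authors' exact configuration (the batch size of 32 matches the statistic reported below).

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# A DistilRoBERTa-based sentence encoder with mean pooling built in (assumed checkpoint).
model = SentenceTransformer("all-distilroberta-v1")

# Each training example pairs two sentences with a gold relatedness score in [0, 1].
train_examples = [
    InputExample(texts=["A man is eating.", "Someone is having a meal."], label=0.9),
    InputExample(texts=["A man is eating.", "A plane is landing."], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)

# The Siamese objective: push cos(emb1, emb2) toward the gold score.
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
```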
FineSem Model:
- Fine-tunes T5 models on the semantic textual similarity (STS) task, using both the untranslated and the translated training data (see the sketch after this list).
- Compares the performance of individual T5 models fine-tuned on each language, a unified T5 model trained on all languages, and a T5 model trained on the translated and augmented data.
- Finds that direct fine-tuning on the translated and augmented data performs comparably to the TranSem model with its various sentence embeddings.
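As a rough illustration, the sketch below fine-tunes T5 on a single STS example in T5's native text-to-text format, using the Hugging Face transformers library; the checkpoint, prompt format, score scaling, and learning rate are assumptions rather than the paper's exact setup.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# T5 was pretrained on STS-B in this text-to-text form, with the
# similarity score emitted as a string on a 0-5 scale.
src = "stsb sentence1: A man is eating. sentence2: Someone is having a meal."
tgt = "4.6"  # relatedness score rescaled to the STS-B range (assumed)

inputs = tokenizer(src, return_tensors="pt")
labels = tokenizer(tgt, return_tensors="pt").input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss = model(**inputs, labels=labels).loss  # standard seq2seq cross-entropy
loss.backward()
optimizer.step()
```

In practice this loop runs over the full (translated or untranslated) training set for several epochs, e.g. with the batch size of 16 reported below.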
For the cross-lingual Track C, the authors use the T5 models fine-tuned on the English and Spanish datasets to score sentence pairs in the other languages.
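A hedged sketch of this zero-shot evaluation, assuming the same text-to-text scoring format and using Spearman correlation as the metric; the checkpoint name and toy data stand in for the fine-tuned models and the Track C test sets.

```python
from scipy.stats import spearmanr
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Stand-in for a T5 checkpoint fine-tuned on the English/Spanish data.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def predict_score(s1: str, s2: str) -> float:
    """Score a sentence pair via T5's text-to-text STS format."""
    prompt = f"stsb sentence1: {s1} sentence2: {s2}"
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=5)
    text = tokenizer.decode(out[0], skip_special_tokens=True)
    try:
        return float(text)
    except ValueError:
        return 0.0  # decoded text was not a parseable number

# Toy pairs; in practice these come from the unseen-language test sets.
test_pairs = [("A man is eating.", "Someone is having a meal."),
              ("A man is eating.", "A plane is landing.")]
gold_scores = [0.9, 0.1]

preds = [predict_score(s1, s2) for s1, s2 in test_pairs]
rho, _ = spearmanr(preds, gold_scores)
print("Spearman:", rho)
```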
The authors' models outperform the official baseline for some languages in both the supervised and cross-lingual settings. They also explore the effectiveness of machine translation and find that it can lead to better performance for certain languages.
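The translation step referenced above can be sketched as follows, assuming the NLLB checkpoint named here; the paper uses multiple translation models, so this particular model and language code are illustrative only.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-600M"  # assumed; the paper tries several MT models
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="amh_Ethi")  # e.g. Amharic source
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def to_english(text: str) -> str:
    """Translate one training sentence into English."""
    inputs = tokenizer(text, return_tensors="pt")
    out = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
        max_new_tokens=64,
    )
    return tokenizer.decode(out[0], skip_special_tokens=True)
```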
Statistics
The authors used a batch size of 32 for the TranSem model and 16 for the FineSem model.
Mean pooling performed better than max pooling and CLS token pooling for the TranSem model.
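For reference, the three pooling strategies compared here can be written compactly as below; this is an illustrative implementation, not the authors' code. Mean pooling averages only the real (non-padding) tokens, which is the variant reported to work best.

```python
import torch

def pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor, strategy: str) -> torch.Tensor:
    """token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len)."""
    mask = attention_mask.unsqueeze(-1).float()
    if strategy == "cls":
        return token_embeddings[:, 0]  # embedding of the first ([CLS]/<s>) token
    if strategy == "max":
        masked = token_embeddings.masked_fill(mask == 0, float("-inf"))
        return masked.max(dim=1).values  # elementwise max over real tokens
    if strategy == "mean":
        summed = (token_embeddings * mask).sum(dim=1)
        return summed / mask.sum(dim=1).clamp(min=1e-9)  # average of real tokens only
    raise ValueError(f"unknown strategy: {strategy}")
```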
The FineSem model trained on the translated and augmented data performed comparably to the TranSem model using various sentence embeddings.