핵심 개념
The author explores the effectiveness of cross-lingual transfer learning for fact-checking in low-resource languages, focusing on Turkish, to address the scarcity of datasets and advance research in non-English languages.
초록
The study introduces a new Turkish fact-checking dataset, FCTR, with 3238 claims from three Turkish fact-checking organizations. It evaluates the performance of large language models through zero-shot and few-shot learning approaches. The results show that fine-tuning models with Turkish data yields superior results compared to zero-shot and few-shot approaches.
The rapid spread of misinformation on social media platforms has raised concerns about its impact on public opinion. Automated fact-checking methods aim to assess the truthfulness of claims while reducing human intervention. Cross-lingual transfer learning is explored as a solution for building fact-checking systems in low-resource languages.
Datasets for fact-checking have emerged primarily in English, creating an imbalance between languages. Leveraging large datasets in English and cross-lingual transfer learning can help build fact-checking systems for other languages efficiently. The study highlights the importance of utilizing native data for successful model development.
Key metrics or figures:
- FCTR dataset consists of 3238 real-world claims.
- Llama models exhibit superior performance compared to mBERT and SVM models.
- Llama models achieve higher F1-macro scores when provided with evidence statements.
- Few-shot learning slightly improves model performance over zero-shot learning.
- Fine-tuning with Turkish data yields better results than zero-shot or few-shot approaches.
통계
The FCTR dataset consists of 3238 real-world claims.
Llama models exhibit superior performance compared to mBERT and SVM models.
Llama models achieve higher F1-macro scores when provided with evidence statements.
Few-shot learning slightly improves model performance over zero-shot learning.
Fine-tuning with Turkish data yields better results than zero-shot or few-shot approaches.
인용구
"The experimental results indicate that the dataset has the potential to advance research in the Turkish language."
"Automated methods for fact-checking have emerged to assess the truthfulness of claims while reducing human intervention."
"The study highlights the importance of utilizing native data for successful model development."