The study addresses fake review detection in Bengali, an under-explored research area. The key highlights are:
Creation of the BFRD dataset: The authors collected 9,049 food-related reviews in Bengali from social media platforms, of which 1,339 were annotated as fake and 7,710 as non-fake by expert annotators. This is the first publicly available dataset for Bengali fake review detection.
Text conversion pipeline: The authors developed a unique pipeline that translates English words to their Bengali equivalents and back-transliterates Romanized Bengali to Bengali, to handle the code-mixed nature of the reviews.
Text augmentation: To address the class imbalance problem, the authors utilized text augmentation techniques such as token replacement, back-translation, and paraphrasing to increase the number of fake review instances.
Ensemble model: The authors proposed a weighted ensemble model that combines four pre-trained Bengali language models: BanglaBERT Base, BanglaBERT, BanglaBERT Large, and BanglaBERT Generator. This ensemble approach outperformed individual models and other deep learning techniques.
Extensive experimentation and analysis: The authors conducted rigorous experiments to compare the performance of various deep learning and transformer-based models. They also employed the LIME text explainer framework to provide explanations for the model's predictions and analyzed the misclassification categories.
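The weighted ensemble mentioned above can be illustrated with a minimal sketch of weighted soft voting: each model outputs class probabilities for a review, and the ensemble averages them using per-model weights. This summary does not state how the authors chose or applied their weights, so the function name `weighted_ensemble`, the weights, and the probability values below are all hypothetical.

```python
from typing import List, Sequence, Tuple


def weighted_ensemble(
    probs_per_model: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[int, List[float]]:
    """Combine per-model class probabilities by a weighted average.

    probs_per_model[i][c] is model i's probability for class c.
    weights[i] is the (hypothetical) importance assigned to model i.
    Returns the predicted class index and the combined probabilities.
    """
    if not probs_per_model or len(probs_per_model) != len(weights):
        raise ValueError("need one weight per model and at least one model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(probs_per_model[0])
    combined = [0.0] * n_classes
    for probs, weight in zip(probs_per_model, weights):
        if len(probs) != n_classes:
            raise ValueError("all models must score the same classes")
        for c, p in enumerate(probs):
            # Normalise weights so the result is still a distribution.
            combined[c] += (weight / total) * p

    label = max(range(n_classes), key=combined.__getitem__)
    return label, combined


# Illustrative values only: four models, classes = [non-fake, fake].
example_probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8], [0.55, 0.45]]
example_weights = [1.0, 1.0, 2.0, 1.0]  # assumed, not from the paper
label, combined = weighted_ensemble(example_probs, example_weights)
```

In this illustration the third model carries twice the weight of the others, so its strong vote for the fake class dominates the two models that lean the other way. One plausible way to pick such weights, again an assumption rather than the authors' stated method, is to use each model's validation score.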
The proposed ensemble model achieved a weighted F1-score of 0.9843 on the BFRD dataset, demonstrating its effectiveness in detecting fake Bengali reviews.
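For readers unfamiliar with the metric, the weighted F1-score averages the per-class F1 values, weighting each class by its number of true instances (its support). The sketch below shows the standard definition in plain Python; the function name `weighted_f1` and the toy labels are illustrative and do not come from the paper.

```python
from collections import Counter
from typing import Hashable, Sequence


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Support-weighted average of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each class contributes in proportion to its support.
        score += (n / total) * f1
    return score


# Toy labels only: 1 = fake, 0 = non-fake.
toy_true = [1, 1, 0, 0, 0, 0]
toy_pred = [1, 0, 0, 0, 0, 1]
toy_score = weighted_f1(toy_true, toy_pred)
```

Because the average is weighted by support, the majority class has more influence on the final number than the minority class, which is worth keeping in mind when reading a single score on an imbalanced dataset.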
Key insights from the paper by G. M. Shahar... at arxiv.org, 05-07-2024
https://arxiv.org/pdf/2308.01987.pdf