The article presents a comprehensive approach to creating a dataset and developing deep learning models for Sorani Kurdish subdialect recognition. Key highlights:
The researchers conducted field visits to collect speech data from 107 speakers across 6 Sorani Kurdish subdialects (Garmiani, Hewleri, Karkuki, Pishdari, Sulaimani, Khoshnawi), resulting in 29 hours, 16 minutes and 40 seconds of audio recordings.
The dataset, named Sorani Nas, was preprocessed and segmented into different durations (1s, 3s, 5s, 10s, 30s) to evaluate model performance.
Three deep learning models were adapted and evaluated extensively: Artificial Neural Network (ANN), Convolutional Neural Network (CNN), and Recurrent Neural Network with Long Short-Term Memory (RNN-LSTM).
The experiments explored various configurations, including different dataset splitting ratios and the handling of class imbalance through oversampling and undersampling.
The RNN-LSTM model achieved the highest accuracy of 96% on the oversampled 5-second segment dataset with an 80:10:10 split. CNN also performed well, reaching 93% accuracy.
The study highlights the challenges in Sorani Kurdish subdialect recognition and demonstrates the effectiveness of deep learning approaches in accurately classifying the subdialects.
Future research directions include exploring additional Kurdish dialects and further improving the models.
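The data-preparation steps named in the highlights (fixed-length segmentation, an 80:10:10 split, and oversampling of minority classes) can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline. The function names, the 16 kHz sample rate, the choice to drop a trailing partial segment, and the choice to oversample by random duplication are all assumptions made for illustration; this summary does not say how the paper implemented any of these steps.

```python
import random
from collections import defaultdict


def segment_audio(samples, sample_rate, segment_seconds):
    """Cut a sample sequence into non-overlapping fixed-length segments.

    Assumption: a trailing remainder shorter than one segment is dropped.
    """
    seg_len = int(sample_rate * segment_seconds)
    last_start = len(samples) - seg_len
    return [samples[s:s + seg_len] for s in range(0, last_start + 1, seg_len)]


def split_indices(n_items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle item indices and split them into train/validation/test parts."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    n_train = round(ratios[0] * n_items)
    n_val = round(ratios[1] * n_items)
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])


def random_oversample(items, labels, seed=0):
    """Duplicate randomly chosen minority-class items until every class
    has as many items as the largest class (hypothetical strategy)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in zip(items, labels):
        by_class[label].append(item)
    target = max(len(group) for group in by_class.values())
    out_items, out_labels = [], []
    for label, group in by_class.items():
        extra = rng.choices(group, k=target - len(group))
        out_items.extend(group + extra)
        out_labels.extend([label] * target)
    return out_items, out_labels


# Usage on synthetic data: a 12-second silent recording at an assumed 16 kHz
# yields two full 5-second segments; the 2-second remainder is dropped.
recording = [0.0] * (16000 * 12)
segments = segment_audio(recording, sample_rate=16000, segment_seconds=5)
train_idx, val_idx, test_idx = split_indices(100)
```

One design choice worth stating: applying `random_oversample` only to the training portion keeps duplicated items out of the validation and test sets. Whether the paper oversampled before or after splitting is not stated in this summary.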
Key Insights Distilled From
by Sana Isam,Ho... at arxiv.org 04-02-2024
https://arxiv.org/pdf/2404.00124.pdf