
Accurate Sorani Kurdish Subdialect Recognition through Deep Learning Models


Core Concepts
The article sets out to build a dataset and deep learning models that accurately recognize and classify Sorani Kurdish subdialects.
Summary
The article describes how the authors built a dataset and developed deep learning models for Sorani Kurdish subdialect recognition. Key highlights:

The researchers conducted field visits to collect speech from 107 speakers across 6 Sorani Kurdish subdialects (Garmiani, Hewleri, Karkuki, Pishdari, Sulaimani, Khoshnawi), yielding 29 hours, 16 minutes and 40 seconds of audio. The resulting dataset, named Sorani Nas, was preprocessed and cut into segments of fixed duration (1s, 3s, 5s, 10s, 30s) so that model performance could be compared across segment lengths.

Three deep learning models were adapted and tested extensively: an Artificial Neural Network (ANN), a Convolutional Neural Network (CNN), and a Recurrent Neural Network with Long Short-Term Memory (RNN-LSTM). The experiments varied the dataset splitting ratio and addressed class imbalance with oversampling and undersampling.

The RNN-LSTM model achieved the highest accuracy, 96%, on the oversampled 5-second segment dataset with an 80:10:10 split. The CNN also performed well, reaching 93% accuracy. The study highlights the challenges of Sorani Kurdish subdialect recognition and shows that deep learning can classify the subdialects accurately. Future research directions include covering additional Kurdish dialects and further improving the models.
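The 80:10:10 split and the oversampling step mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy clip identifiers, and the class sizes are hypothetical, and the summary does not say whether oversampling was applied before or after splitting. The sketch oversamples only the training portion, a common precaution that keeps duplicated clips out of the validation and test sets.

```python
import random
from collections import defaultdict


def stratified_split(items, labels, percentages=(80, 10, 10), seed=0):
    """Split items into train/val/test so each class keeps roughly the same share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in zip(items, labels):
        by_class[label].append(item)

    train, val, test = [], [], []
    for label, group in by_class.items():
        rng.shuffle(group)
        n_train = len(group) * percentages[0] // 100
        n_val = len(group) * percentages[1] // 100
        train += [(x, label) for x in group[:n_train]]
        val += [(x, label) for x in group[n_train:n_train + n_val]]
        test += [(x, label) for x in group[n_train + n_val:]]  # remainder goes to test
    return train, val, test


def random_oversample(pairs, seed=0):
    """Duplicate randomly chosen examples until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in pairs:
        by_class[label].append(item)
    target = max(len(group) for group in by_class.values())

    balanced = []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        balanced += [(x, label) for x in group + extra]
    rng.shuffle(balanced)
    return balanced


# Toy, made-up data: 100 clips of one subdialect and 20 of another.
clips = [f"sulaimani_{i}" for i in range(100)] + [f"pishdari_{i}" for i in range(20)]
labels = ["Sulaimani"] * 100 + ["Pishdari"] * 20

train, val, test = stratified_split(clips, labels)
train_balanced = random_oversample(train)  # only the training set is balanced
```

Undersampling would be the mirror image: instead of duplicating minority examples up to the largest class, randomly discard majority examples down to the smallest one.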
Stats
The Sorani Nas dataset contains 29 hours, 16 minutes and 40 seconds of audio recordings from 107 speakers across 6 Sorani Kurdish subdialects. The dataset was segmented into different durations (1s, 3s, 5s, 10s, 30s) to evaluate model performance.
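As a rough illustration of what fixed-duration segmentation involves, the sketch below cuts a sequence of audio samples into non-overlapping segments and derives an upper bound on the number of segments from the reported total duration. The helper names are hypothetical, and dropping a trailing partial segment is an assumption; the summary does not describe how the authors handled remainders, overlap, or silence.

```python
def total_seconds(hours, minutes, seconds):
    """Convert an hours/minutes/seconds duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def segment_samples(samples, sample_rate, segment_seconds):
    """Split audio samples into non-overlapping segments of fixed duration.

    A trailing remainder shorter than one full segment is dropped
    (an assumption made for this sketch).
    """
    if sample_rate <= 0 or segment_seconds <= 0:
        raise ValueError("sample_rate and segment_seconds must be positive")
    step = sample_rate * segment_seconds
    usable = len(samples) - len(samples) % step
    return [samples[i:i + step] for i in range(0, usable, step)]


# Reported corpus length: 29 h 16 min 40 s.
corpus_seconds = total_seconds(29, 16, 40)

# Upper bound on segment counts per duration; the real counts are lower
# whenever individual recordings leave a remainder or are trimmed.
max_segments = {d: corpus_seconds // d for d in (1, 3, 5, 10, 30)}
```

The reported duration works out to 105,400 seconds, so there can be at most 21,080 five-second segments if no audio is discarded.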
Quotes
"Creating an audio dataset is critical for training and assessing machine learning models for subdialect categorisation."
"The RNN-LSTM model achieved the highest accuracy of 96% on the oversampled 5-second segment dataset with an 80:10:10 split."

Further Questions

What other Kurdish dialects could future research explore to expand the scope of dialect recognition?

Future research could extend recognition beyond Sorani to other Kurdish varieties. Kurmanji (Northern Kurdish), spoken in parts of Turkey, Syria, Iraq, Iran, and Armenia, is an obvious candidate. Zazaki, spoken mainly in eastern Turkey, and Gorani, spoken in parts of Iraq and Iran, could also be considered, although linguists disagree on whether these two are Kurdish dialects or separate languages. Covering these varieties would give a more complete picture of Kurdish linguistic diversity.

How can the deep learning models be improved to achieve higher accuracy and robustness in Sorani Kurdish subdialect classification?

Several strategies could improve the deep learning models for Sorani Kurdish subdialect classification. First, a larger and more diverse training dataset would help the models generalize across speakers and recording conditions. Second, tuning hyperparameters such as learning rate, batch size, and network architecture can improve performance. Third, transfer learning from pre-trained models and ensembling several models may raise accuracy and robustness. Throughout, the models should be evaluated on data they have not seen during training, so that reported accuracy reflects how they would behave in real-world use.
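Of the ideas above, ensembling is the simplest to sketch. The example below shows soft voting, one common ensembling scheme: the class probabilities produced by several models are averaged, and the class with the highest average wins. The article does not report using an ensemble, the `soft_vote` helper is hypothetical, and the probability values are invented purely for illustration.

```python
SUBDIALECTS = ["Garmiani", "Hewleri", "Karkuki", "Pishdari", "Sulaimani", "Khoshnawi"]


def soft_vote(prob_vectors, weights=None):
    """Average per-model class probabilities; return (label, averaged probabilities)."""
    if not prob_vectors:
        raise ValueError("need at least one probability vector")
    if weights is None:
        weights = [1.0] * len(prob_vectors)  # equal say for every model
    total = sum(weights)
    averaged = [
        sum(w * probs[i] for w, probs in zip(weights, prob_vectors)) / total
        for i in range(len(SUBDIALECTS))
    ]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return SUBDIALECTS[best], averaged


# Invented outputs of three models for one audio segment (each sums to 1).
ann_probs = [0.10, 0.40, 0.10, 0.10, 0.20, 0.10]
cnn_probs = [0.05, 0.30, 0.05, 0.05, 0.50, 0.05]
lstm_probs = [0.05, 0.25, 0.05, 0.05, 0.55, 0.05]

label, averaged = soft_vote([ann_probs, cnn_probs, lstm_probs])
```

In this made-up case the ANN alone would pick Hewleri, but the averaged vote picks Sulaimani. The optional `weights` argument lets a stronger model count for more than a weaker one.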

How can the collected speech data be used to build practical applications, such as voice-based assistants or language learning tools, that benefit Sorani Kurdish speakers?

The collected speech data could support several practical applications for Sorani Kurdish speakers. One is a voice-based assistant that understands and responds to commands in Sorani Kurdish, giving users hands-free access to information, reminders, and assistance. Another is language learning tools that help learners improve their pronunciation, vocabulary, and conversational skills through interactive exercises, quizzes, and feedback. The data could also feed speech-to-text and text-to-speech systems that make digital platforms and devices more accessible to Sorani Kurdish speakers.