The paper introduces a deep learning-based approach to automatically predict the track role of single-instrumental music sequences. The authors explored both the symbolic (MIDI) and audio domains, utilizing fine-tuned pre-trained models for the task.
For the symbolic domain, the authors fine-tuned the MusicBERT model, which was pre-trained on large MIDI datasets. For the audio domain, they fine-tuned the PANNs model, which was pre-trained on the AudioSet dataset.
In the evaluations, the fine-tuned models outperformed their from-scratch counterparts, reaching prediction accuracies of 87% in the symbolic domain and 84% in the audio domain. The authors noted that the models had the most difficulty distinguishing the Main Melody class from the Sub Melody class and correctly identifying the Riff class.
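One way to surface this kind of class-level weakness is to look at per-class recall and at the individual confusion pairs rather than at overall accuracy alone. The following is a minimal sketch, not the authors' evaluation code: the helper names `confusion_counts` and `per_class_recall` are hypothetical, the label strings are illustrative spellings of the three classes named above, and the sample labels are made up and are not results from the paper.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    return Counter(zip(true_labels, predicted_labels))


def per_class_recall(true_labels, predicted_labels):
    """Return the fraction of each true class that was predicted correctly."""
    counts = confusion_counts(true_labels, predicted_labels)
    totals = Counter(true_labels)
    return {label: counts[(label, label)] / totals[label] for label in totals}


# Made-up labels for illustration only (assumption, not data from the paper).
y_true = ["main_melody", "main_melody", "sub_melody", "sub_melody", "riff", "riff"]
y_pred = ["main_melody", "sub_melody", "sub_melody", "main_melody", "riff", "sub_melody"]

recall = per_class_recall(y_true, y_pred)
confusions = confusion_counts(y_true, y_pred)
```

Here `confusions[("main_melody", "sub_melody")]` gives the number of Main Melody sequences that were labeled Sub Melody, which is the kind of pairwise count that makes a specific confusion visible.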
The authors highlighted potential applications of automatically predicted track roles, such as efficient sample search and management and AI-assisted music composition. They also suggested exploring learning strategies such as curriculum learning to further improve performance, especially on the more difficult track role distinctions.
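Curriculum learning, in its general form, presents easier training examples first and adds harder ones over time. The sketch below illustrates only that scheduling idea under stated assumptions; the summary does not say how the authors would define difficulty or stage the data. The function `curriculum_stages`, the `role_difficulty` score, the `HARD_ROLES` set, the `other_role` placeholder label, and the clip names are all hypothetical.

```python
def curriculum_stages(examples, difficulty, num_stages):
    """Yield growing training subsets, easiest examples first."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    # sorted() is stable, so examples with equal difficulty keep their order.
    ordered = sorted(examples, key=difficulty)
    for stage in range(1, num_stages + 1):
        # Ceiling division so the final stage always covers every example.
        cutoff = -(-len(ordered) * stage // num_stages)
        yield ordered[:cutoff]


# Assumption: treat the roles the summary flags as confusable as "hard".
HARD_ROLES = {"main_melody", "sub_melody", "riff"}


def role_difficulty(example):
    """Hypothetical difficulty score: 1 for a hard role, 0 otherwise."""
    _, role = example
    return 1 if role in HARD_ROLES else 0


# Made-up (clip, role) pairs for illustration only.
examples = [
    ("clip_a", "riff"),
    ("clip_b", "other_role"),
    ("clip_c", "main_melody"),
    ("clip_d", "other_role"),
]
stages = list(curriculum_stages(examples, role_difficulty, num_stages=2))
```

In this toy schedule the first stage contains only the two `other_role` clips and the second stage contains all four, so a training loop would see the confusable roles only after the easier ones.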
Key Insights Distilled From
by Changheon Ha... at arxiv.org, 04-23-2024
https://arxiv.org/pdf/2404.13286.pdf