The paper introduces MR-MT3, an enhancement of the MT3 model for multi-instrument automatic music transcription. It addresses instrument leakage with three techniques: a memory retention mechanism, prior token sampling, and token shuffling. Evaluated on the Slakh2100 dataset, these techniques improve onset F1 scores and reduce instrument leakage. The study also introduces new metrics, such as the instrument leakage ratio and the instrument detection F1 score, for a more comprehensive assessment. The proposed methods aim to maintain musical context across audio segments, improving transcription quality.
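To make the two metrics named above concrete, here is a minimal sketch in Python. The summary does not give their formulas, so the definitions are assumptions: the instrument detection F1 is taken to be a set-based F1 between the instruments present in the reference and those in the prediction, and the instrument leakage ratio is taken to be the number of predicted instruments divided by the number of reference instruments. The function names and the example labels are hypothetical and are not taken from the paper's code.

```python
def instrument_detection_f1(reference, predicted):
    """Set-based F1 over instrument labels (assumed definition)."""
    ref, pred = set(reference), set(predicted)
    if not ref and not pred:
        return 1.0  # nothing to detect and nothing predicted
    true_positives = len(ref & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def instrument_leakage_ratio(reference, predicted):
    """Predicted instrument count over reference instrument count
    (assumed definition); values above 1 suggest leakage."""
    ref, pred = set(reference), set(predicted)
    if not ref:
        raise ValueError("reference must contain at least one instrument")
    return len(pred) / len(ref)


# Hypothetical example: a two-instrument track transcribed as four instruments.
reference = ["piano", "bass"]
predicted = ["piano", "bass", "guitar", "organ"]
f1 = instrument_detection_f1(reference, predicted)
ratio = instrument_leakage_ratio(reference, predicted)
```

Under these assumed definitions, a model that spreads a piano part across extra instruments keeps full recall but loses precision, so the detection F1 drops while the leakage ratio rises above 1.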
Key Insights Extracted From
by Hao Hao Tan,... at arxiv.org, 03-18-2024
https://arxiv.org/pdf/2403.10024.pdf
Deeper Questions