The paper introduces MR-MT3, an enhancement of the MT3 model for multi-instrument automatic music transcription. To address instrument leakage, it proposes three techniques: a memory retention mechanism, prior token sampling, and token shuffling. Evaluated on the Slakh2100 dataset, these techniques improve onset F1 scores and reduce instrument leakage. The study also introduces new metrics, including the instrument leakage ratio and the instrument detection F1 score, for a more comprehensive assessment. The proposed methods aim to maintain musical context across audio segments, improving transcription quality.
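To make the instrument detection F1 score more concrete, the sketch below computes a set-based F1 between the instruments present in a reference transcription and those present in a predicted one. This summary does not give the paper's exact formula, so the set-based definition is an assumption, and the function name `instrument_detection_f1` and the instrument labels are hypothetical.

```python
def instrument_detection_f1(reference, predicted):
    """Set-based F1 over instrument labels.

    Assumption: an instrument counts as "detected" if its label appears
    anywhere in the predicted transcription. The paper's exact definition
    may differ.
    """
    ref, pred = set(reference), set(predicted)
    if not ref and not pred:
        return 1.0  # nothing to detect and nothing predicted
    true_positives = len(ref & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: two extra instruments appear in the prediction,
# and one reference instrument is missed.
reference_instruments = ["piano", "bass", "drums"]
predicted_instruments = ["piano", "bass", "guitar", "strings"]
score = instrument_detection_f1(reference_instruments, predicted_instruments)
```

Under this definition, instruments that appear in the prediction but not in the reference lower precision, which is one way a leakage-like error would show up in the score.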
Key insights extracted from the paper by Hao Hao Tan,... at arxiv.org, 03-18-2024
https://arxiv.org/pdf/2403.10024.pdf