The paper introduces MR-MT3, an enhancement of the MT3 model for multi-instrument automatic music transcription. To address instrument leakage, it proposes three techniques: a memory retention mechanism, prior token sampling, and token shuffling. Evaluated on the Slakh2100 dataset, these enhancements improve onset F1 scores and reduce instrument leakage. The study also introduces new metrics, such as the instrument leakage ratio and the instrument detection F1 score, for a more comprehensive assessment. The proposed methods aim to maintain musical context across audio segments, which improves transcription quality.
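The summary names the instrument detection F1 score without defining it. As a rough illustration only, the sketch below assumes one plausible reading: an F1 score computed over the set of instruments present in the reference transcription versus the set present in the predicted one. The function name `instrument_detection_f1` and the string instrument labels are hypothetical, and the paper's exact formulation may differ.

```python
from typing import Iterable


def instrument_detection_f1(reference: Iterable[str], predicted: Iterable[str]) -> float:
    """Set-based F1 between reference and predicted instrument labels.

    Assumption: an instrument counts as "detected" if it appears at least
    once in the transcription. This is an illustrative reading, not
    necessarily the definition used in the paper.
    """
    ref, pred = set(reference), set(predicted)
    if not ref and not pred:
        return 1.0  # nothing to detect and nothing predicted
    true_positives = len(ref & pred)
    if true_positives == 0:
        return 0.0  # no overlap (also covers an empty ref or pred set)
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: "guitar" is a leaked instrument, "drums" is missed.
score = instrument_detection_f1(
    ["piano", "bass", "drums"],
    ["piano", "bass", "guitar"],
)
```

Under this reading, a spurious instrument in the prediction lowers precision, so the score penalizes transcriptions that assign notes to instruments absent from the reference.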
Key insights drawn from a paper by Hao Hao Tan,... at arxiv.org, 03-18-2024
https://arxiv.org/pdf/2403.10024.pdf