The uaMix-MAE strategy introduces an efficient Instance Discrimination (ID) tuning approach that leverages unsupervised audio mixtures to align the representations of pretrained Masked Autoencoders (MAEs) with task-specific semantics. By combining ID and MAEs, uaMix-MAE addresses downstream tasks with limited labeled data. The method optimizes the model with contrastive tuning and proposes an audio mixing technique that manipulates audio samples in both the input and virtual label spaces. Experimental results show that uaMix-MAE achieves significant accuracy improvements on various benchmarks in low/few-shot settings.
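The idea of mixing audio samples in both the input and virtual label spaces can be illustrated with a generic mixup-style interpolation. This is a hedged sketch only: the function name `mix_audio`, the beta-distributed mixing coefficient, and the label interpolation are standard mixup conventions assumed here, not the paper's exact algorithm.

```python
import numpy as np

def mix_audio(x1, x2, y1, y2, alpha=0.5, rng=None):
    """Mixup-style interpolation of two audio clips and their
    (virtual) label vectors. Hypothetical sketch; uaMix-MAE's
    actual mixing strategy may differ in detail."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)            # mixing coefficient in [0, 1]
    x_mix = lam * x1 + (1 - lam) * x2       # mix in the input space
    y_mix = lam * y1 + (1 - lam) * y2       # mix in the virtual label space
    return x_mix, y_mix, lam

# Example: mix two short waveforms with one-hot virtual labels
x1, x2 = np.ones(8), np.zeros(8)
y1, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
x_mix, y_mix, lam = mix_audio(x1, x2, y1, y2, rng=np.random.default_rng(0))
```

The mixed pairs can then serve as positives/negatives in a contrastive (Instance Discrimination) objective during tuning.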
by Afrina Tabas... at arxiv.org, 03-15-2024
https://arxiv.org/pdf/2403.09579.pdf