The uaMix-MAE strategy introduces an efficient Instance Discrimination (ID) tuning approach that leverages unsupervised audio mixtures to align the representations of pretrained Masked Autoencoders (MAEs) with task-specific semantics. By combining ID and MAEs, uaMix-MAE addresses downstream tasks with limited labeled data. The method optimizes the model with contrastive tuning and proposes an audio-mixing technique that manipulates audio samples in both the input and virtual label spaces. Experimental results show that uaMix-MAE achieves significant accuracy improvements across various benchmarks in low/few-shot settings.
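The core idea of mixing samples in the input space while carrying the same mixing coefficients into a virtual label space for contrastive tuning can be illustrated with a minimal sketch. This is not the authors' implementation: the `encoder`, the `mixup_audio` helper, the Beta parameter `alpha`, and the InfoNCE-style soft-target loss are assumptions for illustration, assuming a PyTorch encoder that maps a batch of audio clips to embeddings.

```python
import torch
import torch.nn.functional as F

def mixup_audio(x, alpha=0.5):
    """Hypothetical helper: mix each clip in the batch with a shuffled partner.

    Returns the mixed batch, the partner indices, and the mixing
    coefficients, which later act as soft "virtual labels".
    """
    lam = torch.distributions.Beta(alpha, alpha).sample((x.size(0),)).to(x.device)
    perm = torch.randperm(x.size(0), device=x.device)
    lam_x = lam.view(-1, *([1] * (x.dim() - 1)))      # broadcast over audio dims
    return lam_x * x + (1 - lam_x) * x[perm], perm, lam

def contrastive_tuning_loss(encoder, x, temperature=0.1):
    """InfoNCE-style instance-discrimination loss on mixed inputs (a sketch).

    Each mixed clip is pulled toward the embeddings of both of its source
    clips, weighted by the mixing coefficient (the virtual label space).
    """
    x_mix, perm, lam = mixup_audio(x)
    z_clean = F.normalize(encoder(x), dim=-1)         # anchors for original clips
    z_mix = F.normalize(encoder(x_mix), dim=-1)       # embeddings of mixed clips
    logits = z_mix @ z_clean.t() / temperature        # similarity to every clean clip

    # Soft targets: weight lam on the clip itself, (1 - lam) on its mixing partner.
    targets = torch.zeros_like(logits)
    idx = torch.arange(x.size(0), device=x.device)
    targets[idx, idx] = lam
    targets[idx, perm] = targets[idx, perm] + (1 - lam)

    return -(targets * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
```

The soft targets are what distinguishes this from plain instance discrimination: instead of a single positive per anchor, the mixing coefficients define a weighted positive set, which is one way to realize "manipulating samples in both input and virtual label spaces".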
Key insights distilled from: by Afrina Tabas... at arxiv.org, 03-15-2024
https://arxiv.org/pdf/2403.09579.pdf