Tango 2: Improving Text-to-Audio Generation through Direct Preference Optimization
Tango 2, a text-to-audio generation model, outperforms existing models such as Tango and AudioLDM2 by applying direct preference optimization (DPO) to a synthetically created preference dataset, Audio-alpaca.
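
To make the DPO idea concrete, the following is a minimal sketch of the generic DPO loss for a single preference pair (a preferred "winner" and a dispreferred "loser" output for the same prompt). It is an illustration under stated assumptions, not Tango 2's training code: the function name dpo_loss, its parameter names, and the default beta are hypothetical, and a diffusion model such as Tango 2 would likely need a diffusion-specific variant of this objective rather than exact log-likelihoods.

```python
import math


def dpo_loss(policy_logp_win, policy_logp_lose,
             ref_logp_win, ref_logp_lose, beta=0.1):
    """Generic DPO loss for one preference pair (illustrative sketch).

    Each argument is a log-probability of the winner or loser output under
    the trainable policy or the frozen reference model. `beta` scales how
    strongly the policy is pushed away from the reference (assumed value).
    """
    # How much more the policy favors the winner over the loser,
    # measured relative to the reference model.
    margin = (policy_logp_win - ref_logp_win) - (policy_logp_lose - ref_logp_lose)
    z = beta * margin
    # -log(sigmoid(z)), computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy matches the reference, the margin is zero and the loss equals log 2; the loss falls below that value as the policy assigns relatively more probability to the preferred output, and rises above it when the policy favors the dispreferred one.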