Efficient Adaptation of Text-to-Speech Models to New Speakers Using Hypernetworks
HYPERTTS is a parameter-efficient approach for adapting text-to-speech models to new speakers. It uses a hypernetwork to dynamically generate adapter parameters conditioned on speaker representations, outperforming static adapter-based methods while achieving performance comparable to full fine-tuning.
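To make the core idea concrete, the following is a minimal sketch of a hypernetwork that emits the weights of a bottleneck adapter from a speaker embedding. It is an illustration under stated assumptions, not the authors' architecture: the class name `HyperAdapter`, the single linear hypernetwork, the ReLU bottleneck, the residual connection, and all dimensions are hypothetical choices made for the example. A static adapter would instead store one fixed pair of down- and up-projection matrices shared by every speaker.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small Gaussian-initialized matrix as a list of rows."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class HyperAdapter:
    """Hypothetical sketch: a linear hypernetwork producing adapter weights."""

    def __init__(self, d_model, d_speaker, bottleneck, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.bottleneck = bottleneck
        n_params = 2 * bottleneck * d_model  # down- plus up-projection
        # Assumption: these are the only trainable weights; the TTS
        # backbone that produces `hidden` would stay frozen.
        self.H = rand_matrix(n_params, d_speaker, rng)

    def generate(self, speaker_emb):
        """Map a speaker embedding to (W_down, W_up) adapter matrices."""
        flat = matvec(self.H, speaker_emb)
        n_down = self.bottleneck * self.d_model
        down, up = flat[:n_down], flat[n_down:]
        d, r = self.d_model, self.bottleneck
        W_down = [down[i * d:(i + 1) * d] for i in range(r)]  # r x d
        W_up = [up[i * r:(i + 1) * r] for i in range(d)]      # d x r
        return W_down, W_up

    def forward(self, hidden, speaker_emb):
        """Apply the speaker-specific adapter with a residual connection."""
        W_down, W_up = self.generate(speaker_emb)
        z = [max(0.0, v) for v in matvec(W_down, hidden)]  # ReLU bottleneck
        delta = matvec(W_up, z)
        return [h + dl for h, dl in zip(hidden, delta)]


# Usage: the same hidden state is adapted with speaker-dependent weights.
adapter = HyperAdapter(d_model=8, d_speaker=4, bottleneck=2)
hidden = [0.5] * 8
out_a = adapter.forward(hidden, [1.0, 0.0, 0.0, 0.0])
out_b = adapter.forward(hidden, [0.0, 1.0, 0.0, 0.0])
```

In this sketch the adapter weights are a function of the speaker embedding, so a new speaker receives a different adapter without any per-speaker parameters being stored. With the purely linear, bias-free hypernetwork assumed here, an all-zero speaker embedding yields all-zero adapter weights and the layer reduces to the identity.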