Exploring Prosody-Aware VITS for Emotional Voice Conversion
Prosody-aware VITS (PAVITS) is an end-to-end model proposed for emotional voice conversion. Inspired by VITS, it integrates the acoustic converter and the vocoder into a single architecture, with the aim of improving both content naturalness and emotional naturalness.
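To make the "end-to-end" idea concrete, here is a minimal interface sketch in Python. It is not the PAVITS implementation: the names `toy_converter`, `toy_vocoder`, `EndToEndEVC`, and the constant `HOP` are hypothetical, and the two stages are trivial placeholders rather than neural networks. The only point it illustrates is that a single model object owns both the acoustic converter and the vocoder, so one call maps source features and a target emotion directly to a waveform.

```python
from dataclasses import dataclass
from typing import Callable, List

Frames = List[List[float]]  # toy stand-in for a sequence of acoustic feature frames
Waveform = List[float]      # toy stand-in for audio samples

HOP = 4  # hypothetical number of waveform samples generated per frame


def toy_converter(frames: Frames, emotion_id: int) -> Frames:
    """Placeholder acoustic converter: shifts every value by an emotion-dependent offset."""
    offset = 0.1 * emotion_id
    return [[x + offset for x in frame] for frame in frames]


def toy_vocoder(frames: Frames) -> Waveform:
    """Placeholder vocoder: emits HOP samples per frame, each equal to the frame mean."""
    wav: Waveform = []
    for frame in frames:
        mean = sum(frame) / len(frame)
        wav.extend([mean] * HOP)
    return wav


@dataclass
class EndToEndEVC:
    """Single model object that holds both stages (an assumption made for illustration)."""
    converter: Callable[[Frames, int], Frames]
    vocoder: Callable[[Frames], Waveform]

    def convert(self, frames: Frames, emotion_id: int) -> Waveform:
        # The intermediate representation never leaves the model.
        hidden = self.converter(frames, emotion_id)
        return self.vocoder(hidden)


source = [[0.0, 0.2], [0.4, 0.6], [0.8, 1.0]]  # three toy frames
model = EndToEndEVC(toy_converter, toy_vocoder)
wav = model.convert(source, emotion_id=2)
print(len(wav))  # number of generated samples: frames * HOP
```

In a cascaded system the converter and the vocoder would typically be trained and run as separate models connected through an intermediate acoustic representation. Wrapping both behind one `convert` call mirrors, at the interface level only, the integration described above.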