Innovative Approach to Zero-Shot Multi-Speaker TTS with Negated Speaker Representations
The author proposes a negation feature learning paradigm for zero-shot multi-speaker TTS. The paradigm disentangles speaker attributes from linguistic content, reducing content leakage into the speaker representation, with the aim of making synthesis more robust and improving speaker fidelity for unseen speakers.