High-Fidelity Vocoder with Time-Frequency Representation Discriminators
This study proposes novel time-frequency representation discriminators, including the Multi-Scale Sub-Band Constant-Q Transform (MS-SB-CQT) Discriminator and the Multi-Scale Temporal-Compressed Continuous Wavelet Transform (MS-TC-CWT) Discriminator, to improve the synthesis quality of GAN-based vocoders.
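The summary above does not say how the scales or sub-bands are defined. As a rough illustration of what "multi-scale" and "sub-band" could mean for a Constant-Q Transform, the sketch below computes the standard geometrically spaced CQT bin centers, f_k = f_min * 2^(k/B), at several resolutions and groups the bins octave by octave. Treating each resolution B (bins per octave) as one scale and each octave as one sub-band is an assumption made here for illustration. The function names and the values of F_MIN, N_OCTAVES, and SCALES are hypothetical and are not taken from the study.

```python
import math

def cqt_center_frequencies(f_min, n_octaves, bins_per_octave):
    """Return CQT bin centers f_k = f_min * 2**(k / B), lowest first."""
    n_bins = n_octaves * bins_per_octave
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]

def split_into_octave_subbands(freqs, bins_per_octave):
    """Group consecutive bins so that each sub-band spans one octave."""
    return [
        freqs[i:i + bins_per_octave]
        for i in range(0, len(freqs), bins_per_octave)
    ]

# Hypothetical settings, chosen only for this example.
F_MIN = 32.70          # lowest bin center in Hz (roughly the note C1)
N_OCTAVES = 9          # number of octaves covered
SCALES = (24, 36, 48)  # bins per octave; one assumed "scale" per value

# Map each scale to its list of octave sub-bands.
subbands_per_scale = {
    b: split_into_octave_subbands(
        cqt_center_frequencies(F_MIN, N_OCTAVES, b), b
    )
    for b in SCALES
}

for b, bands in subbands_per_scale.items():
    lowest, highest = bands[0][0], bands[-1][-1]
    print(f"B={b}: {len(bands)} sub-bands of {len(bands[0])} bins, "
          f"{lowest:.1f} Hz to {highest:.1f} Hz")
```

Under these assumptions, every scale covers the same frequency range but at a different frequency resolution. A discriminator operating on such a representation would see one sub-band per octave at each scale.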