Interspeech 2021
DOI: 10.21437/interspeech.2021-976

High-Fidelity Parallel WaveGAN with Multi-Band Harmonic-Plus-Noise Model

Cited by 6 publications (5 citation statements)
References: 0 publications
“…To address these issues, GAN-based vocoders have been widely explored to take advantage of the compact generator size because the discriminator greatly helps the compact generator achieve high-fidelity speech generation. Parallel Wave-GAN (PWG) [35] and MelGAN [36] are the recent most popular GAN-based vocoders, and many subsequent GAN-based vocoders are based on them [11], [12], [13], [14], [25], [26], [27], [37], [38], [39], [40], [41]. Non-autoregressive models without GAN are also proposed.…”
Section: B Neural Vocoders Based On Generative Models
Citation type: mentioning
confidence: 99%
“…To further improve the F0 controllability, we introduce F0-driven mechanisms designed on the basis of QP-PWG and NSF into the source network. Moreover, inspired by the recent successes of the neural vocoders that adopt harmonic-plus-noise (HN) speech modeling [13], [14], [21], [22], we introduce HN source excitation generation to obtain better sound quality. The overall architecture of uSFGAN is shown in Fig.…”
Section: Unified Source-filter GAN
Citation type: mentioning
confidence: 99%