2022
DOI: 10.3390/info13030103

Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet

Abstract: The use of the mel spectrogram as a signal parameterization for voice generation is quite recent and linked to the development of neural vocoders. These are deep neural networks that allow reconstructing high-quality speech from a given mel spectrogram. While initially developed for speech synthesis, now neural vocoders have also been studied in the context of voice attribute manipulation, opening new means for voice processing in audio production. However, to be able to apply neural vocoders in real-world app…

Citations: Cited by 6 publications (2 citation statements)
References: 60 publications
“…To convert the mel-spectrograms to audio for the perceptual test, we use the neural vocoder of [34] which has been shown to work particularly well on singing voice. We use the same universal voice model (trained on speech and singing voice) for synthesis of both speech and singing voice.…”
Section: Audio Synthesis
Citation type: mentioning
confidence: 99%
“…If the disentanglement has been successful, the decoder will use the new intensity contour to synthesise a mel-spectrogram with the original properties but with the desired intensity. The mel-spectrograms are inverted with the mel-inverter from [17].…”
Section: Proposed Intensity Transformations
Citation type: mentioning
confidence: 99%