2017
DOI: 10.3390/app7121313

A Neural Parametric Singing Synthesizer Modeling Timbre and Expression from Natural Songs

Abstract: We recently presented a new model for singing synthesis based on a modified version of the WaveNet architecture. Instead of modeling raw waveform, we model features produced by a parametric vocoder that separates the influence of pitch and timbre. This allows conveniently modifying pitch to match any target melody, facilitates training on more modest dataset sizes, and significantly reduces training and generation times. Nonetheless, compared to modeling waveform directly, ways of effectively handling…

Citations: cited by 92 publications (135 citation statements)
References: 18 publications
“…Our proposed system uses 64-dimensional input features similar to [17], extracted with a 10 ms hop time. A reduction factor, r = 2, is used.…”
Section: Methods (mentioning)
confidence: 99%
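
The numbers in the statement above can be turned into a quick frame-count calculation. The sketch below is illustrative only: the quoted text gives the feature dimension (64), the hop time (10 ms), and r = 2, but it does not say how r is applied. The code assumes the common sequence-to-sequence convention in which the decoder emits r frames per step; that reading, and the helper names `frame_count` and `decoder_steps`, are assumptions rather than details from the citing paper.

```python
import math

# Values taken from the quoted citation statement.
FEATURE_DIM = 64      # dimensionality of each input feature frame
HOP_MS = 10           # hop time between successive frames, in milliseconds
REDUCTION_FACTOR = 2  # r; assumed here to mean "frames emitted per decoder step"


def frame_count(duration_ms: int, hop_ms: int = HOP_MS) -> int:
    """Number of feature frames, assuming one frame per hop and no padding."""
    return duration_ms // hop_ms


def decoder_steps(n_frames: int, r: int = REDUCTION_FACTOR) -> int:
    """Decoder steps needed if each step produces r frames (assumed convention)."""
    return math.ceil(n_frames / r)


# Hypothetical 3-second excerpt, chosen only for illustration.
n_frames = frame_count(3000)
feature_shape = (n_frames, FEATURE_DIM)
steps = decoder_steps(n_frames)
```

Under these assumptions a 3-second excerpt yields a 300 x 64 feature matrix and 150 decoder steps; an odd frame count would need one extra, partially filled step.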
“…Non-Seq2Seq singing synthesizers include those based on autoregressive architectures [17,21,22], feed-forward CNN [23], and feed-forward GAN-based approaches [24,25].…”
Section: Relation To Prior Work (mentioning)
confidence: 99%
“…The system decomposes a speech signal into the fundamental frequency f0, harmonic spectral envelope and aperiodicity envelope. It has been proved that these parameters can be used to reconstruct a high quality synthesis of speech signals, even after dimensionality reduction techniques have been applied to the parameters [15].…”
Section: World Vocoder (mentioning)
confidence: 99%
“…It is applied to a specific piano, and the results outperform the earlier methods in note-level polyphonic piano music transcription. Blaauw and Bonada [7] describe a singing synthesizer based on deep neural networks called the Neural Parametric Singing Synthesizer (NPSS), which can generate high-quality singing when a musical score and lyrics are given as the input. The NPSS can learn the timbre and expressive features of a singer from a small set of recordings.…”
Section: Machine and Deep Learning (mentioning)
confidence: 99%