2016
DOI: 10.1109/taslp.2016.2522655

Postfilters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis

citations
Cited by 62 publications
(39 citation statements)
references
References 41 publications
“…In particular, the common issue of oversmoothing is typically not reflected in these metrics. A recently proposed metric, the Modulation Spectrum (MS) [40], allows visualizing the spectral content of predicted time sequences, for instance by showing oversmoothing as a rolloff of higher modulation frequencies.…”
Section: Modulation Spectrum (MS) for Mel-Generalized Coefficients (mentioning)
confidence: 99%
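
The rolloff described in this statement can be illustrated with a minimal sketch. It assumes the modulation spectrum is taken as the log-scaled power spectrum of a single parameter trajectory (one value per frame); the function names, the FFT length, the synthetic random "natural" trajectory, and the moving-average stand-in for oversmoothing are all illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

def modulation_spectrum(trajectory, n_fft=256):
    """Log-scaled power spectrum of a 1-D parameter trajectory.

    Assumption: this is a simplified stand-in for the MS used in the
    cited work; n_fft is an arbitrary illustrative choice.
    """
    x = np.asarray(trajectory, dtype=float)
    x = x - x.mean()                      # remove the DC offset
    spectrum = np.fft.rfft(x, n=n_fft)    # zero-pads to n_fft samples
    return np.log(np.abs(spectrum) ** 2 + 1e-12)

def moving_average(trajectory, width=9):
    """Hypothetical oversmoothing: a simple moving-average filter."""
    kernel = np.ones(width) / width
    return np.convolve(trajectory, kernel, mode="same")

# Synthetic data standing in for a natural parameter trajectory.
rng = np.random.default_rng(0)
natural = rng.standard_normal(200)
smoothed = moving_average(natural)

ms_nat = modulation_spectrum(natural)
ms_smooth = modulation_spectrum(smoothed)

# Compare the upper half of the modulation-frequency axis.
high = slice(len(ms_nat) // 2, None)
gap = ms_nat[high].mean() - ms_smooth[high].mean()
print(f"mean high-band MS gap (natural - smoothed): {gap:.2f}")
```

A positive gap means the smoothed trajectory has lost energy at higher modulation frequencies, which is the kind of rolloff the statement refers to.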
“…However, this optimization framework often causes excessively smoothed speech parameters, making the converted speech sound muffled. To address this oversmoothing problem, several methods have been proposed, e.g., 1) modeling additional features that sensitively capture the oversmoothing effect, such as global variance (GV) [27] and modulation spectrum (MS) [36]; 2) keeping characteristics of natural speech parameters by partially using the source speech parameters, as in dynamic frequency warping (DFW) [37]; and 3) alleviating the averaging process by implementing a sparse constraint, as in exemplar-based conversion [33,34].…”
Section: Conversion Function (mentioning)
confidence: 99%
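
As a rough illustration of why global variance (GV) is sensitive to oversmoothing, the sketch below assumes GV is the per-dimension variance of a feature sequence over one utterance. The feature matrix is synthetic, and the "oversmoothed" version (each frame pulled halfway toward the utterance mean) is a hypothetical stand-in for a converted trajectory, not the method of the cited work.

```python
import numpy as np

def global_variance(features):
    """Per-dimension variance over time for a (frames, dims) matrix."""
    return np.var(np.asarray(features, dtype=float), axis=0)

# Synthetic features standing in for natural speech parameters.
rng = np.random.default_rng(1)
natural = rng.standard_normal((300, 4))

# Hypothetical oversmoothing: blend every frame toward the utterance mean.
oversmoothed = 0.5 * natural + 0.5 * natural.mean(axis=0)

gv_nat = global_variance(natural)
gv_smooth = global_variance(oversmoothed)
print("GV ratio (oversmoothed / natural):", gv_smooth / gv_nat)
```

Because the blend halves each frame's deviation from the mean, the GV of the oversmoothed sequence drops to a quarter of the natural one in every dimension.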
“…The over-smoothing effect is an issue in not only VC but also other speech synthesis techniques, such as text-to-speech synthesis. Hence, several approaches have been devised to reproduce the characteristics of natural speech [2], [6], [7]. On the other hand, VC can utilize not only those approaches, but also input speech information since the input and output parameters are often in the same domain (e.g., cepstrum).…”
Section: Introduction (mentioning)
confidence: 99%