2013
DOI: 10.1016/j.specom.2012.08.010

Mixed source model and its adapted vocal tract filter estimate for voice transformation and synthesis

citations
Cited by 35 publications
(40 citation statements)
references
References 34 publications
“…On the contrary, the impact of the glottal source on the vocal tract feature is far from negligible, which is not convenient. However, a robust separation of the vocal tract filter and the glottal source is far from straightforward [31, 41-44]. Thus, in this work, we chose to favor again robustness and simplicity, in order to focus, beforehand, on the phase features.…”
Section: A Simple Representation of the Amplitude
Citation type: mentioning
confidence: 99%
“…This is done through the following steps: first, aHM analysis [16] is performed to obtain the instantaneous phase from the waveform; then, the minimum-phase term is subtracted from the measured phases, and the local Phase Distortion (PD) [25] is calculated; finally, the short-time mean and standard deviation of the PD are computed in the neighborhood of each frame, the former being highly correlated to the maximum-phase component, and the latter to the degree of noisiness. Among the advantages of this novel approach, we can mention the following: (i) it is valid to analyze signals exhibiting harmonic and noise components that overlap both in time and in frequency and thus, avoiding binary voiced/unvoiced decisions which are error-prone and result in annoying artifacts, especially in synthesis [30,31]. (ii) Since it helps avoiding an explicit separation between harmonics and noise, it provides a solid and uniform framework for speech manipulation thus, avoiding artifacts near the voicing boundaries [21].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
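
The last step of the pipeline quoted above (short-time mean and standard deviation of the phase distortion) can be illustrated with a minimal sketch. It is not the cited authors' implementation. It assumes that the harmonic phases have already been obtained and compensated for the minimum-phase term, that the phase distortion of harmonic h is the wrapped difference phi[h+1] - phi[h] - phi[1], and that the statistics are circular (computed on the unit circle). The function names and the `half_width` parameter are hypothetical.

```python
import cmath
import math


def wrap(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return cmath.phase(cmath.exp(1j * angle))


def phase_distortion(phases):
    """Phase distortion for one frame (assumed definition, see lead-in).

    phases[h] is the minimum-phase-compensated phase of harmonic h + 1.
    Returns len(phases) - 1 wrapped values.
    """
    phi1 = phases[0]
    return [wrap(phases[h + 1] - phases[h] - phi1)
            for h in range(len(phases) - 1)]


def circular_stats(angles):
    """Circular mean and circular standard deviation of a list of angles."""
    resultant = sum(cmath.exp(1j * a) for a in angles) / len(angles)
    mean = cmath.phase(resultant)
    # Clip to 1.0 to guard against rounding slightly above the unit circle.
    r = min(abs(resultant), 1.0)
    std = math.sqrt(-2.0 * math.log(r)) if r > 0.0 else math.inf
    return mean, std


def short_time_pd_stats(frames, half_width=1):
    """Per-frame, per-harmonic (mean, std) of the phase distortion.

    frames: list of per-frame phase lists, all of the same length.
    half_width: number of neighbouring frames on each side (hypothetical).
    """
    pd = [phase_distortion(f) for f in frames]
    out = []
    for i in range(len(pd)):
        lo = max(0, i - half_width)
        hi = min(len(pd), i + half_width + 1)
        out.append([circular_stats([pd[j][h] for j in range(lo, hi)])
                    for h in range(len(pd[i]))])
    return out
```

Circular statistics are used here because phase values wrap around: two angles close to +pi and -pi are neighbours, and an arithmetic mean would wrongly place their average near zero.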
“…Recent generations of speech synthesizers include an explicit representation of the glottal source (e.g., LF model [24,2]) in the speech synthesizer, which substantially improves the control of the voice quality during speech synthesis. In particular, the SVLN speech synthesizer (Separation of the Vocal-tract with the Liljencrants-Fant model plus Noise) [2] allows the intuitive control of the glottal source (and thus, voice quality) during speech synthesis [25], with a limited number of parameters. This section summarizes the main principles of the SVLN speech synthesizer.…”
Section: Speech Analysis and Synthesis
Citation type: mentioning
confidence: 99%
“…The estimation of the SVLN parameters is described in detail in [2]. Additionally, glottal closure instants (GCI) are estimated [27], and used to measure the period-to-period regularity of the glottal pulse:…”
Section: Analysis of SVLN Parameters
Citation type: mentioning
confidence: 99%