2008
DOI: 10.1121/1.2935816

Transforming modal voice into irregular voice by amplitude scaling of individual glottal cycles

Abstract: Irregular phonation can serve as a cue to segmental contrasts and prosodic structure as well as to the affective state and identity of the speaker. Thus algorithms for transforming between voice qualities, such as regular and irregular phonation, may contribute to building more natural sounding, expressive and personalized speech synthesizers. We describe a semiautomatic transformation method that introduces irregular pitch periods into a modal speech signal by amplitude scaling of the individual cycles. First…

Cited by 6 publications (4 citation statements)
References 11 publications

“…In theory, analysis-resynthesis techniques can be used to, first, estimate the recording’s series of glottal pulses (Degottex, Lanchantin, Roebel, & Rodet, 2013) and then, resynthesize the vocal signal from a manipulated series of pulses with artificially varied amplitude and period (Bőhm, Audibert, Shattuck-Hufnagel, Németh, & Aubergé, 2008; Ruinskiy & Lavner, 2008; Verma & Kumar, 2005). However, these techniques rely on an explicit model of pulse variability, which is typically learned from one or several target examples of naturally rough voices (Bonada & Blaauw, 2013), and it is unclear how such predetermined patterns should be selected for arbitrary voices.…”
Section: Voice Transformations Along the Vocal Production Pathway
Citation type: mentioning
confidence: 99%
“…One major difference between the transformation approach of ANGUS and the analysis/resynthesis approach of the CONTROl algorithm used here, as well as other approaches for jitter modeling in the literature [20][21][22][23], is that ANGUS makes no other assumption on the input signal as its having identifiable f0. Thus ANGUS can be applied to non-human vocal sounds, such as animal vocalizations, as well as non-vocal sounds such as musical instruments or alarms.…”
Section: Study 4: Effect On Non-vocal Stimuli
Citation type: mentioning
confidence: 99%
“…dynamically modulate the arousal of video game character's scream, add growl on a singer's voice) or voices synthesized with methods such as concatenative text-to-speech [18] or sample-based deep network methods [19]. Adding vocal non-linearities on such signals can be done with analysis-resynthesis techniques that, first, estimate the recording's series of glottal pulses [20] and, then, resynthesize the vocal signal from a manipulated series of pulses with artificially-varied amplitude and period [21][22][23][24]. However, these techniques rely on an explicit model of pulse variability, which is typically learned from one or several target examples of naturally rough voices [25], and it is unclear how such predetermined patterns should be selected for unknown voices.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In order to make the last portion of a modal recording sound irregular, some of the pitch periods were zeroed out and some others were either attenuated or boosted. This transformation method is described in detail in Bőhm et al [2008] and is briefly explained here. As a first step, the glottal pulses or glottal closure instants were estimated by Praat [Boersma and Weenink, 2006] and hand-corrected by the first author.…”
Section: Stimuli
Citation type: mentioning
confidence: 99%
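
The last statement above describes the core operation: once glottal closure instants are known, individual pitch periods are zeroed out, attenuated or boosted. A minimal Python sketch of that idea follows. It is an illustration only, not the published implementation: the function name `scale_cycles`, the piecewise-constant gain per cycle, and the toy input are all assumptions, and the actual method may, for example, smooth the transitions at cycle boundaries.

```python
from typing import List, Sequence


def scale_cycles(signal: Sequence[float],
                 gci: Sequence[int],
                 gains: Sequence[float]) -> List[float]:
    """Scale each glottal cycle of `signal` by its own gain factor.

    signal: speech samples.
    gci:    sorted sample indices of glottal closure instants
            (hypothetical input; the citing study obtained them from
            Praat and corrected them by hand).
    gains:  one factor per cycle, i.e. len(gci) - 1 values.
            0.0 zeroes a cycle, < 1 attenuates it, > 1 boosts it.
    """
    if len(gains) != len(gci) - 1:
        raise ValueError("need exactly one gain per cycle (len(gci) - 1)")
    out = list(signal)  # samples outside the marked cycles stay unchanged
    for start, end, gain in zip(gci[:-1], gci[1:], gains):
        for i in range(start, end):
            out[i] = signal[i] * gain
    return out


# Toy example (assumed data): three 4-sample "cycles";
# keep the first, zero the second, boost the third.
toy = [1.0, 0.5, -0.5, -1.0] * 3
result = scale_cycles(toy, gci=[0, 4, 8, 12], gains=[1.0, 0.0, 1.5])
```

In this sketch the choice of which cycles to zero, attenuate or boost is left entirely to the caller through `gains`; how those factors are chosen is exactly the "explicit model of pulse variability" that the other citing statements discuss.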