Prediction of creaky voice from contextual factors

Drugman, Thomas; Kane, John; Raitio, Tuomo; Gobl, Christer

doi:10.1109/icassp.2013.6639216

Cited by 3 publications

(7 citation statements)

References 15 publications


“…Creaky voice has been studied in relation to various functions in speech communication, and most commonly with phrase or sentence boundary marking (Surana and Slifka, 2006b; Drugman et al, 2013). Similarly creaky voice has been associated with turn-yielding in Finnish (Ogden, 2001).…”

Section: Creaky Voice In Speech Communication (mentioning)

confidence: 99%

Data-driven detection and analysis of the patterns of creaky voice

Drugman

Kane

Gobl

2014

Computer Speech & Language

Self Cite


This paper investigates the temporal excitation patterns of creaky voice. Creaky voice is a voice quality frequently used as a phrase-boundary marker, but also as a means of portraying attitude, affective states and even social status. Consequently, the automatic detection and modelling of creaky voice may have implications for speech technology applications. The acoustic characteristics of creaky voice are, however, rather distinct from modal phonation. Further, several acoustic patterns can bring about the perception of creaky voice, thereby complicating the strategies used for its automatic detection, analysis and modelling. The present study is carried out using a variety of languages, speakers, and on both read and conversational data and involves a mutual information-based assessment of the various acoustic features proposed in the literature for detecting creaky voice. These features are then exploited in classification experiments where we achieve an appreciable improvement in detection accuracy compared to the state of the art. Both experiments clearly highlight the presence of several creaky patterns. A subsequent qualitative and quantitative analysis of the identified patterns is provided, which reveals a considerable speaker-dependent variability in the usage of these creaky patterns. We also investigate how creaky voice detection systems perform across creaky patterns.
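As a rough illustration of the mutual information-based feature assessment mentioned in this abstract, the sketch below estimates the mutual information between a discretised acoustic feature and a binary creaky/non-creaky label. The function name, the binning of the feature, and the toy data are assumptions made for illustration; the abstract does not specify the estimator the authors used.

```python
import math
from collections import Counter


def mutual_information(feature_bins, labels):
    """Estimate I(X;Y) in bits from paired discrete observations.

    feature_bins: per-frame bin index of an acoustic feature (hypothetical binning).
    labels: per-frame class label, e.g. 1 for creaky and 0 for non-creaky.
    """
    if len(feature_bins) != len(labels):
        raise ValueError("feature_bins and labels must have the same length")
    n = len(labels)
    if n == 0:
        raise ValueError("at least one observation is required")

    joint = Counter(zip(feature_bins, labels))  # counts of (x, y) pairs
    px = Counter(feature_bins)                  # marginal counts of x
    py = Counter(labels)                        # marginal counts of y

    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


# Toy data (assumed): a feature that separates the classes perfectly,
# and one that is statistically independent of the label.
informative = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
uninformative = mutual_information([0, 1, 0, 1], [0, 0, 1, 1])
```

A feature with higher mutual information carries more information about the creaky label, so it is a stronger candidate for a detector; with real data the result depends on how the feature is binned.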


“…Synthesis of voice with creak requires i) the prediction of creaky parts from context and ii) the ability to render creaky excitation. In our previous work, we have developed methods for creaky voice prediction from context [13] and rendering of creaky excitation [12]. However, these methods have not been utilised in a full TTS voice before.…”

Section: Synthesis Of Creaky Voice (mentioning)

confidence: 99%

“…In this study, the algorithm in [5] is used, which provides a frame-wise probability of creak. This parameter is used as a feature in the HMM training for determining if a segment is creaky or not [13]. More specifically, the parameter indicating the probability of creak is trained as an additional 1-dimensional feature along with other speech features, such as F0 and spectrum.…”

Section: Prediction Of Creaky Voice From Context (mentioning)

confidence: 99%
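The citation statement above describes carrying the frame-wise creak probability as an extra 1-dimensional feature next to F0 and spectrum. The sketch below shows only that data layout, not HMM training itself. The function names, the list-based frame representation, and the 0.5 decision threshold are assumptions for illustration and are not taken from the cited papers.

```python
def append_creak_stream(frames, creak_prob):
    """Append the creak probability to each frame's feature vector.

    frames: list of per-frame feature vectors (e.g. F0 plus spectral terms).
    creak_prob: list of per-frame creak probabilities in [0, 1].
    """
    if len(frames) != len(creak_prob):
        raise ValueError("frames and creak_prob must have the same length")
    augmented = []
    for vector, prob in zip(frames, creak_prob):
        if not 0.0 <= prob <= 1.0:
            raise ValueError("creak probability must lie in [0, 1]")
        # The extra dimension is stored last, after the existing features.
        augmented.append(list(vector) + [prob])
    return augmented


def creaky_frames(creak_prob, threshold=0.5):
    """Mark frames as creaky when the probability reaches the threshold.

    The threshold value is an assumption, not a value from the source.
    """
    return [prob >= threshold for prob in creak_prob]


# Toy frames (assumed): [F0 in Hz, one spectral coefficient]
frames = [[110.0, 0.3], [95.0, 0.1], [60.0, -0.2]]
probs = [0.05, 0.40, 0.90]
augmented = append_creak_stream(frames, probs)
flags = creaky_frames(probs)
```

In a statistical parametric synthesiser the augmented dimension would be modelled as its own stream, so the probability generated at synthesis time can be thresholded to decide which segments receive creaky excitation.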

“…For the Finnish speaker, MV, a total of 66 contextual factors are used, described in [26]. According to the study in [13], only a few of the contextual factors are useful in predicting creaky voice, and the useful factors are closely related with creaky use at the end of a sentence or a word group.…”

Section: Prediction Of Creaky Voice From Context (mentioning)

confidence: 99%

“…Further work by the present authors was concerned with developing an excitation model of creaky production capable of providing a natural rendering of the voice quality [12]. Also the prediction of creaky voice from contextual factors was investigated in [13], which enables automatic determination of the creaky usage from the input text. One obvious application of this line of research is incorporating creaky voice in a statistical parametric speech synthesis system.…”

Section: Introduction (mentioning)

confidence: 99%


HMM-based synthesis of creaky voice

et al. 2013

Self Cite


Creaky voice, also referred to as vocal fry, is a voice quality frequently produced in many languages, in both read and conversational speech. To enhance the naturalness of speech synthesis, synthesis systems should be able to generate speech in all its expressive diversity, including creaky voice. The present study looks to integrate our recent developments, including creaky voice detection, prediction of creaky voice from context, and rendering of the creaky excitation, into a fully functioning and automatic HMM-based synthesis system. HMM-based synthetic creaky voices are built and evaluated in subjective listening tests, which show that the best synthetic creaky voices are rated as more natural and more creaky than a conventional voice. A non-creaky voice is also successfully transformed to use creak by modifying the F0 contour and excitation of the predicted creaky parts. The transformed voice is rated as equally natural and clearly more creaky than the original voice.


Modeling Irregular Voice in Statistical Parametric Speech Synthesis With Residual Codebook Based Excitation

Csapó

Németh

2014

IEEE J. Sel. Top. Signal Process.
