On the Utility of Syllable-Based Acoustic Models for Pronunciation Variation Modelling

Hämäläinen, Annika; Boves, L.W.J.; Veth, Johan de; Bosch, L.F.M. ten

doi:10.1155/2007/46460

Cited by 6 publications

(19 citation statements)

References 12 publications


“…To alleviate the problems of the 'beads on a string' representation of speech, several authors propose modelling the spectral and temporal variation in speech 'implicitly' by using longer-length linguistic units as the basic building blocks of speech (Ganapathiraju et al, 2001; Hämäläinen et al, 2007a; Jones et al, 1997; Jouvet and Messina, 2004; Plannerer and Ruske, 1992). For various reasons, most of these authors (Ganapathiraju et al, 2001; Hämäläinen et al, 2007a; Jones et al, 1997; Jouvet and Messina, 2004) suggest using syllable-length models.…”

Section: Introduction (mentioning)

confidence: 99%

“…For various reasons, most of these authors (Ganapathiraju et al, 2001; Hämäläinen et al, 2007a; Jones et al, 1997; Jouvet and Messina, 2004) suggest using syllable-length models. First, using syllables allows for a relatively compact representation of speech, while maintaining a manageable level of recogniser complexity.…”

Section: Introduction (mentioning)

confidence: 99%

“…First, syllable models with a sufficient amount of training data are used in combination with triphones (Ganapathiraju et al, 2001; Hämäläinen et al, 2007a; Jouvet and Messina, 2004). In other words, triphones are backed off to when a given syllable does not occur frequently enough for reliable model parameter estimation.…”

Section: Introduction (mentioning)

confidence: 99%
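
The back-off scheme described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation: the function names, the `min_count` threshold, the toy lexicon, and the HTK-style `l-c+r` label format are all assumptions, and context is kept syllable-internal for simplicity.

```python
from collections import Counter


def context_labels(phones):
    """Expand a phone sequence into context-dependent labels (l-c+r).

    Assumption: context stops at the syllable boundary, so the first and
    last phones get biphone labels (c+r and l-c).
    """
    labels = []
    for i, phone in enumerate(phones):
        left = f"{phones[i - 1]}-" if i > 0 else ""
        right = f"+{phones[i + 1]}" if i < len(phones) - 1 else ""
        labels.append(f"{left}{phone}{right}")
    return labels


def select_units(syllables, canonical, counts, min_count):
    """Keep a syllable unit if it is frequent enough, else back off to phones."""
    units = []
    for syl in syllables:
        if counts[syl] >= min_count:
            units.append(syl)  # enough training data: use the syllable model
        else:
            units.extend(context_labels(canonical[syl]))  # back off
    return units


# Hypothetical toy data: canonical transcriptions and training counts.
canonical = {"dat": ["d", "A", "t"], "is": ["I", "s"]}
counts = Counter({"dat": 500, "is": 3})
units = select_units(["dat", "is"], canonical, counts, min_count=100)
print(units)  # ['dat', 'I+s', 'I-s']
```

Here the frequent syllable keeps its own model, while the rare one is represented by the context-dependent phone models of its canonical transcription.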

“…Second, to ensure that a relatively small amount of training data is sufficient, the syllable models are cleverly initialised (Hämäläinen et al, 2007a; Jouvet and Messina, 2004). […], for instance, suggest initialising the single-path syllable models with the parameters of the biphones and triphones underlying the canonical transcription of the syllables (see Figure 1).…”

Section: Introduction (mentioning)

confidence: 99%
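
One way to picture the initialisation described in the statement above is to concatenate copies of the states of the underlying context-dependent phone models. The sketch below assumes this reading; the `State` class, the single-Gaussian states, the three-states-per-phone topology, and the label format are hypothetical simplifications and are not taken from the cited papers.

```python
import copy
from dataclasses import dataclass


@dataclass
class State:
    """One HMM state with a single diagonal Gaussian (simplifying assumption)."""
    mean: list
    var: list


def context_labels(phones):
    """Syllable-internal context labels: c+r, l-c+r, l-c (assumed format)."""
    labels = []
    for i, phone in enumerate(phones):
        left = f"{phones[i - 1]}-" if i > 0 else ""
        right = f"+{phones[i + 1]}" if i < len(phones) - 1 else ""
        labels.append(f"{left}{phone}{right}")
    return labels


def init_syllable_model(phones, phone_models):
    """Build a single-path syllable model by chaining copied phone states."""
    states = []
    for label in context_labels(phones):
        # Deep copy so that retraining the syllable model leaves the
        # biphone/triphone models untouched.
        states.extend(copy.deepcopy(phone_models[label]))
    return states


# Hypothetical phone models: three states each, one-dimensional features.
phone_models = {
    "I+s": [State([0.1 * k], [1.0]) for k in range(3)],
    "I-s": [State([1.0 + 0.1 * k], [1.0]) for k in range(3)],
}
syllable = init_syllable_model(["I", "s"], phone_models)
print(len(syllable))  # 6
```

The resulting state sequence would then serve only as a starting point, to be re-estimated on the available syllable training data.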


Modelling pronunciation variation with single-path and multi-path syllable models: Issues to consider

Hämäläinen

Bosch

Boves

2009

Speech Communication

Self Cite


“…Other projects of IMIX were responsible for question answering (van den Bosch et al, 2004;Tjong Kim Sang et al, 2005;Bouma et al, 2007), dialog and action management (op den Akker et al, 2005), speech synthesis (Marsi, 2004), and speech recognition (Hämäläinen et al, 2007). Work in this thesis contributed to the answer presentation module of IMIX.…”

Section: IMIX (mentioning)

confidence: 99%

Discourse oriented summarization

Bosma


Automatic prosodic tone choice classification with Brazil’s intonation model

Johnson

Kang

2015

Int J Speech Technol


This paper examines the performance of automatically classifying the five tone choices (falling, rising, rising-falling, falling-rising, and neutral) of Brazil's intonation model. We tested two machine learning classifiers (neural network and boosting ensemble) in two configurations (multi-class and pairwise coupling), as well as a rule-based classifier. We investigated three sets of acoustic features, built from the TILT and Bézier pitch contour models and from a new four-point pitch contour model introduced here. Tone choices are one of the key elements of Brazil's prosodic intonation model. The rule-based classifier, which was built on our four-point model, achieved better results than the others, with an accuracy of 75.1% and a Cohen's kappa coefficient of 0.73. This research shows that it is possible to classify tone choices with an accuracy close to the percentage of agreement between two human analysts. The findings further indicate that our four-point model is better suited to classifying Brazil's tone choices than either the TILT or the Bézier model.

