2011
DOI: 10.1109/lsp.2010.2098440
Letter-to-Sound Pronunciation Prediction Using Conditional Random Fields

Abstract: Pronunciation prediction, or letter-to-sound (LTS) conversion, is an essential task for speech synthesis, open-vocabulary spoken term detection, and other applications dealing with novel words. Most current approaches (at least for English) employ data-driven methods to learn and represent pronunciation "rules" using statistical models such as decision trees, hidden Markov models (HMMs) or joint-multigram models (JMMs). The LTS task remains challenging, particularly for languages with a complex relatio…
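The abstract frames LTS as learning pronunciation "rules" with statistical sequence models such as CRFs. A minimal sketch of the kind of per-grapheme context-window features a CRF-based LTS system might use is shown below; the feature names and window size are illustrative assumptions, not details from the paper:

```python
# Sketch of CRF feature extraction for letter-to-sound conversion:
# each grapheme becomes one labeling position, described by a window
# of neighboring letters. Feature keys (e.g. "g[-1]") are hypothetical.

def grapheme_features(word, i, window=2):
    """Context-window features for the grapheme at position i."""
    feats = {"g[0]": word[i]}
    for k in range(1, window + 1):
        # Pad out-of-range context with boundary symbols.
        feats[f"g[-{k}]"] = word[i - k] if i - k >= 0 else "<s>"
        feats[f"g[+{k}]"] = word[i + k] if i + k < len(word) else "</s>"
    return feats

def word_to_features(word):
    """One feature dict per grapheme, in order."""
    return [grapheme_features(word, i) for i in range(len(word))]

feats = word_to_features("speech")
# feats[0] describes 's': the current letter plus padded left context.
```

In a full system these feature dicts would be paired with an aligned phoneme sequence (e.g. "speech" → S P IY CH) and passed to a linear-chain CRF trainer.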

Cited by 36 publications (29 citation statements)
References 21 publications (18 reference statements)
"…Statistical models such as decision trees, joint-multigram models (JMM) [6], or conditional random fields (CRF) [7] are used to learn pronunciation rules. All these approaches invariably assume access to "prior" linguistic resources consisting of sequences of graphemes and their corresponding sequences of phonemes [8]. Writing such rules by hand is hard and very time-consuming.…"
Section: Speech → S P IY CH
Mentioning confidence: 99%
"…In general, a G2P converter is developed using machine-learning methods such as instance-based learning [1], table lookup with defaults [1], self-learning techniques [2], hidden Markov models [3], morphology and phoneme history [4], joint multigram models [5], conditional random fields [6], and Kullback-Leibler-divergence-based hidden Markov models [7]. These methods are commonly very complex and designed to be language-independent, but they give varying performance on some phonemically complex languages, such as English, Dutch, French, and German.…"
Section: Introduction
Mentioning confidence: 99%
"…Finally, CRFs offer several advantages. They are widely used in grapheme-to-phoneme converters [19,9,14], enabling easy integration of their outputs. They also allow a large set of features to be explicitly considered and combined.…"
Section: Introduction
Mentioning confidence: 99%