Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181)
DOI: 10.1109/icassp.1998.674429

Acoustics-only based automatic phonetic baseform generation

Cited by 24 publications
(9 citation statements)
References 7 publications
“… [table fragment: guerilla g ax r ax l ax g ax r ih l ax; guerilla g w eh r ih l ax; tornados t er n ey d ow z t er n ey d ow z; tornados t ao r n ey d ow s t ow r n ey d ow z] … phone network to select the pronunciation. Additional resources are typically required, including existing pronunciation lexica [14], speech samples [19], [20], linguistic rules [21], or a combination of these. The focus of previous work has been on pronunciation variation [14], [19] or on common words [15], [17].…”
Section: B Methods For Selecting a Pronunciation Model (mentioning)
confidence: 99%
“…Many G2P systems are presented in the literature. Several names are attributed to this task: grapheme-to-phoneme conversion [24,17], phonetic pronunciation modeling [25], letter-to-sound translation [26], letter-to-phoneme conversion [27,7], phonetic baseform generation [28,29], phonetic transcription [30], text-to-phoneme mapping [31], among others.…”
Section: Related Work (mentioning)
confidence: 99%
“…In standard procedures like those presented in [7] and [8] for example, the speech utterance of the new word is aligned with speaker-independent allophone models. Effectively a decoding from utterance to allophone sequence is performed, with a bigram model on allophones functioning as the language model on words does in a normal utterance-to-text decoding; we will refer to this as the transition model.…”
Section: A Dynamic Vocabulary (mentioning)
confidence: 99%
“…As in a typical speech-to-text decoding, a weighted combination of acoustic model and transition model log-probabilities is used to determine the best allophone sequence. However, it differs from the earlier approaches of [7] and [8] in that the speaker-independent allophone models and the transition model are assigned a weight. The advantage of this approach is twofold. First, since we have to deduce the pronunciation of the enrolled words from just one or two speech examples, we may as well use multiple guesses to maximize the chance that one of them will be right.…”
Section: B Automatic Generation Of Multiple Pronunciations (mentioning)
confidence: 99%
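
The last two statements describe decoding an utterance into an allophone sequence by combining acoustic and transition (bigram) log-probabilities with weights. The following is only a minimal sketch of that general idea, not the method of the cited papers: the function names, the per-frame input format, the floor value for unseen bigrams, and the trick of varying the transition weight to obtain several guesses are all assumptions made for illustration.

```python
import math

# Hypothetical floor for bigrams missing from the table (an assumption).
UNSEEN_LOGPROB = math.log(1e-6)


def decode(frames, phones, bigram, acoustic_weight=1.0, transition_weight=1.0):
    """Viterbi search over per-frame phone log-likelihoods.

    frames: list of dicts mapping phone -> acoustic log-probability
    bigram: dict mapping (previous_phone, next_phone) -> log-probability
    Returns (score, phone_sequence) with repeated frames collapsed.
    """
    # best[p] holds the best weighted score and path ending in phone p.
    best = {p: (acoustic_weight * frames[0][p], [p]) for p in phones}
    for frame in frames[1:]:
        new_best = {}
        for p in phones:
            candidates = []
            for q in phones:
                score, seq = best[q]
                if q == p:
                    # Staying in the same phone adds no transition cost.
                    candidates.append((score, seq))
                else:
                    lp = bigram.get((q, p), UNSEEN_LOGPROB)
                    candidates.append((score + transition_weight * lp, seq + [p]))
            score, seq = max(candidates, key=lambda c: c[0])
            new_best[p] = (score + acoustic_weight * frame[p], seq)
        best = new_best
    return max(best.values(), key=lambda c: c[0])


def multiple_pronunciations(frames, phones, bigram, transition_weights):
    """Collect distinct best sequences obtained under different weights."""
    results = []
    for w in transition_weights:
        _, seq = decode(frames, phones, bigram, transition_weight=w)
        if seq not in results:
            results.append(seq)
    return results
```

Under these assumptions, calling `multiple_pronunciations` with a few transition weights yields a small set of candidate baseforms from a single utterance, which mirrors the "multiple guesses" motivation in the quoted passage. A real system would score allophone models over acoustic feature vectors rather than take per-frame log-probabilities as input.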