ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing
DOI: 10.1109/icassp.1985.1168283

Context-dependent modeling for acoustic-phonetic recognition of continuous speech

Cited by 172 publications (78 citation statements)
References 6 publications
“…These augmented triphones, called "PIC"s, are the fundamental unit of the system, and are closely related to other approaches that have appeared in the literature ([16] and [14]). The information that the PICs currently contain is the identity of the preceding and succeeding phonemes, and, optionally, an estimate of the degree of the phoneme's prepausal lengthening.…”
Section: Report Documentation Page (mentioning)
confidence: 99%
“…Given that in every step of the iteration the likelihood evaluated at the current Lo and Qo (1:T) is increasing (or at least equal) to the likelihood evaluated at the previous Lo and Qo (1:T), a maximum of (6) is eventually reached, which corresponds to reach a local maximum of an approximation to (1).…”
Section: Model Optimization (mentioning)
confidence: 99%
“…(Cross-word triphones, which are a feature of the old TS decoder, will be implemented later.) These models are smoothed with reduced context phone models [20]. Each phone model is a three state "linear" (no skip transitions) HMM.…”
Section: The Basic HMM System (mentioning)
confidence: 99%
“…The system uses Gaussian tied mixture [4,6] observation pdfs and treats each observation stream as if it is statistically independent of all others. Triphone models [20] are used to model phonetic coarticulation. (Cross-word triphones, which are a feature of the old TS decoder, will be implemented later.)…”
Section: The Basic HMM System (mentioning)
confidence: 99%