IEEE International Conference on Acoustics Speech and Signal Processing 1993
DOI: 10.1109/icassp.1993.319323

Automatic language identification using Gaussian mixture and hidden Markov models

Abstract: Ergodic, continuous-observation, hidden Markov models (HMMs) were used to perform automatic language classification and detection of speech messages. State observation probability densities were modeled as tied Gaussian mixtures. The algorithm was evaluated on four multilanguage speech databases: a three language subset of the Spoken Language Library, a three language subset of a five language Rome Laboratory database, the 20 language CCITT database, and the ten language OGI telephone speech database. Generall…

Citations: cited by 77 publications (40 citation statements)
References: 12 publications
“…Empirical evidence suggests that frequency play an important role in speech analysis. After extensive experimental analysis feature extraction technique formulated for the sampling of frequency criteria [20]. Table 3 below shows the standard frequency range of different age group.…”
Section: Frequency Characteristics | Citation type: mentioning
confidence: 99%
“…In a generalization of this vector quantization approach to LID, Riek [40], Nakagawa [37] and Zissman [49] applied Gaussian mixture classifiers to language identification. Here, each feature vector is assumed to be drawn randomly according … [vec]tors computed from a new utterance are compared to each of the language-dependent models. The likelihood that the new utterance was spoken in the same language as the speech used to train each model is computed and the maximum-likelihood model is found.…”
Section: Language Identification Systems | Citation type: mentioning
confidence: 99%
“…HMM-based language identification was first proposed by House and Neuburg [17]. Savic [41], Riek [40], Nakagawa [37], and Zissman [49] all applied HMMs to spectral and cepstral feature vectors. In these systems, HMM training was performed on unla… phonetic transcription (sequence of symbols representing the spoken sounds), or (2) an orthographic transcription (the text of the words spoken) along with a phonemic transcription dictionary (mapping of words to prototypical pronunciation) for each training utterance.…”
Section: Language Identification Systems | Citation type: mentioning
confidence: 99%
“…The acoustic features extracted from the audio signal are 12 MFCC plus delta, resulting in a 24-dimensional vector. The models used are Gaussian Mixture Models (as in [10]), learnt with the classic VQ and EM algorithms. As explained above, the PRLM system is based on a single Portuguese phone-recognizer.…”
Section: Acoustic System | Citation type: mentioning
confidence: 99%
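
Several of the citation statements above describe the same recipe: one Gaussian mixture model per language, feature vectors from a new utterance scored against each model, and the maximum-likelihood model chosen. One statement also mentions 12 static coefficients plus deltas, giving 24 dimensions. The following is a minimal sketch of that scoring step only, under stated assumptions: diagonal covariances, a simple two-point delta, and hypothetical names (`add_deltas`, `gmm_log_likelihood`, `identify_language`) that do not come from the cited papers. Model training (VQ initialization and EM) is not shown, and the paper's HMM system is not reproduced here.

```python
import numpy as np

def add_deltas(static):
    """Append first-order deltas to (frames, 12) features, giving (frames, 24).

    Assumption: a two-point central difference with edge padding. Real front
    ends often use a longer regression window instead.
    """
    padded = np.pad(static, ((1, 1), (0, 0)), mode="edge")
    delta = (padded[2:] - padded[:-2]) / 2.0
    return np.hstack([static, delta])

def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of frames under a diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means and variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                # (T, M, D)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)  # (M,)
    log_comp = log_norm[None, :] - 0.5 * (diff ** 2 / variances[None, :, :]).sum(axis=2)
    log_weighted = log_comp + np.log(weights)[None, :]           # (T, M)
    # Log-sum-exp over mixture components for numerical stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(per_frame.sum())

def identify_language(frames, models):
    """Score frames against every language model and return the best one.

    models: dict mapping a language label to (weights, means, variances).
    """
    scores = {
        lang: gmm_log_likelihood(frames, *params)
        for lang, params in models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Summing per-frame log-likelihoods treats frames as independent draws from the mixture, which matches the "each feature vector is assumed to be drawn randomly" description quoted above; it ignores the temporal structure that the HMM variants are meant to capture.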