Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
DOI: 10.1109/icslp.1996.607890

Automatic text-independent pronunciation scoring of foreign language student speech

Abstract: SRI International is currently involved in the development of a new generation of software systems for automatic scoring of pronunciation as part of the Voice Interactive Language Training System (VILTS) project. This paper describes the goals of the VILTS system, the speech corpus, and the algorithm development. The automatic grading system uses SRI's Decipher™ continuous speech recognition system [1] to generate phonetic segmentations that are used to produce pronunciation scores at the end of each lesson. T…

Cited by 105 publications
(73 citation statements)
References 2 publications

“…The test result is given in Table 3, where the performance is compared with that obtained when the exact transcript was provided. [Fig. 4: Comparison according to feature incorporations.]…”
Section: Results (mentioning)
confidence: 99%
“…Since, however, the duration of a phone is affected by the learner's mother tongue and speaking rate, it needs to be normalized. In order to normalize the phone duration, we apply a measure of rate of speech (ROS), which indicates the average number of phones uttered by the test speaker per unit time [4]. Let d(i) denote the normalized duration of the i-th phone segment.…”
Section: Duration (mentioning)
confidence: 99%
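
The ROS normalization described in the statement above can be sketched as follows. This is a minimal illustration, not the citing paper's implementation: the function names, the (start, end) segment representation in seconds, and the choice to multiply each raw duration by the ROS are all assumptions, since the excerpt does not give the exact formula.

```python
from typing import List, Tuple

# Hypothetical representation: one (start_time, end_time) pair per phone, in seconds.
Segment = Tuple[float, float]


def rate_of_speech(segments: List[Segment]) -> float:
    """Average number of phones per second over the given phone segments.

    Assumption: time is measured as the sum of phone durations, so pauses
    between segments are not counted.
    """
    total_time = sum(end - start for start, end in segments)
    if total_time <= 0:
        raise ValueError("segments must cover a positive amount of time")
    return len(segments) / total_time


def normalize_durations(segments: List[Segment]) -> List[float]:
    """Return d(i): each raw phone duration scaled by the speaker's ROS.

    Assumption: d(i) = raw_duration(i) * ROS, which makes the result
    dimensionless and gives it a mean of 1 for any speaking rate.
    """
    ros = rate_of_speech(segments)
    return [(end - start) * ros for start, end in segments]


if __name__ == "__main__":
    phones = [(0.0, 0.1), (0.1, 0.3), (0.3, 0.4)]
    print(rate_of_speech(phones))
    print(normalize_durations(phones))
```

Under this assumed definition, a fast and a slow speaker producing the same relative timing pattern receive the same normalized durations, which is the property the normalization is meant to provide.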
“…The disadvantage of these methods is that they are text-dependent, so they only work for utterances with the same text as the native recordings and cannot be used on other utterances. Neumeyer et al. presented a text-independent pronunciation assessment framework [11] in 1996, then improved the method by using posterior probabilities instead of the decoding log-likelihood [12], [13]. Witt et al. combined the advantages of these works and presented the GOP method.…”
Section: Introduction (mentioning)
confidence: 99%
“…The duration parameter is then normalised by considering the mean duration of the syllable nuclei in the utterance. This is a standard technique for Rate-Of-Speech (ROS) normalisation, described, for example, in Neumeyer (1996) and Venkata Ramana (2000).…”
Section: Duration (mentioning)
confidence: 99%