Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
DOI: 10.1109/icslp.1996.607054

Estimating the quality of phonetic transcriptions and segmentations of speech signals

Cited by 30 publications
(22 citation statements)
References 3 publications
“…These differences are also similar to those typically obtained between human transcribers. For instance, Wesenick and Kipp (1996) reported that 96% of the segment boundaries determined by three human transcribers for 64 read sentences of German differed by less than 20 ms. Similarly, Raymond and colleagues (Raymond, Pitt, Johnson, Hume, Makashay, Dautricourt, & Hilts, 2002) observed an average alignment difference of 16.4 ms between human transcribers of the Buckeye corpus.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…We automatically created broad phonetic transcriptions for all speech in the corpus, using the transcriber and procedure described in Schuppler et al (2011). This automatic transcriber is based on the Hidden Markov Toolkit HTK (Young et al, 2002), and received as its input the acoustic signals and the corresponding orthographic transcriptions. The recognizer selected for each word in the orthographic transcriptions the pronunciation variant that best matched the acoustic signal, choosing from a lexicon containing for every word both the full pronunciation and several reduced pronunciation variants.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…These average differences are similar to those obtained in other studies. For example, Wesenick and Kipp (1996) reported that 96% of the segment boundaries determined by three human transcribers for 64 read sentences of German differ by less than 20 ms. Similarly, Raymond et al (2002) presented an average alignment difference of 16.4 ms between human transcribers of the Buckeye corpus.…”
Section: Measurements; citation type: mentioning
confidence: 99%
“…It should be noted that several studies have pointed out the high degree of inter-rater disagreement, with sometimes large discrepancies between human-made alignments [37]. Generally, results are provided within a certain tolerance threshold on the timing error.…”
Section: B Evaluation Metrics; citation type: mentioning
confidence: 97%