Automatic syllable detection is an important task when analysing very large speech corpora in order to answer questions concerning prosody, rhythm, speech rate, speech recognition and synthesis. In this paper a new method for automatic detection of syllable nuclei is presented. Two large spoken language corpora (PhonDatII, Verbmobil) were labelled by three phoneticians and then used to adjust the key parameters of the algorithm and to evaluate its error rate. Additionally, parts of the corpora were used to test the inter- and intraindividual consistency of the transcribers. The evaluation of the algorithm currently shows an error rate of 12.87% for read speech and 21.03% for spontaneous speech. The interindividual consistency of 95.8% might be considered as an upper limit for any automatic detection method.
This paper describes the approach to the assessment of the naturalness of synthetic speech taken within the ProSynth collaborative speech synthesis project. The view expressed is that an important aspect of naturalness is ease of understanding, and that this view provides a means of evaluating scientific hypotheses through perceptual tests. The premise within ProSynth is that listeners' processing of synthetic speech will be faster and more accurate when the signal includes phonetic fine detail that varies systematically with the linguistic structure. Four perceptual experiments are outlined which demonstrate both the application of the approach and the effectiveness of the basic principle.