2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010
DOI: 10.1109/icassp.2010.5494976

Automatic sentence boundary detection in conversational speech: A cross-lingual evaluation on English and Czech

Abstract: Automatic sentence segmentation of speech is important for enriching speech recognition output and aiding downstream language processing. This paper focuses on automatic sentence segmentation of speech in two different languages, English and Czech. For this task, we compare and combine three statistical models: HMM, maximum entropy, and a boosting-based model, BoosTexter. All these approaches rely on both textual and prosodic information. We evaluate these methods on a corpus of multiparty meetings in English, …

Cited by 9 publications (4 citation statements)
References 12 publications

“…This work aims at classifying metadata structures based on the following set of very informative features derived from the literature: pause at the boundary, pitch declination over the sentence, post-boundary pitch and energy resets, pre-boundary lengthening, word duration, silent pauses, filled pauses, and presence of fragments. With this work we hope to contribute to the discussion of what are language and domain dependent effects in structural metadata evaluation (Kolár, Liu, & Shriberg, 2009;Kolár & Liu, 2010;Ostendorf et al, 2008;Shriberg et al, 2009).…”
Section: Related Work (mentioning)
confidence: 99%
“…The proposed system provides an absolute word error rate (WER) reduction of 8.7%. Kolar and Liu [164] combine three statistical models: HMM, maximum entropy, and a boosting-based model BoosTexter. The result revealed that superior outcomes are achieved when all the three models are combined through posterior probability interpolation.…”
Section: Czech (mentioning)
confidence: 99%
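
The combination step mentioned in the statement above can be illustrated with a minimal sketch. It assumes the interpolation is linear with fixed weights that sum to one; the statement does not say how the weights were chosen, and the function names, weights, and threshold below are hypothetical, not taken from the paper.

```python
from typing import List, Sequence


def interpolate_posteriors(
    posteriors: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Linearly interpolate boundary posteriors from several models.

    posteriors[m][i] is model m's probability that a sentence boundary
    follows word i. The weights are hypothetical and must sum to 1.
    """
    if len(posteriors) != len(weights):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_words = len(posteriors[0])
    if any(len(p) != n_words for p in posteriors):
        raise ValueError("all models must score the same word positions")
    return [
        sum(w * p[i] for w, p in zip(weights, posteriors))
        for i in range(n_words)
    ]


def detect_boundaries(combined: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Mark a boundary wherever the combined posterior reaches the threshold."""
    return [p >= threshold for p in combined]


if __name__ == "__main__":
    # Made-up posteriors for three word positions from three models.
    hmm = [0.9, 0.2, 0.6]
    maxent = [0.8, 0.1, 0.4]
    boosting = [0.7, 0.3, 0.5]
    combined = interpolate_posteriors([hmm, maxent, boosting], [0.4, 0.3, 0.3])
    print(combined, detect_boundaries(combined))
```

In practice the weights and the decision threshold would normally be tuned on held-out data rather than fixed by hand as they are here.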
“…Provided extensive training, language models (independent of prosody) can identify boundaries with F-scores of 0.70-0.75 [68,79]. Finally, language models and acoustic cues were successfully combined to identify full stops in spontaneous speech [41,79,80]. However, as compared to phrases, sentences often terminate more prominently and the smaller sentence to word ratio (large number of TN) bolsters the accuracy metric.…”
Section: Plos One (mentioning)
confidence: 99%
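
The point about true negatives (TN) inflating accuracy can be made concrete with a small sketch. The confusion-matrix counts in the usage example are invented for illustration and do not come from any of the cited papers; `boundary_metrics` is a hypothetical helper name.

```python
from typing import Dict


def boundary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from boundary counts.

    tp: boundaries correctly detected; fp: spurious boundaries;
    fn: missed boundaries; tn: non-boundary positions correctly left alone.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Illustrative counts only: 1000 word positions, 100 true boundaries.
    print(boundary_metrics(tp=70, fp=30, fn=30, tn=870))
```

Because most word positions are not sentence boundaries, TN dominates the total, so accuracy can look high even when precision and recall on the boundaries themselves are moderate. The F-score ignores TN and is therefore less affected by this imbalance.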