Automatic sentence boundary detection in conversational speech: A cross-lingual evaluation on English and Czech

Kolář, Jáchym; Liu, Yang

doi:10.1109/icassp.2010.5494976

Cited by 9 publications

(4 citation statements)

References 12 publications

“…This work aims at classifying metadata structures based on the following set of very informative features derived from the literature: pause at the boundary, pitch declination over the sentence, post-boundary pitch and energy resets, pre-boundary lengthening, word duration, silent pauses, filled pauses, and presence of fragments. With this work we hope to contribute to the discussion of what are language and domain dependent effects in structural metadata evaluation (Kolár, Liu, & Shriberg, 2009;Kolár & Liu, 2010;Ostendorf et al, 2008;Shriberg et al, 2009).…”

Section: Related Work (mentioning)

confidence: 99%

Towards automatic language processing and intonational labeling in European Portuguese

Moniz, Batista, Mata, et al. 2016

Intonational Grammar in Ibero-Romance

This work describes a framework that encompasses multi-layered linguistic information, focusing on prosodic features (pitch, energy, and tempo patterns), uses such features to distinguish between sentence-form types and disfluency/fluency repairs, and contributes to the characterization of intonational patterns of spontaneous and prepared speech in European Portuguese. Different machine learning methods have been applied for discriminating between structural metadata events, both in university lectures and in map-task dialogues, containing large amounts of spontaneous speech. Results show that prosodic features, and particularly a set of very informative features, are crucial to distinguish between sentence-form types and disfluency/fluency repair events. This is the first work for European Portuguese on both fully automatic processing of multi-layered linguistic descriptions of spoken corpora and intonational labeling.

“…The proposed system provides an absolute word error rate (WER) reduction of 8.7%. Kolar and Liu [164] combine three statistical models-HMM, maximum entropy, and a boosting-based model BoosTexter. The result revealed that superior outcomes are achieved when all the three models are combined through posterior probability interpolation.…”

Section: Czech (mentioning)

confidence: 99%

Computational intelligence in processing of speech acoustics: a survey

Singh, Kaur, Kukreja, et al. 2022

Complex Intell. Syst.

Speech recognition of a language is a key area in the field of pattern recognition. This paper presents a comprehensive survey of speech recognition techniques for non-Indian and Indian languages, and compiles some of the computational models used for processing speech acoustics. An immense number of frameworks are available for speech processing and recognition for languages spoken around the globe. However, a limited number of automatic speech recognition systems are available for commercial use. There is a wide gap between the languages spoken around the globe and the technical support available for them. This paper examines major challenges for speech recognition for different languages. Analysis of the literature shows that the lack of standard databases for minority languages hinders speech recognition research across the globe. When compared with non-Indian languages, the research on speech recognition of Indian languages (except Hindi) has not achieved the expected milestone yet. A combination of MFCC features and a DNN–HMM classifier is the most commonly used system for developing ASR for minority languages, whereas for some of the majority languages, researchers are using much more advanced DNN algorithms. It has also been observed that the research in this field is quite thin and still more research needs to be carried out, particularly in the case of minority languages.
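
The citation statement above mentions combining three models "through posterior probability interpolation". A minimal sketch of that idea follows, assuming a simple weighted linear interpolation of per-word boundary posteriors. The function names, weights, threshold, and example probabilities are hypothetical illustrations and are not taken from Kolář and Liu's paper.

```python
def interpolate_posteriors(posteriors, weights):
    """Weighted linear interpolation of per-word boundary posteriors.

    posteriors: one list per model, each holding P(boundary) for every word.
    weights: one weight per model; assumed here to sum to 1.
    """
    if len(posteriors) != len(weights):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_words = len(posteriors[0])
    if any(len(p) != n_words for p in posteriors):
        raise ValueError("all models must score the same words")
    return [
        sum(w * p[i] for w, p in zip(weights, posteriors))
        for i in range(n_words)
    ]


def detect_boundaries(combined, threshold=0.5):
    """Mark a sentence boundary after each word whose posterior reaches the threshold."""
    return [p >= threshold for p in combined]


# Hypothetical posteriors from three models for a three-word stretch.
hmm = [0.9, 0.2, 0.6]
maxent = [0.8, 0.1, 0.3]
boosting = [0.7, 0.3, 0.45]

combined = interpolate_posteriors([hmm, maxent, boosting], [0.5, 0.25, 0.25])
boundaries = detect_boundaries(combined)
```

In practice the interpolation weights and the decision threshold would normally be tuned on held-out data; the values above are placeholders.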

“…Provided extensive training, language models (independent of prosody) can identify boundaries with F-scores of 0.70-0.75 [68,79]. Finally, language models and acoustic cues were successfully combined to identify full stops in spontaneous speech [41,79,80]. However, as compared to phrases, sentences often terminate more prominently and the smaller sentence to word ratio (large number of TN) bolsters the accuracy metric.…”

Section: Plos One (mentioning)

confidence: 99%

Automatic detection of prosodic boundaries in spontaneous speech

et al. 2021

Automatic speech recognition (ASR) and natural language processing (NLP) are expected to benefit from an effective, simple, and reliable method to automatically parse conversational speech. The ability to parse conversational speech depends crucially on the ability to identify boundaries between prosodic phrases. This is done naturally by the human ear, yet has proved surprisingly difficult to achieve reliably and simply in an automatic manner. Efforts to date have focused on detecting phrase boundaries using a variety of linguistic and acoustic cues. We propose a method which does not require model training and utilizes two prosodic cues that are based on ASR output. Boundaries are identified using discontinuities in speech rate (pre-boundary lengthening and phrase-initial acceleration) and silent pauses. The resulting phrases preserve syntactic validity, exhibit pitch reset, and compare well with manual tagging of prosodic boundaries. Collectively, our findings support the notion of prosodic phrases that represent coherent patterns across textual and acoustic parameters.
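
The citation statement for this paper notes that a large number of true negatives (TN) "bolsters the accuracy metric", while boundary detection is usually reported as an F-score. The sketch below illustrates the difference between the two metrics using the standard definitions of precision, recall, F1, and accuracy. The confusion counts are hypothetical and chosen only for illustration; they do not come from any of the cited papers.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall, and F1 for the boundary class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(tp, fp, fn, tn):
    """Fraction of all word positions labeled correctly, true negatives included."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical counts: 1000 word positions, of which 100 are true boundaries.
tp, fp, fn, tn = 70, 30, 30, 870

precision, recall, f1 = precision_recall_f1(tp, fp, fn)
acc = accuracy(tp, fp, fn, tn)
print(f"F1 = {f1:.2f}, accuracy = {acc:.2f}")
```

With these counts the F1 is 0.70 while accuracy is 0.94, because most word positions are non-boundaries and are trivially labeled correctly.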
