2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP) 2014
DOI: 10.1109/chinasip.2014.6889197

Sentence boundary detection in Chinese broadcast news using conditional random fields and prosodic features

Abstract: In this paper, we explore the use of prosodic features in sentence boundary detection in Chinese broadcast news. The prosodic features include speaker turn, music, pause duration, pitch, energy and speaking rate. Specifically, considering the Chinese tonal effects in pitch trajectory, we propose to use tone-normalized pitch features. Experiments using decision trees demonstrate that the tone-normalized pitch features show superior performance in sentence boundary detection in Chinese broadcast news. Furthermor…
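
The abstract does not say how the tone-normalized pitch features are computed, so the following is only an illustrative sketch of one plausible reading: each syllable's pitch is expressed relative to the typical pitch of its lexical tone, so that tone-driven variation is factored out before a classifier looks for boundary cues. The function name `tone_normalize`, the `(tone, mean_f0_hz)` input layout, and the choice of a per-tone z-score are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
from statistics import mean, pstdev


def tone_normalize(syllables):
    """Return per-syllable pitch z-scores computed within each tone class.

    `syllables` is a list of (tone, mean_f0_hz) pairs. Both the input layout
    and the z-score normalization are assumptions for illustration only.
    """
    # Group F0 values by lexical tone label.
    by_tone = defaultdict(list)
    for tone, f0 in syllables:
        by_tone[tone].append(f0)

    # Per-tone mean and population standard deviation.
    stats = {tone: (mean(vals), pstdev(vals)) for tone, vals in by_tone.items()}

    normalized = []
    for tone, f0 in syllables:
        mu, sigma = stats[tone]
        # A tone class with no spread carries no relative information here.
        normalized.append((f0 - mu) / sigma if sigma > 0 else 0.0)
    return normalized


if __name__ == "__main__":
    # Hypothetical values: two tone-1 syllables and two tone-4 syllables.
    example = [(1, 200.0), (1, 220.0), (4, 150.0), (4, 170.0)]
    print(tone_normalize(example))
```

Under this reading, a syllable that is unusually low for its tone stands out even when its raw pitch is higher than that of a neighbouring syllable with a different tone, which is the kind of cue a boundary classifier could then use.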

Cited by 6 publications (3 citation statements)
References 20 publications
“…Many works (e.g., Stolcke and Shriberg, 1996, Wang and Narayanan, 2004, and Liu et al, 2006) use the Switchboard corpus to identify syntactically-based prosodic boundaries in telephone conversations between strangers, using orthographic inputs and/or manually crafted acoustic features. Xu et al (2014) applies pause, pitch, energy, and duration information to a similar task in spoken Mandarin. More recent work has pursued integrated approaches that consider Speech-To-Text (STT) transcription and segmentation simultaneously, but still have not focused on IU boundaries in conversational speech.…”
Section: Introduction
confidence: 99%
“…Nicola et al [2] studied various lexical features, including language model features, sentence length features and syntax features, on different genres ranging from formal newspaper text to informal, dictated messages, and from written text to spoken transcript. Recent efforts have shown that speech prosody, especially pause and pitch related features, are informative indicators for structural events [1, 3, 4, 5] including sentence boundaries [6, 7, 8, 9]. Research has shown that a decision tree (DT) model learned from prosodic features can achieve comparable performance with that learned from lexical features.…”
Section: Introduction
confidence: 99%