2012
DOI: 10.1587/transinf.e95.d.1206

Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features

Abstract: In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extrac…

Cited by 13 publications (2 citation statements)
References 20 publications (30 reference statements)

“…Previous works on topic segmentation have adopted sequence modelling approaches based on Hidden Markov Models (HMM) [13,17,22,29], and also discriminatively trained models including Conditional Random Fields (CRF) [30], deep feed forward neural network with HMM [29] and Support Vector Machines with sliding windows [26]. However, RNNs are better than fixed size windows and HMMs at exploiting contextual information [27].…”
Section: Introduction (mentioning)
confidence: 99%
“…Nicola et al [2] studied various lexical features, including language model features, sentence length features and syntax features, on different genres ranging from formal newspaper text to informal, dictated messages, and from written text to spoken transcript. Recent efforts have shown that speech prosody, especially pause and pitch related features, are informative indicators for structural events [1, 3,4,5] including sentence boundaries [6,7,8,9]. Research has shown that a decision tree (DT) model learned from prosodic features can achieve comparable performance with that learned from lexical features.…”
Section: Introduction (mentioning)
confidence: 99%