Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)
DOI: 10.1109/icip.2001.958129

Automatic multi-modal dialogue scene indexing

Abstract: An automatic algorithm for indexing dialogue scenes in multimedia content is proposed. The content is segmented into dialogue scenes using the state transitions of a hidden Markov model (HMM). Each shot is classified using both audio and visual information to determine the state/scene transitions for this model. Face detection and silence/speech/music classification are the basic tools which are utilized to index the scenes. While face information is extracted after applying some heuristics to skin-colored reg…
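To make the idea in the abstract concrete, the following is a minimal sketch of how per-shot audio-visual labels could be decoded into scene states with an HMM, with scene boundaries read off the state transitions. It is not the authors' model: the two states, every probability, and the function names `viterbi` and `scene_boundaries` are assumptions chosen for illustration. Each shot is assumed to be already labeled with a face flag and one of the silence/speech/music classes.

```python
import math

# Hypothetical two-state model; the paper's actual states and topology may differ.
STATES = ("dialogue", "non_dialogue")

# All probabilities below are illustrative assumptions, not values from the paper.
START = {"dialogue": 0.5, "non_dialogue": 0.5}
TRANS = {
    "dialogue": {"dialogue": 0.8, "non_dialogue": 0.2},
    "non_dialogue": {"dialogue": 0.2, "non_dialogue": 0.8},
}
# Emission probabilities P(shot label | state), where a shot label is
# (face flag, audio class) as produced by some upstream shot classifier.
EMIT = {
    "dialogue": {
        ("face", "speech"): 0.70, ("face", "silence"): 0.10,
        ("face", "music"): 0.05, ("no_face", "speech"): 0.10,
        ("no_face", "silence"): 0.03, ("no_face", "music"): 0.02,
    },
    "non_dialogue": {
        ("face", "speech"): 0.10, ("face", "silence"): 0.10,
        ("face", "music"): 0.15, ("no_face", "speech"): 0.10,
        ("no_face", "silence"): 0.20, ("no_face", "music"): 0.35,
    },
}


def viterbi(shots):
    """Return the most likely state sequence for a list of shot labels."""
    if not shots:
        return []
    # Log-probabilities avoid numerical underflow on long shot sequences.
    score = {s: math.log(START[s]) + math.log(EMIT[s][shots[0]]) for s in STATES}
    back = []
    for obs in shots[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for landing in state s at this shot.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][obs]))
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def scene_boundaries(path):
    """Indices of shots at which the decoded state changes."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


shots = [("face", "speech")] * 3 + [("no_face", "music")] * 3
path = viterbi(shots)
print(path)
print(scene_boundaries(path))
```

In this sketch a scene boundary is simply a shot index where the decoded state changes, which mirrors the abstract's use of state transitions for segmentation.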

Cited by 2 publications (4 citation statements)
References 8 publications
“…In such dialogue region, frames occur repetitively, and we could use dialogue detection to compress such dialogue frames. The method proposed in [1] is selected to detect dialogue frames.…”
Section: Dialogue Detection
Citation type: mentioning
confidence: 99%
“…The false hit rate determines the ratio of falsely detected scene boundaries to the number of all shot boundaries. Finally, Alatan et al [2,3,4] employ the shot accuracy measure, which is defined as the ratio of correct shot assignments to the total number of shots.…”
Section: Figures Of Merit and Movie Datasets
Citation type: mentioning
confidence: 99%
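The two figures of merit described in the statement above can be sketched as follows. This is an illustration of the quoted definitions only; the function names, argument shapes, and the choice to represent boundaries as shot indices are assumptions, not taken from the cited works.

```python
def shot_accuracy(predicted, reference):
    """Ratio of correct shot assignments to the total number of shots."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one shot is required")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


def false_hit_rate(detected, true, num_shot_boundaries):
    """Ratio of falsely detected scene boundaries to all shot boundaries."""
    if num_shot_boundaries <= 0:
        raise ValueError("num_shot_boundaries must be positive")
    # A false hit is a detected boundary that is not a true scene boundary.
    false_hits = len(set(detected) - set(true))
    return false_hits / num_shot_boundaries


print(shot_accuracy(["d", "d", "n", "n"], ["d", "n", "n", "n"]))
print(false_hit_rate({2, 5}, {5}, 10))
```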
“…Some methods are extensions of those described in Section 2.4, incorporating the information contained in both the video and the audio channels. The techniques for audiovisual dialogue and action scene detection are classified as deterministic [10,26,29,53] and probabilistic [4,2,3,28,50], like in Section 2.4. While the deterministic methods usually cluster consecutive shots by utilizing appropriate measures, most probabilistic approaches use HMMs representing the semantic events in their states.…”
Section: Audiovisual Dialogue and Action Scene Detection
Citation type: mentioning
confidence: 99%