Proceedings of the Tenth ACM International Conference on Multimedia - MULTIMEDIA '02 2002
DOI: 10.1145/641043.641070

Assessing face and speech consistency for monologue detection in video

Cited by 9 publications (20 citation statements)
References: 0 publications
“…Synchronization has been studied in both audio and video signals [6,1,19,17,5,7]. For instance, synchrony measures have been derived [17,7] for associating the movement of mouths to the oscillation of sound waves.…”
Section: Literature Review (mentioning)
confidence: 99%
“…They localized the speaker in the image by computing mutual information between the signals. This method has extended and has been applied to especially sound source localization problem [14] [15]. A limitation of the method is the assumption that the target does not move in the images.…”
Section: Sensor Integration Based On Evaluating Synchrony Between (mentioning)
confidence: 99%
“…It affects methods based on MI as well [13]. Hence, very small images O(50 × 50) have been commonly used [3,20,23,25], out of which only a few dozen features were selected by aggressive pruning or face detection steps (the latter limiting audio analysis to speech). In contrast, we seek localization of general unknown audiovisual sources, while handling intricate details and motion.…”
Section: Canonical Correlation: Limitations (mentioning)
confidence: 99%
“…A particularly interesting sensor combination involves visual motion in conjunction with associated audio. Activity in computer vision involving audio analysis has various research aspects [4,26], including lip reading [3,25], analysis and synthesis of music from motion [22], audio filtering based on motion [6], and source separation based on vision [14,17,20,23,27]. We note that physiological evidence and analysis of biological systems show that fusion of audio-visual information is used to enhance perception [9,12,16].…”
Section: Introduction (mentioning)
confidence: 99%