Multi speaker detection and tracking using audio and video sensors with gesture analysis

Hariharan, Balaji; Hari, S. Sri; Gopalakrishnan, Uma

doi:10.1109/wocn.2013.6616222

Cited by 4 publications

(1 citation statement)

References 2 publications

“…The detection of active speakers has been used for automatic editing of classroom video. Hariharan et al [5] use a microphone array to localize a questioner with a Time Difference of Arrival (TDOA) algorithm, and combine this with video detection of the raised hand of the questioner.…”

Section: Introduction (mentioning)

confidence: 99%

Who's Speaking?

Chakravarty, Mirzaei, Tuytelaars, et al. 2015

Proceedings of the 2015 ACM on International Conference on Multimodal Interaction

Active speakers have traditionally been identified in video by detecting their moving lips. This paper demonstrates the same using spatio-temporal features that aim to capture other cues: movement of the head, upper body and hands of active speakers. Speaker directional information, obtained using sound source localization from a microphone array, is used to supervise the training of these video features.
