Proceedings of the 10th International Conference on Multimodal Interfaces 2008
DOI: 10.1145/1452392.1452446

A realtime multimodal system for analyzing group meetings by combining face pose tracking and speaker diarization

Cited by 71 publications (24 citation statements)
References 16 publications
“…These could be remote meeting participants in a teleconference situation or users of meeting archive systems ([28], [29], [30]). Because of the real-time constraint the most challenging is the use of these technologies by remote participants in an ongoing meeting.…”
Section: Discussion (mentioning)
confidence: 99%
“…The resolution is low and does not allow the analysis of fine details of participants' movements. In (Otsuka et al, 2008), two omnidirectional cameras with fish eye lenses are used. The system provides high resolution and 30 fps frame rate.…”
Section: The Emergent Leadership Synchronized Corpus (mentioning)
confidence: 99%
“…The orchestration engine produces then an orchestrated video chat by choosing at each point in time the perspective that best represents the social interaction based on decision-level rule-based fusion. In this context, TA2 presents several challenges: the results need to be computed in real-time with low affordable delay from spatially separated sensors (as opposed to other systems, such as [5,6,7], relying on collocated sensors) in open, unconstrained environment. Furthermore, the results are supposed to be localised in the image space to allow for a dynamic and seamless orchestrated video chat.…”
Section: Introduction (mentioning)
confidence: 99%