2006
DOI: 10.1007/11677482_5

Multimodal Integration for Meeting Group Action Segmentation and Recognition

Abstract: We address the problem of segmentation and recognition of sequences of multimodal human interactions in meetings. These interactions can be seen as a rough structure of a meeting, and can be used either as input for a meeting browser or as a first step towards a higher semantic analysis of the meeting. A common lexicon of multimodal group meeting actions, a shared meeting data set, and a common evaluation procedure enable us to compare the different approaches. We compare three different multimodal feature set…

citations
Cited by 16 publications
(12 citation statements)
references
References 13 publications
“…The prosodic features that we used in the generative model (with the exception of F0 variance) were discretised and used in conjunction with the lexical information during the CRF re-labeling process, implemented with CRF++ 4 . Table VI reports the recognition performances after discriminative re-classification.…”
Section: Discriminative Re-classification Of Joint Recognition Ou
mentioning
confidence: 99%
“…Example applications have included automatic summarisation [1], topic segmentation and labelling [2], [3], group action detection [4], [5], [6], participant influence [7], and dialog structure annotation [8]. The reliable recognition of the DA sequence in a meeting, and the resulting knowledge of the discourse structure, plays an important role in the development of such applications.…”
mentioning
confidence: 99%
“…In this work, the sub-actions have no obvious interpretation, and their number is a model parameter learned during training or set by hand, which makes the structure of the models more difficult to interpret. An initial comparison of various recognition models on the same task, including the layered HMM, the multilevel DBN, and other approaches, was presented by Al-Hames et al [1].…”
Section: Turn-taking Patterns
mentioning
confidence: 99%
“…Third, the best I-HMM model is the asynchronous HMM (a model that explicitly accounts for variations of alignment between two data streams), which suggests that some asynchrony exists for the defined group actions, and that such asynchrony is reasonably captured by the model. A recent comparison of the layered HMM and other models on the same task appears in [1]. …”
Section: Modeling Group Interaction With Layers
mentioning
confidence: 99%
“…testify. Very broadly speaking, one can identify three different - fortunately not exclusive - positions regarding what is relevant about meeting collections: (1) what they are; (2) what can be done with them; and (3) what needs to be done with them. The first view acknowledges that meetings by themselves are relevant insofar as they constitute an expression of human interaction, the field of study of more than one branch of science [6,25].…”
Section: Introduction
mentioning
confidence: 99%