2006
DOI: 10.1007/11677482_11

The “FAME” Interactive Space

Cited by 8 publications (9 citation statements)
References 7 publications
“…Our speech detection system exposes performances that make it suitable for our projects and goals such as CHIL [11] or [12]. Actually, we do not want to use it for automatic speech recognition or diarization but for interaction and context modeling.…”
Section: Discussion
confidence: 99%
“…In these experiments, starting energy thresholds of the energy detector were values empirically defined during previous research projects (NESPOLE! [10] and FAME [11]). The evaluation metrics of our system are given in the three following tables.…”
Section: 4.2
confidence: 99%
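The citing paper describes its speech detector as a frame-energy detector whose starting thresholds were set empirically during the earlier NESPOLE! and FAME projects. As a rough, hypothetical sketch only (the function name, frame length, and threshold value below are assumptions, not details of the cited systems), a minimal energy-threshold speech activity detector could look like this:

```python
# Hypothetical sketch of an energy-threshold speech activity detector.
# Frame length and threshold are illustrative defaults, not values from
# the NESPOLE!/FAME systems cited above.
import numpy as np

def detect_speech(samples: np.ndarray, sample_rate: int,
                  frame_ms: float = 25.0,
                  energy_threshold: float = 1e-3):
    """Return (start_s, end_s) segments whose short-term energy exceeds the threshold."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len

    # Label each frame as speech if its mean squared amplitude exceeds the threshold.
    is_speech = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len].astype(np.float64)
        is_speech.append(float(np.mean(frame ** 2)) > energy_threshold)

    # Merge consecutive speech frames into (start, end) segments in seconds.
    segments, start = [], None
    for i, speech in enumerate(is_speech):
        if speech and start is None:
            start = i
        elif not speech and start is not None:
            segments.append((start * frame_len / sample_rate,
                             i * frame_len / sample_rate))
            start = None
    if start is not None:
        segments.append((start * frame_len / sample_rate,
                         n_frames * frame_len / sample_rate))
    return segments
```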
“…A second prototype, the FAME Interactive Space [Metze et al., 2006], provided access to recordings of lectures via a table top interface that accepted voice commands from a user. The M4 European project (MultiModal Meeting Manager, 2002) introduced a framework for the integration of multimodal data streams and for the detection of group actions [McCowan et al., 2003, 2005], and proposed solutions for multimodal tracking of the focus of attention of meeting participants, multimodal summarization, and multimodal information retrieval.…”
Section: Research on Multimodal Human Interaction Analysis
confidence: 99%
“…
Study                        Modalities                    Error Rate
Demirdjian et al. [25]       Vision & speech               0%
Demirdjian et al. [25]       Speech                        5%
Demirdjian et al. [25]       Vision                        8%
Morency et al. [61]          Gesture & dialog context      8%
Morency and Darrell [60]     Gestures & dialog state       9%
Quattoni et al. [67]         Vision & semantics            9%
Wang and Demirdjian [86]     Speech & gestures             12%
Webb et al. [87]             Speech & dialog state         17%
Metze et al. [59]            Speech, context, & gesture    17%
Morency et al. [61]          Gesture                       22%
Saenko et al. [72]           Vision                        34%
Eisenstein and Davis [31]    Linguistic context            34%
Bugmann [13]                 Speech                        40%

However, these techniques have not been applied to the same extent in Human…”
Section: HCI Techniques
confidence: 99%
“…Metze et al. describe an Augmented Table which allows several users at the same time to perform multi-modal, cross-lingual document retrieval of audio-visual documents [59]. The Augmented Table enhances multi-lingual speech recognition with context and a visual gesture recognition system, using tokens.…”
confidence: 99%