2007
DOI: 10.1007/s10579-007-9054-4

The CHIL audiovisual corpus for lecture and meeting analysis inside smart rooms

Abstract: The analysis of lectures and meetings inside smart rooms has recently attracted much interest in the literature, being the focus of international projects and technology evaluations. A key enabler for progress in this area is the availability of appropriate multimodal and multi-sensory corpora, annotated with rich human activity information during lectures and meetings. This paper is devoted to exactly such a corpus, developed…

Note: Ambrish Tyagi contributed to this work during two summer internships with the IBM T. J. Watson Research Center.

Cited by 71 publications (42 citation statements)
References 3 publications
“…The application of speaker diarization to the meeting domain triggered the need for dealing with multiple microphones which are often used to record the same meeting from different locations in the room [35]–[37]. The microphones can have different characteristics: wall-mounted microphones (intended for speaker localization), lapel microphones, desktop microphones positioned on the meeting room table or microphone arrays.…”
Section: A. Acoustic Beamforming (mentioning; confidence: 99%)
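The excerpt above refers to combining multiple distant microphones via acoustic beamforming. As a minimal illustration, here is a delay-and-sum beamformer sketch in Python; the array size, sample rate, and delay values are illustrative assumptions, not details from the CHIL corpus or the cited systems.

```python
# Minimal delay-and-sum beamformer sketch (NumPy). The geometry, sample
# rate, and delay values below are illustrative assumptions, not details
# of the CHIL recordings or the systems cited above.
import numpy as np

def delay_and_sum(channels: np.ndarray, delays_s: np.ndarray, fs: int) -> np.ndarray:
    """Time-align each microphone channel and average them.

    channels : (n_mics, n_samples) array of synchronized recordings.
    delays_s : per-channel delays in seconds, e.g. from a TDOA estimator
               such as GCC-PHAT (hypothetical upstream step).
    fs       : sampling rate in Hz.
    """
    n_mics, n_samples = channels.shape
    out = np.zeros(n_samples)
    for ch, delay in zip(channels, delays_s):
        shift = int(round(delay * fs))   # delay converted to whole samples
        # np.roll wraps at the edges; acceptable for a sketch, whereas real
        # systems would use fractional-delay filtering instead.
        out += np.roll(ch, -shift)
    return out / n_mics                  # averaging reinforces the aligned source

# Hypothetical usage: 4 microphones, 1 s of 16 kHz audio, made-up delays.
fs = 16000
mics = np.random.randn(4, fs)            # stand-in for real recordings
tdoa = np.array([0.0, 1.2e-4, 2.5e-4, 3.1e-4])
enhanced = delay_and_sum(mics, tdoa, fs)
```

Summing the time-aligned channels reinforces the target speaker while averaging down uncorrelated noise, which is why this family of methods became standard in the multi-microphone meeting setups the excerpt describes.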
“…The performance of the proposed approach has been evaluated and compared on the acoustic event recognition task in CHIL2007 [1], a database of seminar recordings in which 12 acoustic event classes appear besides speech (around 60% of the acoustic events are reportedly overlapped with speech). The experimental conditions are summarized in Table 1.…”
Section: Experimental Evaluation (mentioning; confidence: 99%)
“…The goal is to process a continuous acoustic signal and convert it into a sequence of event labels with associated start and end times. Rich transcription in speech communication [1], [2] and scene understanding [3], [4] benefit from it, and informed speech enhancement and automatic speech recognition (ASR) systems could also exploit it as a source of information. Recent hands-free meeting analysis systems already include simple event detection components in order to differentiate speech from laughter [5], but achieving richer acoustic event recognition could effectively support speech detection and informed speech enhancement [6] by providing a detailed description of the surrounding noises, besides the obvious benefits of richer transcriptions.…”
Section: Introduction (mentioning; confidence: 99%)
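The excerpt above characterizes acoustic event detection as mapping a continuous signal to event labels with start and end times. Below is a minimal sketch of just the final post-processing step: collapsing hypothetical frame-wise classifier outputs into (label, start, end) tuples. The label names and the 10 ms frame hop are assumptions for illustration, not values from the cited work.

```python
# Minimal sketch of the post-processing stage: collapsing frame-wise
# classifier decisions into (label, start_time, end_time) events. The
# 10 ms frame hop and the label names are illustrative assumptions.
from itertools import groupby

def frames_to_events(frame_labels, hop_s=0.010):
    """Merge runs of identical frame labels into timed event segments."""
    events, t = [], 0.0
    for label, run in groupby(frame_labels):
        dur = sum(1 for _ in run) * hop_s
        if label != "silence":            # keep only foreground events
            events.append((label, t, t + dur))
        t += dur
    return events

# Hypothetical output of a per-frame acoustic event classifier:
frames = ["silence"] * 10 + ["speech"] * 50 + ["door_slam"] * 8 + ["speech"] * 30
print(frames_to_events(frames))
# -> [('speech', 0.1, 0.6), ('door_slam', 0.6, 0.68), ('speech', 0.68, 0.98)]
#    (up to floating-point rounding)
```

A real detector would precede this with feature extraction and per-frame classification, and typically add smoothing so that single misclassified frames do not fragment an event; the run-merging shown here is only the segmentation step the excerpt describes.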
“…These have typically focussed on genuine distant microphone speech recognition scenarios, but ones that feature relatively benign environments where high SNRs can be expected, e.g., meeting rooms (Renals et al., 2008; RWCP, 2001) and lecture halls (Mostefa et al., 2007). These evaluations have featured multichannel signal recordings and have led to the development of highly effective microphone array processing strategies that can be used as part of the speech recognition tool chain.…”
Section: Introduction (mentioning; confidence: 99%)