2015
DOI: 10.1109/tmm.2015.2428998

Detection and Classification of Acoustic Scenes and Events

Abstract: For intelligent systems to make best use of the audio modality, it is important that they can recognize not just speech and music, which have been researched as specific tasks, but also general sounds in everyday environments. To stimulate research in this field we conducted a public research challenge: the IEEE Audio and Acoustic Signal Processing Technical Committee challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). In this paper, we report on the state of…


Cited by 457 publications (353 citation statements)
References 36 publications
“…For constructing the pre-extracted dictionary P (f |q, c, s), the IEEE DCASE Event Detection training dataset was used [7,1]. The dataset contains isolated sounds recorded in an office environment at Queen Mary University of London, and covers 16 event classes (s ∈ {1, ..., 16}): alert, clearing throat, cough, door slam, drawer, keyboard click, keys, door knock, laughter, mouse click, page turn, pen drop, phone, printer, speech, and switch.…”
Section: Training Data
Confidence: 99%
“…In addition, 3 monophonic datasets (1 real and 2 synthesized) of office sounds were also used, for comparative purposes. On the polyphonic datasets, firstly the test dataset for the IEEE DCASE Office Synthetic (OS) challenge was used [1]. The dataset contains 12 recordings of 2min duration each, with 3 different event density levels (low, mid, high) and 3 different SNR levels (-6dB,…”
Section: Test Data
Confidence: 99%