Detection and Classification of Acoustic Scenes and Events

Stowell, Dan; Giannoulis, Dimitrios; Benetos, Emmanouil; Lagrange, Mathieu; Plumbley, Mark D.

doi:10.1109/tmm.2015.2428998

Cited by 457 publications

(353 citation statements)

References 36 publications

Citation statements by type: mentioning 332


“…For constructing the pre-extracted dictionary P(f | q, c, s), the IEEE DCASE Event Detection training dataset was used [7,1]. The dataset contains isolated sounds recorded in an office environment at Queen Mary University of London, and covers 16 event classes (s ∈ {1, ..., 16}): alert, clearing throat, cough, door slam, drawer, keyboard click, keys, door knock, laughter, mouse click, page turn, pen drop, phone, printer, speech, and switch.…”

Section: Training Data (mentioning)

confidence: 99%

“…In addition, 3 monophonic datasets (1 real and 2 synthesized) of office sounds were also used, for comparative purposes. On the polyphonic datasets, firstly the test dataset for the IEEE DCASE Office Synthetic (OS) challenge was used [1]. The dataset contains 12 recordings of 2min duration each, with 3 different event density levels (low, mid, high) and 3 different SNR levels (-6dB,…”

Section: Test Data (mentioning)

confidence: 99%
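
To make the idea of fixed SNR levels in this statement concrete, here is a minimal sketch of scaling a background signal so that an event-to-background mixture reaches a target SNR. It is only an illustration under stated assumptions: the function names, the plain-list signal representation, and the mean-square power definition are all hypothetical and are not taken from the DCASE synthesizer.

```python
import math
from typing import List


def mean_power(signal: List[float]) -> float:
    """Mean-square power of a signal (assumed power definition)."""
    return sum(x * x for x in signal) / len(signal)


def scale_background(event: List[float], background: List[float],
                     snr_db: float) -> List[float]:
    """Scale `background` so that event power / background power equals snr_db."""
    target_ratio = 10.0 ** (snr_db / 10.0)  # linear power ratio
    gain = math.sqrt(mean_power(event) / (mean_power(background) * target_ratio))
    return [gain * x for x in background]


def measured_snr_db(event: List[float], background: List[float]) -> float:
    """SNR in dB between an event and a background signal."""
    return 10.0 * math.log10(mean_power(event) / mean_power(background))


# Hypothetical toy signals, not real recordings.
event = [1.0, -1.0, 1.0, -1.0]
background = [0.5, -0.5, 0.5, -0.5]
scaled = scale_background(event, background, snr_db=-6.0)
mixture = [e + b for e, b in zip(event, scaled)]
```

At a negative SNR such as -6 dB the scaled background carries more power than the event, which is what makes the low-SNR recordings harder for a detector.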

“…For comparative purposes, 3 monophonic datasets of office sounds were used. Firstly, the Office Live (OL) dataset from the DCASE challenge was used [1], which contains 11 scripted recordings of event sequences recorded at Queen Mary University of London. The second and third monophonic datasets were generated using the acoustic scene synthesizer of [20], and each include 22 recordings of variable duration (1-3min), using as basis the annotations for the OL dataset.…”

Section: Test Data (mentioning)

confidence: 99%

“…For evaluation, the same set of metrics used for the IEEE DCASE event detection tasks was used [1]. Specifically, 3 different metrics are used: frame-based, event-based, and class-wise event-based.…”

Section: Metrics (mentioning)

confidence: 99%
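
As a rough illustration of the frame-based metric named in this statement, the sketch below rasterises annotated and detected events onto fixed-length frames and computes precision, recall, and F-measure over (frame, class) pairs. The 10 ms frame length, the `(onset, offset, label)` tuple layout, and the function names are assumptions made for this example; it is not the DCASE evaluation code.

```python
from typing import List, Set, Tuple

Event = Tuple[float, float, str]  # (onset_s, offset_s, label): assumed layout


def active_frames(events: List[Event], frame_ms: int = 10) -> Set[Tuple[int, str]]:
    """Return the set of (frame_index, label) pairs covered by the events."""
    active = set()
    for onset, offset, label in events:
        first = int(round(onset * 1000)) // frame_ms
        last = -(-int(round(offset * 1000)) // frame_ms)  # ceiling division
        for idx in range(first, last):
            active.add((idx, label))
    return active


def frame_based_scores(reference: List[Event], estimated: List[Event],
                       frame_ms: int = 10) -> Tuple[float, float, float]:
    """Precision, recall and F-measure over active (frame, label) pairs."""
    ref = active_frames(reference, frame_ms)
    est = active_frames(estimated, frame_ms)
    correct = len(ref & est)
    precision = correct / len(est) if est else 0.0
    recall = correct / len(ref) if ref else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical annotation and detection that overlap for half a second.
reference = [(0.0, 1.0, "cough")]
estimated = [(0.5, 1.5, "cough")]
scores = frame_based_scores(reference, estimated)
```

Event-based and class-wise event-based variants would instead match whole events (for example by onset within a tolerance) and, for the class-wise case, average the scores per class; those are not shown here.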

“…The main goal of the aforementioned task is to label temporal regions within an audio recording, resulting in a symbolic description with start and end times, as well as labels for each instance of a specific event type [1]. Applications for acoustic event detection are numerous, including but not limited to security and surveillance, urban planning, smart homes, acoustic ecology, and organisation/navigation of sound archives [1,2,3,4].…”

Section: Introduction (mentioning)

confidence: 99%
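
The "symbolic description with start and end times, as well as labels" in this statement can be pictured as a list of labelled time intervals. The sketch below is one possible representation, with a helper that reports how many events overlap at most; the `SoundEvent` class, its field names, and `max_polyphony` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SoundEvent:
    onset: float   # start time in seconds
    offset: float  # end time in seconds
    label: str     # event class, e.g. "door knock"


def max_polyphony(events: List[SoundEvent]) -> int:
    """Largest number of events that are active at the same time."""
    boundaries = []
    for ev in events:
        boundaries.append((ev.onset, 1))    # an event starts
        boundaries.append((ev.offset, -1))  # an event ends
    # At equal times, ends (-1) sort before starts (+1), so events that
    # merely touch are not counted as overlapping.
    boundaries.sort()
    active = peak = 0
    for _, delta in boundaries:
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical annotation of a short office recording.
annotation = [
    SoundEvent(0.0, 2.0, "door knock"),
    SoundEvent(1.0, 3.0, "speech"),
    SoundEvent(3.0, 4.0, "cough"),
]
```

A recording whose `max_polyphony` is 1 is monophonic in the sense used by the statements above, and anything larger is polyphonic.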


Detection of overlapping acoustic events using a temporally-constrained probabilistic model

Benetos, Lafay, Lagrange et al. 2016

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self cite

In this paper, a system for overlapping acoustic event detection is proposed, which models the temporal evolution of sound events. The system is based on probabilistic latent component analysis, supporting the use of a sound event dictionary where each exemplar consists of a succession of spectral templates. The temporal succession of the templates is controlled through event class-wise Hidden Markov Models (HMMs). As input time/frequency representation, the Equivalent Rectangular Bandwidth (ERB) spectrogram is used. Experiments are carried out on polyphonic datasets of office sounds generated using an acoustic scene simulator, as well as real and synthesized monophonic datasets for comparative purposes. Results show that the proposed system outperforms several state-of-the-art methods for overlapping acoustic event detection on the same task, using both frame-based and event-based metrics, and is robust to varying event density and noise levels.


Multichannel Source Activity Detection, Localization, and Tracking

Pertilä, Brutti, Svaizer et al. 2018

Audio Source Separation and Speech Enhancement

Environmental sound processing and its applications

Miyazaki, Toda, Hayashi et al. 2019

IEEJ Transactions Elec Engng

As part of the effort to develop techniques for understanding environments using sound, many studies in the field of computational auditory scene analysis have focused on using computers to perform functions carried out naturally by the human auditory system. Thanks to recent progress in machine‐learning techniques, these environmental sound‐processing techniques have significantly improved and a widening variety of applications has resulted in considerable interest in this field. In this review, we introduce the fundamental techniques of environmental sound processing, as well as recent advances in front‐end and back‐end processing and potential applications for these techniques. Prospects for further progress in the field of environmental sound processing and the challenges still to be overcome are also discussed. © 2019 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
