Romain Serizel scite author profile

This paper presents Task 4 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 challenge and provides a first analysis of the challenge results. The task is a follow-up to Task 4 of DCASE 2018 and involves training systems for large-scale detection of sound events using a combination of weakly labeled data, i.e. training labels without time boundaries, and strongly labeled synthesized data. We introduce the Domestic Environment Sound Event Detection (DESED) dataset, which mixes part of last year's dataset with an additional synthetic, strongly labeled dataset provided this year, which we describe in more detail. We also report the performance of the submitted systems on the official evaluation (test) and development sets as well as on several additional datasets. The best systems from this year outperform last year's winning system by about 10 percentage points in terms of F-measure.

CRNN-Based Multiple DoA Estimation Using Acoustic Intensity Features for Ambisonics Recordings

Perotin, Serizel, Vincent, et al. 2019. IEEE J. Sel. Top. Signal Process.

117

126

Localizing audio sources is challenging in real reverberant environments, especially when several sources are active. We propose to use a neural network built from stacked convolutional and recurrent layers to estimate the directions of arrival of multiple sources from a first-order Ambisonics recording. For a known number of sources, it returns the directions of arrival over a discrete grid. We propose to use features derived from the acoustic intensity vector as inputs. We analyze the behavior of the neural network by means of a visualization technique called layerwise relevance propagation. This analysis highlights which parts of the input signal are relevant in a given situation. We also conduct experiments to evaluate the performance of our system in various environments, from simulated rooms to real recordings, with one or two speech sources. The results show that the proposed features significantly improve performance with respect to raw Ambisonics inputs.
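As a rough illustration of the kind of input features mentioned in this abstract, the following Python sketch computes normalized active and reactive intensity vectors from the STFT of a first-order Ambisonics signal. The function name `intensity_features`, the channel order (W, X, Y, Z), the energy normalization, the sign convention, and the `eps` constant are all assumptions made for this example; the exact feature definition used in the paper may differ.

```python
import numpy as np

def intensity_features(foa_stft, eps=1e-12):
    """Sketch of intensity-based features for first-order Ambisonics.

    foa_stft: complex array of shape (4, F, T) holding the STFT of the
    W, X, Y, Z channels (channel order is an assumption of this example).
    Returns a real array of shape (6, F, T): three active-intensity
    components followed by three reactive-intensity components.
    """
    w = foa_stft[0]
    xyz = foa_stft[1:]

    # Cross-spectrum between the omnidirectional and directional channels.
    cross = np.conj(w)[None, :, :] * xyz

    # Per-bin energy used for normalization (assumed form).
    energy = np.abs(w) ** 2 + np.mean(np.abs(xyz) ** 2, axis=0)

    active = cross.real / (energy + eps)    # in-phase part
    reactive = cross.imag / (energy + eps)  # quadrature part
    return np.concatenate([active, reactive], axis=0)
```

For an idealized single plane wave, where X, Y and Z are real multiples of W, the reactive part vanishes and the active part is parallel to the direction vector, which is why such features carry direction information.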

Low-rank Approximation Based Multichannel Wiener Filter Algorithms for Noise Reduction with Application in Cochlear Implants

Serizel, Moonen, Dijk, et al. 2014. IEEE/ACM Trans. Audio Speech Lang. Process.

113

114

This paper presents low-rank approximation based multichannel Wiener filter algorithms for noise reduction in speech plus noise scenarios, with application in cochlear implants. In a single speech source scenario, the frequency-domain autocorrelation matrix of the speech signal is often assumed to be a rank-1 matrix, which allows different rank-1 approximation based noise reduction filters to be derived. In practice, however, the rank of the autocorrelation matrix of the speech signal is usually greater than one. First, the link between the different rank-1 approximation based noise reduction filters and the original speech distortion weighted multichannel Wiener filter is investigated when the rank of the autocorrelation matrix of the speech signal is indeed greater than one.
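The idea of combining a Wiener filter with a rank-1 approximation can be sketched for a single frequency bin as follows. This is a minimal illustration that assumes the commonly used speech distortion weighted form w = (R_s + mu R_n)^{-1} R_s e_ref and an eigenvalue-based rank-1 approximation of the speech autocorrelation matrix. The function names, the trade-off parameter `mu`, and the choice of approximation are assumptions for this example and are not necessarily the variants studied in the paper.

```python
import numpy as np

def sdw_mwf(R_s, R_n, mu=1.0, ref=0):
    """Speech distortion weighted MWF for one frequency bin (assumed form).

    R_s, R_n: Hermitian speech and noise autocorrelation matrices (M x M).
    mu: trade-off between noise reduction and speech distortion.
    ref: index of the reference microphone.
    """
    # Solve (R_s + mu * R_n) w = R_s e_ref without forming an explicit inverse.
    return np.linalg.solve(R_s + mu * R_n, R_s[:, ref])

def rank1_approx(R_s):
    """Rank-1 approximation of R_s from its principal eigenpair."""
    vals, vecs = np.linalg.eigh(R_s)  # eigenvalues in ascending order
    v = vecs[:, -1]
    return vals[-1] * np.outer(v, v.conj())
```

When the estimated speech matrix really is rank-1, `sdw_mwf(rank1_approx(R_s), R_n)` and `sdw_mwf(R_s, R_n)` coincide; the two filters differ only when the rank is greater than one, which is the situation the abstract focuses on.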

Feature Learning With Matrix Factorization Applied to Acoustic Scene Classification

Bisot, Serizel, Essid, et al. 2017. IEEE/ACM Trans. Audio Speech Lang. Process.

Sound Event Detection in Domestic Environments with Weakly Labeled Data and Soundscape Synthesis