This paper describes Asteroid, the PyTorch-based audio source separation toolkit for researchers. Inspired by the most successful neural source separation systems, it provides all the neural building blocks required to build such systems. To improve reproducibility, Kaldi-style recipes on common audio source separation datasets are also provided. We present the software architecture of Asteroid and its most important features. Experimental results obtained with Asteroid's recipes show that our implementations are at least on par with most results reported in reference papers. The toolkit is publicly available at github.com/mpariente/asteroid.
Ambient sound scenes typically comprise multiple short events occurring on top of a somewhat stationary background. We consider the task of separating these events from the background, which we call foreground-background ambient sound scene separation. We propose a deep learning-based separation framework with a suitable feature normalization scheme and an optional auxiliary network capturing the background statistics, and we investigate its ability to handle the great variety of sound classes encountered in ambient sound scenes, which have often not been seen in training. To do so, we create single-channel foreground-background mixtures using isolated sounds from the DESED and Audioset datasets, and we conduct extensive experiments with mixtures of seen or unseen sound classes at various signal-to-noise ratios. Our experimental findings demonstrate the generalization ability of the proposed approach.
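The mixture creation step mentioned above can be illustrated with a minimal sketch. The abstract does not say how the signal-to-noise ratio is defined or which signal is rescaled, so the following assumes that the SNR is the ratio of foreground to background power over the whole clip and that only the foreground is scaled. The function names `power`, `mix_at_snr`, and `measure_snr_db` are hypothetical and are not taken from the paper.

```python
import math


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(foreground, background, snr_db):
    """Scale the foreground to a target SNR (in dB) and add it to the background.

    Assumption: SNR = 10 * log10(foreground power / background power),
    measured over the whole clip. Returns the mixture and the scaled foreground.
    """
    if len(foreground) != len(background):
        raise ValueError("foreground and background must have the same length")
    p_fg = power(foreground)
    p_bg = power(background)
    if p_fg == 0.0 or p_bg == 0.0:
        raise ValueError("both signals must have non-zero power")
    # Gain that brings the foreground power to p_bg * 10^(snr_db / 10).
    gain = math.sqrt(p_bg * 10.0 ** (snr_db / 10.0) / p_fg)
    scaled_fg = [gain * x for x in foreground]
    mixture = [f + b for f, b in zip(scaled_fg, background)]
    return mixture, scaled_fg


def measure_snr_db(foreground, background):
    """SNR in dB between two equal-length signals, under the same definition."""
    return 10.0 * math.log10(power(foreground) / power(background))
```

Returning the scaled foreground alongside the mixture keeps the separation target aligned with the mixture, which is convenient when the pair is later used for training or evaluation.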
Acoustic scene classification systems face performance degradation when trained and tested on data recorded by different devices. Unsupervised domain adaptation methods have been studied to reduce the impact of this mismatch. While they do not assume the availability of labels at test time, they often exploit parallel data recorded by both devices, and thus are not fully blind to the target domain. In this paper, we address a more practical scenario where parallel data are not available. We thoroughly analyze the impact of normalization and moment matching strategies to compensate for the linear distortion introduced by the recording device and propose their integration with adversarial domain adaptation to handle the remaining non-linear distortion. Experiments on the DCASE Challenge 2018 Task 1B dataset show that the proposed integrated approach considerably reduces domain mismatch, reaching an accuracy in the target domain close to that obtained in the source domain.
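As an illustration of the moment matching idea, the sketch below aligns the per-band mean and standard deviation of target-device features with those of source-device features, without requiring parallel recordings. The rationale is that a linear (convolutive) device response appears approximately as a per-band offset in log-spectral features. The abstract does not specify which moments are matched or over which axis, so the per-band first and second moments used here, and the names `band_moments` and `match_moments`, are assumptions made for illustration only.

```python
import math


def band_moments(frames):
    """Per-band mean and standard deviation over a list of feature frames.

    `frames` is a list of frames, each a list with one value per frequency band
    (for example log-mel energies).
    """
    n_frames = len(frames)
    n_bands = len(frames[0])
    means = [sum(f[b] for f in frames) / n_frames for b in range(n_bands)]
    stds = [
        math.sqrt(sum((f[b] - means[b]) ** 2 for f in frames) / n_frames)
        for b in range(n_bands)
    ]
    return means, stds


def match_moments(target_frames, source_frames, eps=1e-8):
    """Map target features so their per-band mean and std match the source's.

    The two sets of frames need not be parallel recordings; only their
    statistics are used. `eps` guards against division by zero.
    """
    mu_s, sd_s = band_moments(source_frames)
    mu_t, sd_t = band_moments(target_frames)
    return [
        [
            (x - mu_t[b]) / (sd_t[b] + eps) * sd_s[b] + mu_s[b]
            for b, x in enumerate(frame)
        ]
        for frame in target_frames
    ]
```

Such a transform only compensates for per-band shifts and scalings; any remaining non-linear mismatch is what the abstract proposes to handle with adversarial domain adaptation.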