ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp40776.2020.9053396

Source Separation with Weakly Labelled Data: an Approach to Computational Auditory Scene Analysis

Abstract: Source separation is the task of separating an audio recording into individual sound sources, and it is fundamental to computational auditory scene analysis. Previous work on source separation has focused on separating particular sound classes such as speech and music, and much of it requires mixture and clean source pairs for training. In this work, we propose a source separation framework trained with weakly labelled data. Weakly labelled data contains only the tags of an audio clip, without …
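The framework outlined in the abstract pairs an audio tagging model with a class-conditional separator: a one-hot class query tells the separation network which source to extract from a mixture of weakly labelled clips. Below is a minimal PyTorch sketch of that idea; the network shape, the name `SeparatorNet`, and all hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 527  # size of the AudioSet ontology (assumption)

class SeparatorNet(nn.Module):
    """Toy conditional separator: masks the mixture spectrogram for one class."""
    def __init__(self, n_bins=513):
        super().__init__()
        self.cond = nn.Linear(NUM_CLASSES, n_bins)  # class query -> per-bin bias
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, mix, query):
        # mix: (batch, 1, frames, n_bins); query: (batch, NUM_CLASSES) one-hot
        bias = self.cond(query)[:, None, None, :]   # broadcast over time frames
        mask = self.net(mix + bias)                 # class-dependent soft mask
        return mask * mix                           # estimated source spectrogram

# One weakly supervised training step: mix two clips that carry different
# tags and regress the spectrogram of the clip matching the query class.
model = SeparatorNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

spec_a = torch.rand(4, 1, 100, 513)  # clip tagged with class 0 (toy data)
spec_b = torch.rand(4, 1, 100, 513)  # clip tagged with class 137 (toy data)
query = torch.zeros(4, NUM_CLASSES)
query[:, 0] = 1.0                    # one-hot query: "extract class 0"

opt.zero_grad()
est = model(spec_a + spec_b, query)  # separate clip A out of the mixture
loss = torch.abs(est - spec_a).mean()
loss.backward()
opt.step()
```

The key point is that no clean mixture/source pairs are needed: supervision comes from artificially summing weakly tagged clips and asking the network to recover the clip whose tag matches the query.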

Cited by 32 publications (61 citation statements)
References 21 publications

“…Tzinis et al. [4] performed separation experiments with a fixed number of sources on the 50-class ESC-50 dataset [5]. Other papers have leveraged information about sound class, either as conditioning information or as a weak supervision signal [6,2,7].…”
Section: Relation To Prior Work
confidence: 99%
“…However, none of these approaches explicitly solved the problem of non-target events. Sound separation can be used for SED by first separating the component sounds in a mixed signal and then applying SED on each of the separated tracks [15,7,16,17,18].…”
Section: Relation To Prior Work
confidence: 99%
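The separate-then-detect pipeline described in the excerpt above is straightforward to sketch in Python; `separate` and `detect_events` below are hypothetical stand-ins for trained models, not functions from any cited work.

```python
from typing import List, Tuple
import numpy as np

def separate(mixture: np.ndarray) -> List[np.ndarray]:
    """Hypothetical stand-in for a trained separator (returns per-source tracks)."""
    return [mixture * 0.5, mixture * 0.5]  # placeholder split

def detect_events(track: np.ndarray) -> List[Tuple[str, float, float]]:
    """Hypothetical stand-in for an SED model: (class, onset_s, offset_s) triples."""
    return [("speech", 0.0, 1.0)]  # placeholder prediction

def sed_via_separation(mixture: np.ndarray) -> List[Tuple[str, float, float]]:
    events = []
    for track in separate(mixture):    # run SED once per separated track
        events.extend(detect_events(track))
    return events

print(sed_via_separation(np.zeros(16000)))
```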
“…Kong et al. [8], who also train a neural network for conditional source separation of single-channel audio. This approach uses a classification model trained on AudioSet [5], which consists of 10s segments with weak class labels.
Section: Related Work
confidence: 99%
“…For each such class, a one-hot vector indicating the selected class is then used to extract the different sources. In short, a key difference between Kong et al. [8] and our approach is that the former requires labeled data to train the classifier model, whereas our SoundFilter operates in a fully unlabeled setup. In addition, the embedding used in [8] is defined in terms of AudioSet's class ontology.…”
Section: Related Work
confidence: 99%
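The one-hot extraction procedure this excerpt describes amounts to running the conditional separator once per predicted class. A short sketch, reusing the illustrative `SeparatorNet` and `NUM_CLASSES` from the sketch after the abstract (class indices here are toy values):

```python
import torch

# Assumes SeparatorNet and NUM_CLASSES from the earlier sketch are in scope.
model = SeparatorNet().eval()
mix_spec = torch.rand(1, 1, 100, 513)  # spectrogram of the input mixture
predicted_classes = [0, 137]           # tags predicted for the clip (toy values)

sources = {}
with torch.no_grad():
    for c in predicted_classes:
        query = torch.zeros(1, NUM_CLASSES)
        query[0, c] = 1.0                    # one-hot vector selecting class c
        sources[c] = model(mix_spec, query)  # one separated spectrogram per class
```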
“…For speech separation, different methods have been designed [9][10][11][12][13][14]. These include approaches such as Computational Auditory Scene Analysis (CASA) [15][16][17][18], Hidden Markov Models (HMMs) [19][20][21], HMMs combined with Mel-Frequency Cepstral Coefficients [22][23][24], Non-negative Matrix Factorization (NMF) [25][26][27][28], and Minimum Mean Square Error (MMSE) estimation [29][30][31][32]. However, these strategies have seen relatively little success.…”
Section: Introduction
confidence: 99%
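Of the classical methods listed in that excerpt, NMF is the easiest to illustrate compactly: it factorizes a magnitude spectrogram V ≈ WH and treats each rank-one term as one source's spectrogram. A minimal NumPy sketch with Lee-Seung multiplicative updates follows; the data is toy and this is not any cited paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
V = np.abs(rng.standard_normal((513, 200)))  # toy magnitude spectrogram (bins x frames)
K = 2                                        # number of NMF components / sources
W = rng.random((513, K)) + 1e-3              # spectral templates
H = rng.random((K, 200)) + 1e-3              # temporal activations

for _ in range(200):  # Lee-Seung multiplicative updates for ||V - WH||^2
    H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
    W *= (V @ H.T) / (W @ H @ H.T + 1e-9)

# Wiener-style soft masks: each component's share of the mixture spectrogram
V_hat = W @ H + 1e-9
sources = [np.outer(W[:, k], H[k]) / V_hat * V for k in range(K)]
```

In practice each component (or a group of components) is assigned to a source, and the masked spectrograms are inverted back to waveforms with the mixture phase.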