ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8683520

Unsupervised Training of a Deep Clustering Model for Multichannel Blind Source Separation

Abstract: We propose a training scheme to train neural network-based source separation algorithms from scratch when parallel clean data is unavailable. In particular, we demonstrate that an unsupervised spatial clustering algorithm is sufficient to guide the training of a deep clustering system. We argue that previous work on deep clustering requires strong supervision and elaborate on why this is a limitation. We demonstrate that (a) the single-channel deep clustering system trained according to the proposed scheme alo…
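
The idea in the abstract, that an unsupervised spatial clustering algorithm can stand in for clean-data labels when training a deep clustering network, can be illustrated with a small sketch. This is not the authors' implementation. It assumes the spatial clustering teacher yields soft time-frequency masks of shape (sources, frames, frequency bins), that these are turned into one-hot targets by picking the dominant source per bin, and that the student is trained with the standard deep clustering affinity loss. The function names `one_hot_targets` and `deep_clustering_loss` and all array shapes are hypothetical.

```python
import numpy as np

def one_hot_targets(teacher_masks):
    """Turn soft teacher masks into one-hot targets.

    teacher_masks: array of shape (K, T, F), assumed to come from an
    unsupervised spatial clustering model (hypothetical input).
    Returns Y of shape (T*F, K), one row per time-frequency bin.
    """
    num_sources = teacher_masks.shape[0]
    dominant = np.argmax(teacher_masks, axis=0).reshape(-1)  # (T*F,)
    return np.eye(num_sources)[dominant]

def deep_clustering_loss(V, Y):
    """Affinity loss ||V V^T - Y Y^T||_F^2.

    V: (N, D) embeddings, Y: (N, K) one-hot targets.
    Uses the expanded form so the (N, N) affinity matrices are never built.
    """
    VtV = V.T @ V
    VtY = V.T @ Y
    YtY = Y.T @ Y
    return np.sum(VtV ** 2) - 2.0 * np.sum(VtY ** 2) + np.sum(YtY ** 2)

# Toy usage with random data in place of a real teacher and network.
rng = np.random.default_rng(0)
masks = rng.random((2, 5, 4))                 # K=2 sources, T=5, F=4
masks /= masks.sum(axis=0, keepdims=True)     # masks sum to one per bin
Y = one_hot_targets(masks)                    # (20, 2)
V = rng.standard_normal((20, 3))              # stand-in for network output
V /= np.linalg.norm(V, axis=1, keepdims=True) # unit-norm embeddings
loss = deep_clustering_loss(V, Y)
```

In an actual training loop, `V` would be produced by the network from a single-channel mixture and the loss would be minimized by gradient descent. The teacher masks would be computed once per multichannel training mixture, so no clean source signals are needed.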

Citations: cited by 42 publications (38 citation statements)
References: 32 publications
“…Unlike the proposed method, CACGMM does not have a reverberation model. Originally, a CACGMM based method without phase difference feature was proposed in [19]. However, the proposed method utilizes phase difference between microphones as an input feature.…”
Section: Results (mentioning)
confidence: 99%
“…For speech source separation performance evaluation, we utilized SDR and SIR from BSS EVAL [29]. Four methods were evaluated, i.e., 1) Conventional unsupervised training method with complex angular central Gaussian mixture model (CACGMM) [19]: Time-frequency mask of each source is inferred with the sparseness assumption. This model does not have any reverberation model.…”
Section: Evaluation Measures and Comparative Methods (mentioning)
confidence: 99%
“…Unsupervised training for neural source separation using multichannel mixture signals has recently gained a lot of attention [12][13][14][15]. One approach is to generate supervised data by using multichannel separation methods [12][13][14]. This approach suffers from the estimation errors of the multichannel methods mentioned above.…”
Section: Introduction (mentioning)
confidence: 99%
“…Consequently, supervised source separation using neural networks relies on the availability of paired mixture-clean data in the training set and cannot be used when such paired datasets are not available or expensive to collect. To relax these constraints, a few recent papers use other forms of information like the spatial separation between the sources in a multi-microphone setting, to train the networks for unsupervised source separation [7,8,9,10]. However, these constraints continue to impose restrictions on single-channel source separation, where such secondary forms of information about the sources are not available.…”
Section: Introduction (mentioning)
confidence: 99%