2022
DOI: 10.1109/jstsp.2022.3200911

RemixIT: Continual Self-Training of Speech Enhancement Models via Bootstrapped Remixing

Abstract: We present RemixIT, a simple yet effective self-supervised method for training speech enhancement models without requiring a single isolated in-domain speech or noise waveform. Our approach overcomes limitations of previous methods that depend on clean in-domain target signals and are thus sensitive to any domain mismatch between train and test samples. RemixIT is based on a continuous self-training scheme in which a teacher model pre-trained on out-of-domain data infers estimated pseudo-target signals…
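The abstract's bootstrapped-remixing idea can be sketched as a single training step: a frozen out-of-domain teacher separates each noisy mixture into speech and noise estimates, the noise estimates are shuffled across the batch and recombined into fresh in-domain mixtures, and the student trains supervised on those remixed inputs against the teacher's speech estimates. This is a minimal illustrative sketch, not the paper's implementation; the function name, model interfaces, and loss are assumptions, and the paper's continual teacher updates from the student are omitted here.

```python
import torch

def remixit_step(teacher, student, noisy_batch, optimizer, loss_fn):
    """One hypothetical RemixIT-style training step on a batch of noisy mixtures."""
    # Teacher (pre-trained on out-of-domain data) infers pseudo-targets.
    with torch.no_grad():
        est_speech = teacher(noisy_batch)      # pseudo-clean speech estimate
        est_noise = noisy_batch - est_speech   # residual treated as pseudo-noise
        # Bootstrapped remixing: permute the noise estimates across the batch
        # and recombine, creating new in-domain mixtures with known pseudo-targets.
        perm = torch.randperm(noisy_batch.shape[0])
        remixed = est_speech + est_noise[perm]
    # Train the student supervised on (remixed mixture -> pseudo-target) pairs.
    pred = student(remixed)
    loss = loss_fn(pred, est_speech)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the full scheme the teacher is also refreshed from the student over time (continual self-training), so the pseudo-targets improve as training progresses; the sketch above shows only the student-update half of that loop.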

Cited by 22 publications (8 citation statements) · References 55 publications
“…The problem has been also tackled from the perspective of generative modeling, with the use of variational auto-encoders [9], [37]. Finally, teacher-student training schemes have been employed, in which an OOD teacher model is used to provide targets for supervised training of a student model on target data [38], [39]. Although numerous approaches are available, no systematic comparison on a common data corpus has been done in the literature.…”
Section: Related Work
confidence: 99%
“…Although numerous approaches are available, no systematic comparison on a common data corpus has been done in the literature. In this article, we compare our method to noisy-target training (Nytt) [24], and RemixIT [38], a recent teacher-student training scheme.…”
Section: Related Work
confidence: 99%
“…In addition, unsupervised or semi-supervised training methods have also been investigated to achieve a general solution even on out-of-distribution datasets. A particularly interesting method is the teacher-student training method proposed in RemixIT [36] where a teacher network trained on out-of-distribution data is used to bootstrap the noisy signals to multiply the variety of in-distribution data samples.…”
Section: Current State-of-the-art Solutions
confidence: 99%
“…Methodologies inspired by conventional deep learning, e.g. multi-timescale networks [29,31,36] or attention [28], if mapped efficiently to the neuromorphic domain, could be promising directions as well. And finally for completeness-to address the third question posed above-decoding the output of the neuromorphically-processed audio again depends on the processing used and must be tailored appropriately to operate in an efficient manner.…”
Section: Neuromorphic Audio Processing and Promising Directions
confidence: 99%