2020 IEEE International Conference on Multimedia and Expo (ICME)
DOI: 10.1109/icme46284.2020.9102846

SNR-Based Teachers-Student Technique for Speech Enhancement

Citations: cited by 10 publications (3 citation statements)
References: 5 publications
“…Hinton et al [16] trained a smaller student network with labels generated by a stronger teacher model, comprising an ensemble of models. Hao et al [17] used an SNR-based TS method to train an SE model using an ensemble of different models, which were individually trained for different SNR Ranges. Kobayashi et al [18] used KD to train a unidirectional recurrent student network with a bi-directional teacher network inspired by Born Again Networks [19].…”
Section: Related Work (mentioning)
confidence: 99%
“…[22] designed a two-stage training distillation method and a co-worker-based network to improve the performance of SE. In the time domain, to improve performance at both low and high signal-to-noise ratios (SNRs), [23] built multiple teachers trained under SNRs and then transferred knowledge to the student network. Further, [24] applied standard KD [10] to reduce the system latency while preventing performance degradation.…”
Section: Introduction (mentioning)
confidence: 99%
“…First, for challenging acoustic scenarios as low SNR conditions, current SE systems usually suffer from performance bottlenecks in recovering clean speech from mixtures (Li et al 2021). Second, noise intensities in real-world scenes change dynamically, which requires SE systems to accommodate wide expansion of the SNR range and raises the difficulty of the network design (Hao et al 2020). The third problem is caused by the limited kernel size of convolution layers, which often results in a short-sighted feature extractor.…”
Section: Introduction (mentioning)
confidence: 99%