2017 IEEE International Workshop of Electronics, Control, Measurement, Signals and Their Application to Mechatronics (ECMSM) 2017
DOI: 10.1109/ecmsm.2017.7945915

Single channel speech enhancement using convolutional neural network

Cited by 53 publications (30 citation statements)
References 15 publications
“…Moreover, a similar encoder-decoder architecture is developed in [21]. Other studies [9], [38], [24], [1], [14], [15] using CNN for mask estimation or spectral mapping also achieve small performance improvements over a DNN. Recently, Fu et al.…”
Section: Introduction (mentioning)
confidence: 99%
“…Recent applications confirm that CNNs show a good modeling ability for acoustic problems and can outperform state-of-the-art algorithms in this context. Such applications include speech dereverberation [40]-[42], speech enhancement [43]-[45].…”
Section: Convolutional Neural Network (mentioning)
confidence: 99%
“…method is compared to noisy speech, conventional single channel SE based on Log-MMSE [9] and dual-microphone method like spectral coherence [45]. SE methods based on DNN [19], single channel CNN based denoising auto encoder (CNN-DAE) [26], and Multi-Objective Learning-based DNN SE [27] methods are implemented and included for comparison. The deep learning-based SE methods were trained on the same datasets as that of proposed method.…”
Section: B. Offline Objective Evaluation (mentioning)
confidence: 99%
“…Researchers have also considered CNN-based end-to-end approach to SE that requires just the raw audio data [25]. Linguistic training and testing of SE based on CNN [26] concluded that the performance of monolingual trained models was on par with multilingual models which makes it better than DNNs. Several features like LPS, Mel Frequency cepstral Coefficients (MFCC), Gammatone Frequency cepstral coefficients (GFCCs) and IBM were employed in multiobjective learning for SE with a DNN architecture to improve the performance in terms of quality and intelligibility of speech [27], [28].…”
Section: Introduction (mentioning)
confidence: 99%