2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018
DOI: 10.1109/icassp.2018.8462042
|View full text |Cite
|
Sign up to set email alerts
|

Constrained Convolutional-Recurrent Networks to Improve Speech Quality with Low Impact on Recognition Accuracy

Abstract: For a speech-enhancement algorithm, it is highly desirable to simultaneously improve perceptual quality and recognition rate. Thanks to computational costs and model complexities, it is challenging to train a model that effectively optimizes both metrics at the same time. In this paper, we propose a method for speech enhancement that combines local and global contextual structures information through convolutional-recurrent neural networks that improves perceptual quality. At the same time, we introduce a new … Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
1

Citation Types

0
4
0

Year Published

2018
2018
2021
2021

Publication Types

Select...
5
2

Relationship

1
6

Authors

Journals

citations
Cited by 7 publications
(4 citation statements)
references
References 13 publications
0
4
0
Order By: Relevance
“…Multi-task learning for audio quality. Multi-task learning (MTL) [30] has been beneficial to many speech applications [31][32][33][34]. MTL aims to leverage useful information contained in multiple related tasks to help improve the generalization performance on all tasks.…”
Section: Related Workmentioning
confidence: 99%
“…Multi-task learning for audio quality. Multi-task learning (MTL) [30] has been beneficial to many speech applications [31][32][33][34]. MTL aims to leverage useful information contained in multiple related tasks to help improve the generalization performance on all tasks.…”
Section: Related Workmentioning
confidence: 99%
“…Operating in noisy environments affect the performance of speech processing systems that are normally designed based on clean speech signals [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16]. Speech denoising involves the reduction or removal of the noisy part of the speech+noise mixture.…”
Section: Introductionmentioning
confidence: 99%
“…More recently, a type of advanced network named convolutional recurrent neural networks (CRNN) is proposed, which takes advantages of both CNNs and RNNs. It is shown that it can obtain a lower word error rate (WER) in speech recognition tasks [20] and later it is introduced into speech enhancement tasks [21,22]. Compared with the single-type network, experiments have shown that the CRNN obtains better speech enhancement results.…”
Section: Introductionmentioning
confidence: 99%