2019
DOI: 10.1186/s13634-019-0618-4

Multi-resolution auditory cepstral coefficient and adaptive mask for speech enhancement with deep neural network

Abstract: Existing speech enhancement algorithms perform poorly in non-stationary noise environments with a low signal-to-noise ratio (SNR). To address this problem, this paper presents a novel speech enhancement algorithm based on multiple features and an adaptive mask with deep learning. First, we construct a new feature called the multi-resolution auditory cepstral coefficient (MRACC). This feature, which is extracted from four cochleagrams of different resolutions, can capture the local information …

Cited by 6 publications (2 citation statements)
References 22 publications (18 reference statements)

“…A similar method of DNN-based mask estimation was done in past studies [19][20][21][22][23]. A study in [19] modified the feature and proposed adaptive masks in the DNN-based mask estimation with four hidden layers and 1024 hidden nodes. As a result, average PESQ and STOI scores of 2.12 and 0.78, respectively, were obtained at -5dB SNR [19].…”
Section: Related Work (mentioning)
confidence: 99%
“…A study in [19] modified the feature and proposed adaptive masks in the DNN-based mask estimation with four hidden layers and 1024 hidden nodes. As a result, average PESQ and STOI scores of 2.12 and 0.78, respectively, were obtained at -5dB SNR [19]. Another study in [23] used the features fusion technique for the DNN input, while the phase-aware and magnitude mask were applied as the target mask.…”
Section: Related Work (mentioning)
confidence: 99%
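
The citation statements above describe a DNN-based mask estimator with four hidden layers and 1024 hidden nodes. As a rough illustration of that layout only, the following is a minimal sketch of a feed-forward pass that maps per-frame features to a mask in the range 0 to 1. The input size (256), the output size (64 frequency channels), the ReLU activations, the sigmoid output, the random initialisation, and the function names `init_mask_dnn` and `estimate_mask` are all assumptions made for this example; none of them is taken from the cited paper, and the network here is untrained.

```python
import numpy as np


def init_mask_dnn(n_in, n_out, n_hidden=1024, n_layers=4, seed=0):
    """Create random weights for a feed-forward net with n_layers hidden layers.

    Layer sizes other than the 4 x 1024 hidden layout are assumptions.
    """
    rng = np.random.default_rng(seed)
    sizes = [n_in] + [n_hidden] * n_layers + [n_out]
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        # He-style scaling keeps activations in a reasonable range for ReLU.
        weights = rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in)
        bias = np.zeros(fan_out)
        params.append((weights, bias))
    return params


def estimate_mask(params, features):
    """Forward pass: ReLU hidden layers, sigmoid output (assumed activations)."""
    h = features
    for weights, bias in params[:-1]:
        h = np.maximum(h @ weights + bias, 0.0)
    weights, bias = params[-1]
    logits = h @ weights + bias
    # The sigmoid squashes each output into the interval from 0 to 1,
    # so it can be read as a time-frequency mask value.
    return 1.0 / (1.0 + np.exp(-logits))


# Hypothetical dimensions: 10 frames, 256 features per frame, 64 channels.
params = init_mask_dnn(n_in=256, n_out=64)
frames = np.random.default_rng(1).standard_normal((10, 256))
mask = estimate_mask(params, frames)
```

In an enhancement pipeline, a mask of this shape would be multiplied element-wise with the noisy time-frequency representation before resynthesis; training the weights against a target mask is outside the scope of this sketch.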