Interspeech 2015
DOI: 10.21437/interspeech.2015-536

Deep neural network based spectral feature mapping for robust speech recognition

Cited by 34 publications
(17 citation statements)
References 19 publications
“…With this spectral feature mapping (SFM) approach, we can pass the output of our enhancement model directly to the ASR model (Figure 1). While deep learning has previously been applied to SFM for ASR [17,18,19], our work is the first to use GANs for this task. Michelsanti et al [20] employ GANs for SFM, but target speaker verification rather than ASR.…”
Section: Introduction
confidence: 99%
“…Spectral mapping has been used to generate clean speech signals. However, in [8,7] they use only a local learning objective. Student-teacher networks have been used to improve the quality of noisy speech recognition [16,17,18].…”
Section: Prior Work
confidence: 99%
“…We train a DNN-based spectral mapper for feature denoising. In our previous work [7,8], we have shown that a DNN-based spectral mapper, which takes noisy spectrogram as input to predict clean filterbank features for ASR, yields good results on the CHiME-2 noisy and reverberant dataset. Specifically, we first divide the input time-domain signals into 25-ms frames with a 10-ms frame shift, and then apply short time Fourier transform (STFT) to compute log spectral magnitudes in each time frame.…”
Section: Spectral Mapping
confidence: 99%
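
The framing and STFT steps described in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' code: the 25-ms frame length and 10-ms shift come from the quoted text, while the 16 kHz sample rate, the Hamming window, the `eps` floor, and the function name `log_magnitude_spectrogram` are assumptions made for the example.

```python
import numpy as np

def log_magnitude_spectrogram(signal, sample_rate=16000, frame_ms=25.0,
                              shift_ms=10.0, eps=1e-10):
    """Split a 1-D signal into overlapping frames and return the log
    spectral magnitude of each frame (one row per frame)."""
    # Frame sizes in samples; 16 kHz is an assumed sample rate.
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    frame_shift = int(round(sample_rate * shift_ms / 1000.0))

    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")

    # Only full frames are kept; trailing samples are dropped.
    num_frames = 1 + (len(signal) - frame_len) // frame_shift

    # Row t of idx selects samples [t * shift, t * shift + frame_len).
    idx = (np.arange(frame_len)[None, :]
           + frame_shift * np.arange(num_frames)[:, None])

    # The Hamming window is an assumption; the quoted text names no window.
    frames = signal[idx] * np.hamming(frame_len)

    # Magnitude of the one-sided FFT, then log with a small floor
    # (eps, also an assumption) to avoid log(0).
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitudes + eps)
```

With these assumed settings each frame holds 400 samples and yields 201 frequency bins, and consecutive frames start 160 samples apart.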
“…Traditional signal processing-based methods, such as the Wiener filtering and Spectral Subtraction, among many others, provide noise reduction based on signal processing algorithms. More recently, Deep Neural Networks (DNN) have been presented in [6,7,8,9]. The main approach for DNN is the mapping of spectral features from noisy speech into the features of the corresponding clean speech.…”
Section: Introduction
confidence: 99%
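
The noisy-to-clean feature mapping mentioned in the statement above can be illustrated with a small feed-forward regressor. This is only a hedged sketch: the single hidden layer, the ReLU activation, the mean-squared-error objective, the layer sizes, and the names `init_mapper`, `map_features`, and `mse_loss` are all assumptions chosen for the example, and none of them is taken from the cited paper. The network is untrained; the code only shows the shape of the computation.

```python
import numpy as np

def init_mapper(in_dim, hidden_dim, out_dim, seed=0):
    """Create randomly initialised weights for a one-hidden-layer mapper.
    All dimensions are hypothetical and chosen by the caller."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }

def map_features(params, noisy):
    """Map a batch of noisy feature frames (rows) to predicted clean frames."""
    # ReLU hidden layer (an assumed choice of nonlinearity).
    hidden = np.maximum(0.0, noisy @ params["W1"] + params["b1"])
    # Linear output layer, since the targets are real-valued features.
    return hidden @ params["W2"] + params["b2"]

def mse_loss(predicted, clean):
    """Mean squared error between predicted and clean features
    (an assumed training objective)."""
    return float(np.mean((predicted - clean) ** 2))
```

A training loop would adjust the weights to reduce `mse_loss` over paired noisy and clean frames, for example with stochastic gradient descent.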