2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2017.7953084

A network of deep neural networks for Distant Speech Recognition

Abstract: Despite the remarkable progress recently made in distant speech recognition, state-of-the-art technology still suffers from a lack of robustness, especially when adverse acoustic conditions characterized by non-stationary noises and reverberation are met. A prominent limitation of current systems lies in the lack of matching and communication between the various technologies involved in the distant speech recognition process. The speech enhancement and speech recognition modules are, for instance, often trained…
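The abstract's point about modules that do not communicate can be illustrated with a joint-training sketch. This is not the paper's actual implementation; the module sizes, label inventory, and optimizer below are assumptions chosen only to show gradients flowing from the recognizer back into the enhancement front-end instead of training each module in isolation.

```python
import torch
import torch.nn as nn

# Hedged sketch (names and sizes assumed, not taken from the paper):
# an enhancement DNN feeding a recognition DNN, optimized jointly so
# that the two modules "communicate" through a shared gradient pass.
enhance = nn.Sequential(nn.Linear(40, 512), nn.ReLU(), nn.Linear(512, 40))
recognize = nn.Sequential(nn.Linear(40, 512), nn.ReLU(), nn.Linear(512, 48))

opt = torch.optim.SGD(
    list(enhance.parameters()) + list(recognize.parameters()), lr=0.01
)
ce = nn.CrossEntropyLoss()

noisy = torch.randn(8, 40)           # a minibatch of noisy feature frames
labels = torch.randint(0, 48, (8,))  # frame-level targets (48 classes, assumed)

logits = recognize(enhance(noisy))   # the enhancement output feeds the recognizer
loss = ce(logits, labels)
loss.backward()                      # one gradient pass through both DNNs
opt.step()
```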

Cited by 32 publications (29 citation statements)
References 30 publications
“…Batch normalization [3] has recently been proposed in the machine learning community and addresses the so-called internal covariate shift problem by normalizing the mean and the variance of each layer's pre-activations for each training minibatch. Several works have already shown that this technique is effective both for improving system performance and for speeding up the training procedure [20], [22], [37], [49], [50]. Batch normalization can be applied to RNNs in different ways.…”
Section: Batch Normalization
confidence: 99%
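As a concrete illustration of the normalization described in the quote above, here is a minimal PyTorch sketch; the layer sizes and the ReLU nonlinearity are assumptions for the example, not details taken from the cited works.

```python
import torch
import torch.nn as nn

class BatchNormLayer(nn.Module):
    """One feed-forward layer with batch normalization on its pre-activations."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)
        # BatchNorm1d normalizes each feature's mean/variance over the minibatch
        self.bn = nn.BatchNorm1d(out_dim)

    def forward(self, x):
        # Normalize the pre-activations, then apply the nonlinearity.
        return torch.relu(self.bn(self.linear(x)))

layer = BatchNormLayer(40, 512)
x = torch.randn(8, 40)  # a minibatch of 8 feature vectors
y = layer(x)            # pre-activations normalized per minibatch
```

During training, `BatchNorm1d` normalizes each pre-activation feature over the minibatch and learns a per-feature scale and shift; at inference it falls back on running estimates of the minibatch statistics.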
“…Despite the progress of the last decade, state-of-the-art speech recognizers are still far from reaching satisfactory robustness and flexibility. This lack of robustness typically emerges when facing challenging acoustic conditions [13], characterized by considerable levels of non-stationary noise and acoustic reverberation [14]–[22]. The development of robust ASR has recently been fostered by the great success of international challenges such as CHiME [23], REVERB [24] and ASpIRE [25], which have also been extremely useful for establishing common evaluation frameworks among researchers.…”
Section: Introduction
confidence: 99%
“…Deep learning has shown remarkable success in numerous speech tasks [1], including speech [2,3] and speaker recognition [4]. This paradigm exploits the principle of compositionality to efficiently describe the world around us and employs a hierarchy of representations that are progressively learned by combining lower-level abstractions.…”
Section: Introduction
confidence: 99%
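The hierarchy-of-representations idea in the quote above can be sketched in a few lines; the dimensions and the output class count are hypothetical, used only to show layers composing lower-level abstractions into higher-level ones.

```python
import torch
import torch.nn as nn

# Illustrative only: each layer combines the abstractions produced by the
# layer below it, forming a progressively higher-level representation.
model = nn.Sequential(
    nn.Linear(40, 512), nn.ReLU(),   # low-level patterns in the input features
    nn.Linear(512, 512), nn.ReLU(),  # intermediate abstractions
    nn.Linear(512, 48),              # high-level outputs (48 classes, assumed)
)

frames = torch.randn(8, 40)  # a minibatch of 40-dimensional feature frames
scores = model(frames)       # shape: (8, 48)
```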
“…In the past few years, deep neural networks (DNNs) [1] have made tremendous advances, in some cases surpassing human-level performance, tackling challenging problems such as speech recognition [2], [3], natural language processing [4], [5], image classification [6], [7], [8], and machine translation [9]. Training large DNNs, however, is a time-consuming and computationally intensive task that demands datacenter-scale computational resources composed of state-of-the-art GPUs [6], [10].…”
Section: Introduction
confidence: 99%