Interspeech 2017
DOI: 10.21437/interspeech.2017-775

Improving Speech Recognition by Revising Gated Recurrent Units

Abstract: Speech recognition is largely taking advantage of deep learning, showing that substantial benefits can be obtained by modern Recurrent Neural Networks (RNNs). The most popular RNNs are Long Short-Term Memory (LSTMs), which typically reach state-of-the-art performance in many tasks thanks to their ability to learn long-term dependencies and robustness to vanishing gradients. Nevertheless, LSTMs have a rather complex design with three multiplicative gates that might impair their efficient implementation. An att…
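For context on the units being revised (the abstract is truncated above), here is a minimal sketch of the standard GRU update that the paper starts from, written as a PyTorch cell; the class name, layer sizes, and gate conventions are illustrative assumptions rather than the paper's own code.

    import torch
    import torch.nn as nn

    class GRUCellSketch(nn.Module):
        # Standard GRU update (reset gate r, update gate z, candidate state);
        # an illustrative sketch of the architecture the paper revises.
        def __init__(self, input_size, hidden_size):
            super().__init__()
            self.wz = nn.Linear(input_size, hidden_size)                # update gate, input part
            self.uz = nn.Linear(hidden_size, hidden_size, bias=False)   # update gate, recurrent part
            self.wr = nn.Linear(input_size, hidden_size)                # reset gate, input part
            self.ur = nn.Linear(hidden_size, hidden_size, bias=False)   # reset gate, recurrent part
            self.wh = nn.Linear(input_size, hidden_size)                # candidate, input part
            self.uh = nn.Linear(hidden_size, hidden_size, bias=False)   # candidate, recurrent part

        def forward(self, x, h):
            z = torch.sigmoid(self.wz(x) + self.uz(h))         # update gate
            r = torch.sigmoid(self.wr(x) + self.ur(h))         # reset gate
            h_cand = torch.tanh(self.wh(x) + self.uh(r * h))   # candidate state
            return z * h + (1.0 - z) * h_cand                  # new hidden state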

Cited by 44 publications (46 citation statements)
References 26 publications

“…This work uses hybrid HMM-DNN speech recognizers. TIMIT and DIRHA experiments are performed with the PyTorch-Kaldi toolkit [35] using a six-layer multi-layer perceptron and a light GRU [36], respectively. The performance reported on TIMIT is the average of the phone error rates (PER%) obtained by running each experiment three times with different seeds.…”
Section: Corpora and ASR Setup
confidence: 99%
“…The third row shows the improvement achieved when adding recurrent dropout. Similarly to [40,41], we applied the same dropout mask for all the time steps to avoid gradient vanishing problems. The fourth line, instead, shows the benefits derived from batch normalization [18].…”
Section: Baselines
confidence: 99%
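As a concrete illustration of the excerpt above, here is a minimal sketch of recurrent dropout with a single mask shared across all time steps; the class and variable names are illustrative assumptions, not the cited papers' code.

    import torch
    import torch.nn as nn

    class SharedMaskDropout(nn.Module):
        # Inverted dropout whose mask is sampled once per sequence and then
        # reused at every time step, as described in the excerpt above.
        def __init__(self, p=0.2):
            super().__init__()
            self.p = p

        def new_mask(self, h):
            keep = 1.0 - self.p
            return torch.bernoulli(torch.full_like(h, keep)) / keep

        def forward(self, h, mask):
            return h * mask if self.training else h

    # Usage inside a recurrent loop (shapes and names are illustrative):
    # drop = SharedMaskDropout(p=0.2)
    # mask = drop.new_mask(h)               # sampled once for the whole sequence
    # for x_t in inputs:
    #     h = cell(x_t, drop(h, mask))      # the same mask at every time step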
“…Zhou et al. [43] accomplished this very 'minimal' gated unit by using a single gate for both resetting and updating the cell's internal state. Ravanelli et al. [31] extended that work further by highlighting a redundancy between the two gates. They deduced that in applications such as speech recognition, where signals change slowly, reset gates are unnecessary and can be omitted altogether.…”
Section: Single Gate Mechanism
confidence: 91%
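To make the reset-gate removal concrete, here is a minimal sketch of a single-gate recurrent cell along the lines described above; names and sizes are illustrative assumptions. Following my understanding of Ravanelli et al.'s light GRU, the candidate state uses a ReLU activation, while the batch normalization that work applies to the feed-forward connections is omitted for brevity.

    import torch
    import torch.nn as nn

    class SingleGateCell(nn.Module):
        # Reset-gate-free recurrent update: only the update gate z is kept and
        # the candidate state reads the previous hidden state directly
        # (illustrative sketch, not the authors' implementation).
        def __init__(self, input_size, hidden_size):
            super().__init__()
            self.wz = nn.Linear(input_size, hidden_size)                # update gate, input part
            self.uz = nn.Linear(hidden_size, hidden_size, bias=False)   # update gate, recurrent part
            self.wh = nn.Linear(input_size, hidden_size)                # candidate, input part
            self.uh = nn.Linear(hidden_size, hidden_size, bias=False)   # candidate, recurrent part

        def forward(self, x, h):
            z = torch.sigmoid(self.wz(x) + self.uz(h))    # single (update) gate
            h_cand = torch.relu(self.wh(x) + self.uh(h))  # no reset gate applied to h
            return z * h + (1.0 - z) * h_cand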
“…However, in applications where events of interest are abrupt and isolated (e.g., detecting cough sounds), the assumption by Ravanelli et al. [31] that state resets are irrelevant does not hold. In fact, we found that without state resets, recurrent units in our application are unable to recover from large impulse signals.…”
Section: Single Gate Mechanism
confidence: 99%