2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
DOI: 10.1109/asru.2015.7404775
RNNDROP: A novel dropout for RNNs in ASR

Cited by 74 publications (46 citation statements) · References 14 publications
“…The third row shows the improvement achieved when adding recurrent dropout. Similarly to [40,41], we applied the same dropout mask for all the time steps to avoid gradient vanishing problems. The fourth line, instead, shows the benefits derived from batch normalization [18].…”
Section: Baselines
confidence: 99%
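The excerpt above hinges on sampling one dropout mask per sequence and reusing it at every time step, rather than resampling a new mask per step. Below is a minimal PyTorch sketch of that idea; the module name SharedMaskRNNDropout, the GRU cell, and all sizes are illustrative assumptions, not the implementation from the cited papers.

```python
import torch
import torch.nn as nn

class SharedMaskRNNDropout(nn.Module):
    """Dropout that samples one Bernoulli mask per sequence and reuses it at
    every time step (hypothetical name; sketch of RnnDrop-style masking)."""

    def __init__(self, p=0.25):
        super().__init__()
        self.p = p
        self.mask = None  # None forces a fresh mask for the next batch

    def forward(self, hidden):
        if not self.training:
            return hidden
        if self.mask is None:
            # Inverted dropout: scale by 1/(1-p) so inference needs no rescaling.
            keep = 1.0 - self.p
            self.mask = hidden.new_empty(hidden.shape).bernoulli_(keep) / keep
        return hidden * self.mask  # identical mask at every time step


# Toy usage inside a hand-rolled recurrence (all sizes are arbitrary).
cell = nn.GRUCell(input_size=40, hidden_size=128)
drop = SharedMaskRNNDropout(p=0.25)
x = torch.randn(8, 100, 40)        # (batch, time, features)
h = torch.zeros(8, 128)
drop.mask = None                   # new mask for each new batch of sequences
for t in range(x.size(1)):
    h = cell(x[:, t], drop(h))     # mask sampled once, then reused across t
```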
“…We use Adam [14] for optimization with a learning rate of 1 × 10⁻³ and set β₁ = 0.9, β₂ = 0.99. RnnDrop [24, 7] is used in the recurrent layers to prevent overfitting.…”
Section: Network Architecture
confidence: 99%
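The optimizer settings quoted above map directly onto a standard Adam configuration; a minimal PyTorch sketch follows, where the model is only a placeholder and not the cited network architecture.

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the cited network architecture.
model = nn.Sequential(nn.Linear(40, 128), nn.ReLU(), nn.Linear(128, 61))

# Adam with learning rate 1e-3 and (beta_1, beta_2) = (0.9, 0.99),
# the hyperparameters quoted in the excerpt above.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.99))
```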
“…In addition to feed-forward layers, dropout can be applied to the convolutional or the recurrent layers. To preserve the spatial or temporal structure while dropping out random nodes, spatial dropout [20] and RnnDrop [21] were proposed for the convolutional and the recurrent layers, respectively. Several papers explain how dropout improves performance [10,13,14], assuming that dropout avoids the co-adaptation problem without questioning that assumption.…”
Section: Dropout
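To illustrate the distinction the excerpt draws, the sketch below contrasts element-wise dropout with spatial dropout, which drops entire feature maps (e.g. PyTorch's nn.Dropout2d); the tensor shapes and rates are arbitrary examples, not values taken from [20] or [21].

```python
import torch
import torch.nn as nn

x = torch.randn(8, 16, 32, 32)     # (batch, channels, height, width)

# Element-wise dropout: zeroes individual activations independently,
# which disrupts spatially correlated feature maps.
elementwise = nn.Dropout(p=0.2)

# Spatial dropout: zeroes whole channels (feature maps) per sample,
# preserving the spatial structure of the channels that are kept.
spatial = nn.Dropout2d(p=0.2)

elementwise.train()
spatial.train()
y1 = elementwise(x)                # scattered zeros throughout the tensor
y2 = spatial(x)                    # entire channels zeroed out
```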