2014
DOI: 10.48550/arxiv.1412.3555
Preprint

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

Junyoung Chung,
Caglar Gulcehre,
KyungHyun Cho
et al.

Abstract: In this paper we compare different types of recurrent units in recurrent neural networks (RNNs). Especially, we focus on more sophisticated units that implement a gating mechanism, such as a long short-term memory (LSTM) unit and a recently proposed gated recurrent unit (GRU). We evaluate these recurrent units on the tasks of polyphonic music modeling and speech signal modeling. Our experiments revealed that these advanced recurrent units are indeed better than more traditional recurrent units such as tanh units.
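The comparison described in the abstract can be reproduced in miniature by putting the three unit types behind a common interface. The sketch below is not the paper's code: the PyTorch modules, the 128-unit hidden size, the 88-dimensional inputs, and the MSE next-step objective are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's code): tanh RNN, LSTM, and
# GRU trained on the same generic next-step sequence-modeling objective.
import torch
import torch.nn as nn

class SequenceModel(nn.Module):
    def __init__(self, unit="gru", input_dim=88, hidden_dim=128):
        super().__init__()
        builders = {
            "tanh": lambda: nn.RNN(input_dim, hidden_dim, nonlinearity="tanh", batch_first=True),
            "lstm": lambda: nn.LSTM(input_dim, hidden_dim, batch_first=True),
            "gru": lambda: nn.GRU(input_dim, hidden_dim, batch_first=True),
        }
        self.rnn = builders[unit]()
        self.readout = nn.Linear(hidden_dim, input_dim)

    def forward(self, x):
        h, _ = self.rnn(x)         # (batch, time, hidden)
        return self.readout(h)     # one prediction per time step

# Train each variant with the same loss and compare held-out performance.
for unit in ("tanh", "lstm", "gru"):
    model = SequenceModel(unit)
    x = torch.randn(4, 50, 88)                        # dummy sequences
    pred = model(x[:, :-1])                           # predict x_{t+1} from x_{<=t}
    loss = nn.functional.mse_loss(pred, x[:, 1:])
    print(unit, float(loss))
```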

Cited by 2,270 publications (2,788 citation statements)
References 9 publications
“…VQ is applied using a codebook of 512 vectors of dimensionality 128, with the commitment loss defined as in (14). The aggregator g(·) is implemented as a two-layer gated recurrent neural network (GRU) [35] with 128 hidden channels. Hence, in our experiments, K = E. The InfoNCE loss is computed using 10 negative samples and k = 12 steps.…”
Section: B. Parameters of Proposed Methods
confidence: 99%
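A hedged sketch of the configuration this citation statement describes: a 512-entry codebook of 128-dimensional codes with a commitment term, a two-layer GRU aggregator with 128 hidden channels, and an InfoNCE loss over 10 negatives and 12 prediction steps. The class names, the straight-through estimator, and the exact wiring are assumptions for illustration, not the cited paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """512-entry codebook of 128-dim vectors with a commitment term (assumed form)."""
    def __init__(self, num_codes=512, dim=128, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.beta = beta

    def forward(self, z):                                   # z: (B, T, dim)
        d = (z.unsqueeze(2) - self.codebook.weight).pow(2).sum(-1)
        q = self.codebook(d.argmin(-1))                     # nearest codes
        commit = F.mse_loss(z, q.detach()) + self.beta * F.mse_loss(q, z.detach())
        return z + (q - z).detach(), commit                 # straight-through

class Aggregator(nn.Module):
    """Two-layer GRU with 128 hidden channels, plus one predictor per future step."""
    def __init__(self, dim=128, steps=12):
        super().__init__()
        self.gru = nn.GRU(dim, dim, num_layers=2, batch_first=True)
        self.heads = nn.ModuleList([nn.Linear(dim, dim) for _ in range(steps)])

    def forward(self, q):
        c, _ = self.gru(q)
        return c

def info_nce(c, z, heads, n_neg=10):
    """Score the true future frame z_{t+k} against n_neg randomly drawn negatives."""
    B, T, D = z.shape
    flat = z.reshape(B * T, D)
    loss = 0.0
    for k, head in enumerate(heads, start=1):
        pred = head(c[:, :T - k])                           # (B, T-k, D)
        pos = (pred * z[:, k:]).sum(-1, keepdim=True)       # positive scores
        idx = torch.randint(0, B * T, (B, T - k, n_neg))
        neg = (pred.unsqueeze(2) * flat[idx]).sum(-1)       # negative scores
        logits = torch.cat([pos, neg], dim=-1)
        target = torch.zeros(B, T - k, dtype=torch.long)
        loss = loss + F.cross_entropy(logits.reshape(-1, n_neg + 1), target.reshape(-1))
    return loss / len(heads)

# With z = (B, T, 128) frame features from some encoder (not shown):
# q, commit = VectorQuantizer()(z); agg = Aggregator(); c = agg(q)
# total = info_nce(c, q, agg.heads) + commit
```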
“…Previous methods always use an encoder-fusion-decoder paradigm, which first adopts two uni-modal encoders (e.g. ResNet [11] and GRU [4]) to extract image features E_I and language features E_L separately, and then designs a modality fusion module to fuse representations from different modalities to obtain the fused features F. In the end, F is fed into a decoder to generate the final segmentation prediction P. This paradigm can be formulated as three steps:…”
Section: Encoder-Decoder Pipeline
confidence: 99%
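The three-step paradigm this statement describes can be sketched as follows: a ResNet visual encoder and a GRU language encoder (step 1), a fusion module (step 2), and a decoder producing a mask (step 3). The channel sizes, concatenation-based fusion, and decoder head below are illustrative assumptions, not the cited paper's design.

```python
import torch
import torch.nn as nn
import torchvision

class EncoderFusionDecoder(nn.Module):
    def __init__(self, vocab_size=10000, text_dim=256):
        super().__init__()
        resnet = torchvision.models.resnet50(weights=None)
        self.visual = nn.Sequential(*list(resnet.children())[:-2])   # image encoder
        self.embed = nn.Embedding(vocab_size, text_dim)
        self.language = nn.GRU(text_dim, text_dim, batch_first=True) # language encoder
        self.fuse = nn.Conv2d(2048 + text_dim, 256, kernel_size=1)   # step 2: fusion
        self.decoder = nn.Sequential(                                # step 3: prediction P
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(),
            nn.Conv2d(256, 1, 1),
            nn.Upsample(scale_factor=32, mode="bilinear", align_corners=False))

    def forward(self, image, tokens):
        e_i = self.visual(image)                         # step 1a: image features E_I
        _, h = self.language(self.embed(tokens))         # step 1b: language features E_L
        e_l = h[-1][:, :, None, None].expand(-1, -1, *e_i.shape[2:])
        f = self.fuse(torch.cat([e_i, e_l], dim=1))      # fused features F
        return self.decoder(f)                           # segmentation logits P

# mask_logits = EncoderFusionDecoder()(torch.randn(2, 3, 320, 320),
#                                      torch.randint(0, 10000, (2, 12)))
```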
“…LSTM [47] captures the contextual representations of words with a short memory and has additional "forget" gates, thereby overcoming both the vanishing and exploding gradient problems. GRU [48] comprises a reset gate and an update gate, and handles the information flow like LSTM without a separate memory unit. TextCNN [49] obtains feature representations through 1-dim convolution.…”
Section: B. Time Sequence Modeling
confidence: 99%
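To make the contrast concrete, here is a minimal GRU cell with its reset and update gates written out and no separate memory cell (unlike the LSTM). The class name and tensor shapes are illustrative, but the update follows the standard GRU formulation used in the paper.

```python
import torch
import torch.nn as nn

class GRUCellSketch(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.W_r = nn.Linear(input_dim + hidden_dim, hidden_dim)  # reset gate
        self.W_z = nn.Linear(input_dim + hidden_dim, hidden_dim)  # update gate
        self.W_h = nn.Linear(input_dim + hidden_dim, hidden_dim)  # candidate state

    def forward(self, x_t, h_prev):
        xh = torch.cat([x_t, h_prev], dim=-1)
        r = torch.sigmoid(self.W_r(xh))                    # how much past to reset
        z = torch.sigmoid(self.W_z(xh))                    # how much to update
        h_tilde = torch.tanh(self.W_h(torch.cat([x_t, r * h_prev], dim=-1)))
        return (1 - z) * h_prev + z * h_tilde              # interpolate old and new

# h = GRUCellSketch(32, 64)(torch.randn(8, 32), torch.zeros(8, 64))
```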