Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation 2014
DOI: 10.3115/v1/w14-4012

On the Properties of Neural Machine Translation: Encoder–Decoder Approaches

Abstract: Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on analyzing the properties of the neural machine translation using two models: RNN Encoder–Decoder and a newly proposed gated recursive convolutional neural network.
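The encoder–decoder setup described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact architecture (the paper analyzes an RNN Encoder–Decoder and a gated recursive convolutional network); it is a hypothetical PyTorch module, with all names (EncoderDecoder, src_ids, tgt_ids) and dimensions chosen for illustration, showing an encoder compressing a variable-length input into one fixed-length vector and a decoder generating the output from that vector alone.

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Illustrative encoder-decoder: not the paper's exact models."""
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode: the final hidden state is the fixed-length summary of the source.
        _, summary = self.encoder(self.src_emb(src_ids))
        # Decode: target-side logits conditioned only on that summary.
        dec_states, _ = self.decoder(self.tgt_emb(tgt_ids), summary)
        return self.out(dec_states)
```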

Cited by 4,682 publications (2,514 citation statements)
References: 6 publications
“…Our model makes extensive use of RNN encoders to transform sequences into fixed-length vectors. For our purposes, an RNN encoder consists of GRU units (Cho et al., 2014) defined as…”
Section: Preliminaries and Notation
confidence: 99%
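The statement above breaks off where the GRU is defined; for reference, the standard gated recurrent unit of Cho et al. (2014) is sketched below. The notation (W, U for input and recurrent weights, ⊙ for the elementwise product) is one common convention and not necessarily the one used in the citing paper; biases are omitted.

```latex
% Standard GRU update (Cho et al., 2014); biases omitted.
% Some later formulations swap the roles of z_t and 1 - z_t.
\begin{aligned}
r_t &= \sigma\!\left(W_r x_t + U_r h_{t-1}\right) \\
z_t &= \sigma\!\left(W_z x_t + U_z h_{t-1}\right) \\
\tilde{h}_t &= \tanh\!\left(W x_t + U\,(r_t \odot h_{t-1})\right) \\
h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t
\end{aligned}
```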
“…The variations we tried include i) using only LSTM/CNN plus fully connected layers, and also the combination of these architectures with the initial LSTM's output for each word fed to a CNN, or vice versa; ii) using Simple RNN, Bidirectional LSTM (Schuster and Paliwal, 1997; Godin et al., 2015), or Gated Recurrent Units (GRU) (Cho et al., 2014) instead of LSTM; iii) using (global) max pooling versus (global) average pooling for CNNs.…”
Section: Approach 3: Sequence Modeling Using CNNs and LSTMs
confidence: 99%
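The last variation listed, global max pooling versus global average pooling over CNN feature maps, reduces to a one-line difference. The sketch below is illustrative only (assumed PyTorch; the batch size, channel count, and sequence length are arbitrary):

```python
import torch

# Hypothetical CNN feature maps over a token sequence: (batch, channels, length).
feats = torch.randn(8, 128, 40)

# Global max pooling: keep each channel's strongest activation over the sequence.
global_max = feats.max(dim=-1).values   # (8, 128)

# Global average pooling: keep each channel's mean activation over the sequence.
global_avg = feats.mean(dim=-1)         # (8, 128)
```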
“…We propose a Natural Language Question Generation (NLQG) model that first encodes the input sequence using some distributed representation and then decodes the output sequence from this encoded representation. Specifically, we use an RNN-based encoder and decoder recently proposed for language processing tasks by a number of groups (Cho et al., 2014; Sutskever et al., 2014). We now formally define the encoder and decoder models.…”
Section: RNN-Based Natural Language Question Generator
confidence: 99%
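Generation in such an encoder–decoder is typically done by feeding the decoder its own previous prediction. The greedy-decoding sketch below is a hypothetical illustration, not the NLQG model itself; it reuses the illustrative EncoderDecoder module sketched under the abstract, and the BOS/EOS token ids and maximum length are assumptions.

```python
import torch

@torch.no_grad()
def greedy_generate(model, src_ids, bos_id=1, eos_id=2, max_len=30):
    # Encode the input sequence into a fixed-length representation.
    _, state = model.encoder(model.src_emb(src_ids))
    # Start every sequence in the batch with the (assumed) BOS token.
    token = torch.full((src_ids.size(0), 1), bos_id, dtype=torch.long)
    generated = []
    for _ in range(max_len):
        # One decoder step conditioned on the previous token and hidden state.
        dec_out, state = model.decoder(model.tgt_emb(token), state)
        token = model.out(dec_out[:, -1]).argmax(dim=-1, keepdim=True)
        generated.append(token)
        if (token == eos_id).all():
            break
    return torch.cat(generated, dim=1)   # (batch, generated length)
```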