Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.3115/v1/d14-1179

Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation

Abstract: In this paper, we propose a novel neural network model called RNN Encoder–Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of a target sequence given a source sequence. The performance of a statistical machine translation system is empirically found to improve when the conditional probabilities of phrase pairs computed by the RNN Encoder–Decoder are used as an additional feature in the existing log-linear model. Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.
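The architecture the abstract describes can be sketched in a few lines. Below is a minimal illustration, assuming PyTorch; the class and variable names are hypothetical, not the authors' implementation, and the sketch uses `nn.GRU` in place of the paper's own gated hidden unit (the original model also feeds the summary vector to the decoder at every step, which is omitted here for brevity):

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Minimal RNN Encoder-Decoder sketch (names are illustrative)."""

    def __init__(self, src_vocab, tgt_vocab, emb=64, hidden=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the source sequence into a fixed-length summary vector c.
        _, c = self.encoder(self.src_emb(src))
        # Decode, initializing the decoder's hidden state with c.
        dec_states, _ = self.decoder(self.tgt_emb(tgt), c)
        return self.out(dec_states)  # per-step logits over the target vocabulary

# Joint training maximizes the conditional log-likelihood of target given source.
# (In practice the decoder input would be the target shifted by one position.)
model = EncoderDecoder(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (2, 7))   # batch of 2 source sequences
tgt = torch.randint(0, 1000, (2, 5))   # batch of 2 target sequences
logits = model(src, tgt)
loss = nn.functional.cross_entropy(logits.reshape(-1, 1000), tgt.reshape(-1))
loss.backward()
```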

Cited by 16,798 publications (10,072 citation statements) · References 13 publications
“…In order to train an NMT (Cho et al., 2014; Sutskever et al., 2014; Bahdanau et al., 2015) model for a language pair, the sizes of the source and target vocabularies must be fixed. But in reality, the vocabulary of a natural language is open.…”
Section: Introduction (mentioning)
confidence: 99%
“…Like Tilk et al., we use gated recurrent units (GRUs) [8] for the RNN layers. Introduced as a simpler variant of long short-term memory (LSTM) units [11], GRUs reduce computation by having fewer parameters.…”
Section: Our Model (mentioning)
confidence: 99%
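The parameter saving this excerpt mentions is easy to verify: a GRU has three gating/candidate weight blocks where an LSTM has four. A quick check, assuming PyTorch (the layer sizes are arbitrary):

```python
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

inp, hidden = 128, 256
gru = nn.GRU(inp, hidden)    # 3 blocks: reset gate, update gate, candidate state
lstm = nn.LSTM(inp, hidden)  # 4 blocks: input, forget, output gates + cell candidate
print(n_params(gru), n_params(lstm))  # 296448 vs 395264, an exact 3:4 ratio
```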
“…[Deep neural networks] favour underlying semantic and syntactic information in natural language texts and save researchers the effort of feature engineering [14,15]. Recently, they have achieved significant improvements in various natural language processing tasks, such as Machine Translation [2,3], Question Answering [14], Sentiment Analysis [6,11,15,18], etc. However, applying deep neural networks to target-specific Stance Detection has not been successful, as their performance has, up to now, been slightly worse than that of traditional machine learning algorithms with manual feature engineering, such as Support Vector Machines (SVMs) [8].…”
Section: Against (mentioning)
confidence: 99%