Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016
DOI: 10.18653/v1/p16-1009

Improving Neural Machine Translation Models with Monolingual Data

Abstract: Neural Machine Translation (NMT) has obtained state-of-the-art performance for several language pairs, while only using parallel data for training. Target-side monolingual data plays an important role in boosting fluency for phrase-based statistical machine translation, and we investigate the use of monolingual data for NMT. In contrast to previous work, which combines NMT models with separately trained language models, we note that encoder-decoder NMT architectures already have the capacity to learn the same information as a language model …

Cited by 1,885 publications (2,001 citation statements). References 19 publications.
“…Previous work combines NMT models with separately trained language models (Gülçehre et al 2015). Sennrich et al (2015) show that target-side monolingual data can greatly enhance the decoder model. They do not propose any changes in the network architecture, but rather pair monolingual data with automatic back-translations and treat it as additional training data.…”
Section: Monolingual Data
confidence: 99%
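
The back-translation procedure summarised in this statement lends itself to a short sketch. The following Python fragment is a minimal illustration, not the authors' implementation; translate_to_source is an assumed stand-in for a trained target-to-source NMT system.

    import random

    def build_synthetic_parallel(monolingual_target, translate_to_source):
        # Pair each human-written target sentence with an automatic
        # back-translation into the source language.
        return [(translate_to_source(tgt), tgt) for tgt in monolingual_target]

    def mix_training_data(parallel, synthetic, seed=42):
        # No change to the network architecture: synthetic pairs are simply
        # treated as additional parallel training data.
        combined = list(parallel) + list(synthetic)
        random.Random(seed).shuffle(combined)
        return combined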
“…To investigate the effectiveness of incorporating monolingual information with back-translation (Sennrich et al., 2016b), we continued training on top of the base system to build another system (labeled back-trans below) that has some exposure to the monolingual data. Due to time and hardware constraints, we only took a random sample of 2 million sentences from the news crawl 2016 monolingual corpus and 1.5 million sentences from the preprocessed CWMT Chinese monolingual corpus from our syntax-based system run, and back-translated them with our trained base system.…”
Section: Enhancements: Back-translation, Right-to-left Models, Ensembles
confidence: 99%
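
The sampling and back-translation step described above can be approximated as follows. The corpus paths, sample sizes, and base_system.translate() interface are assumptions for illustration, not the cited system's actual pipeline.

    import random

    def sample_lines(path, k, seed=1):
        # Draw a random sample of k sentences from a one-sentence-per-line corpus.
        with open(path, encoding="utf-8") as f:
            lines = [line.strip() for line in f if line.strip()]
        random.Random(seed).shuffle(lines)
        return lines[:k]

    def back_translate(target_sentences, base_system):
        # base_system is assumed to expose a translate() method for the reverse
        # (target-to-source) direction used to create synthetic source sides.
        return [(base_system.translate(t), t) for t in target_sentences]

    # Hypothetical usage, mirroring the sample sizes quoted in the statement:
    # news_sample = sample_lines("newscrawl2016.txt", 2_000_000)
    # cwmt_sample = sample_lines("cwmt_mono.txt", 1_500_000)
    # synthetic = back_translate(news_sample + cwmt_sample, base_system)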
“…Researchers such as Gülçehre et al. (2015) proposed to incorporate target-side monolingual corpora as a language model for NMT [16]. Sennrich et al. (2016) pair the target-side monolingual corpora with their corresponding back-translations and then merge them with the parallel data to retrain the source-target model [25].…”
Section: Related Work
confidence: 99%
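
For contrast, the language-model integration attributed to Gülçehre et al. (2015) in this statement is commonly realised as shallow fusion: per-token log-probabilities from the NMT decoder and a separately trained target-side language model are interpolated at decoding time. The sketch below is illustrative only; the weight lam and the dict-based interface are assumptions, not the cited implementation.

    def fused_next_token(nmt_logprobs, lm_logprobs, lam=0.3):
        # nmt_logprobs / lm_logprobs: dicts mapping candidate tokens to
        # log-probabilities from the NMT decoder and the external LM.
        # The fused score for a token y is log p_NMT(y) + lam * log p_LM(y).
        return max(
            nmt_logprobs,
            key=lambda tok: nmt_logprobs[tok] + lam * lm_logprobs.get(tok, float("-inf")),
        )

    # Back-translation (Sennrich et al., 2016), by contrast, leaves decoding and
    # the architecture untouched and augments the training data instead, as
    # sketched earlier in this section.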