Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016
DOI: 10.18653/v1/p16-1009

Improving Neural Machine Translation Models with Monolingual Data

Abstract: Neural Machine Translation (NMT) has obtained state-of-the-art performance for several language pairs, while only using parallel data for training. Target-side monolingual data plays an important role in boosting fluency for phrase-based statistical machine translation, and we investigate the use of monolingual data for NMT. In contrast to previous work, which combines NMT models with separately trained language models, we note that encoder-decoder NMT architectures already have the capacity to learn the same information as a language model …

Cited by 1,885 publications (2,001 citation statements). References 19 publications.
“…Previous work combines NMT models with separately trained language models (Gülçehre et al 2015). Sennrich et al (2015) show that target-side monolingual data can greatly enhance the decoder model. They do not propose any changes in the network architecture, but rather pair monolingual data with automatic back-translations and treat it as additional training data.…”
Section: Monolingual Data
confidence: 99%
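
The back-translation procedure summarised in this statement lends itself to a short sketch. The following Python fragment is a minimal illustration, not the authors' implementation; translate_to_source is an assumed stand-in for a trained target-to-source NMT system.

    import random

    def build_synthetic_parallel(monolingual_target, translate_to_source):
        # Pair each human-written target sentence with an automatic
        # back-translation into the source language.
        return [(translate_to_source(tgt), tgt) for tgt in monolingual_target]

    def mix_training_data(parallel, synthetic, seed=42):
        # No change to the network architecture: synthetic pairs are simply
        # treated as additional parallel training data.
        combined = list(parallel) + list(synthetic)
        random.Random(seed).shuffle(combined)
        return combined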
“…To investigate the effectiveness of incorporating monolingual information with back-translation (Sennrich et al., 2016b), we continued training on top of the base system to build another system (labeled back-trans below) that has some exposure to the monolingual data. Due to time and hardware constraints, we only took a random sample of 2 million sentences from the news crawl 2016 monolingual corpus and 1.5 million sentences from the preprocessed CWMT Chinese monolingual corpus from our syntax-based system run, and back-translated them with our trained base system.…”
Section: Enhancements: Back-translation, Right-to-left Models, Ensembles
confidence: 99%
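
The sampling and back-translation step described above can be approximated as follows. The corpus paths, sample sizes, and base_system.translate() interface are assumptions for illustration, not the cited system's actual pipeline.

    import random

    def sample_lines(path, k, seed=1):
        # Draw a random sample of k sentences from a one-sentence-per-line corpus.
        with open(path, encoding="utf-8") as f:
            lines = [line.strip() for line in f if line.strip()]
        random.Random(seed).shuffle(lines)
        return lines[:k]

    def back_translate(target_sentences, base_system):
        # base_system is assumed to expose a translate() method for the reverse
        # (target-to-source) direction used to create synthetic source sides.
        return [(base_system.translate(t), t) for t in target_sentences]

    # Hypothetical usage, mirroring the sample sizes quoted in the statement:
    # news_sample = sample_lines("newscrawl2016.txt", 2_000_000)
    # cwmt_sample = sample_lines("cwmt_mono.txt", 1_500_000)
    # synthetic = back_translate(news_sample + cwmt_sample, base_system)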
“…Researchers such as Gülçehre et al. (2015) proposed to incorporate target-side monolingual corpora as a language model for NMT [16]. Sennrich et al. (2016) pair the target-side monolingual corpora with their corresponding back-translations and then merge them with the parallel data to retrain the source-target model [25].…”
Section: Related Work
confidence: 99%
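
For contrast, the language-model integration attributed to Gülçehre et al. (2015) in this statement is commonly realised as shallow fusion: per-token log-probabilities from the NMT decoder and a separately trained target-side language model are interpolated at decoding time. The sketch below is illustrative only; the weight lam and the dict-based interface are assumptions, not the cited implementation.

    def fused_next_token(nmt_logprobs, lm_logprobs, lam=0.3):
        # nmt_logprobs / lm_logprobs: dicts mapping candidate tokens to
        # log-probabilities from the NMT decoder and the external LM.
        # The fused score for a token y is log p_NMT(y) + lam * log p_LM(y).
        return max(
            nmt_logprobs,
            key=lambda tok: nmt_logprobs[tok] + lam * lm_logprobs.get(tok, float("-inf")),
        )

    # Back-translation (Sennrich et al., 2016), by contrast, leaves decoding and
    # the architecture untouched and augments the training data instead, as
    # sketched earlier in this section.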