Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1158

Using Target-side Monolingual Data for Neural Machine Translation through Multi-task Learning

Abstract: The performance of Neural Machine Translation (NMT) models relies heavily on the availability of sufficient amounts of parallel data, and an efficient and effective way of leveraging the vastly available amounts of monolingual data has yet to be found. We propose to modify the decoder in a neural sequence-to-sequence model to enable multi-task learning for two strongly related tasks: target-side language modeling and translation. The decoder predicts the next target word through two channels, a target-side lan…
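As a rough illustration of the two-channel prediction described in the abstract, the following minimal PyTorch sketch gives the decoder a language-model channel (conditioned only on the target prefix, so it can also train on monolingual batches) and a translation channel (conditioned additionally on source context). All layer names, shapes, and the concatenation-based fusion are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TwoChannelDecoder(nn.Module):
    """Hypothetical sketch: next-word prediction combines a target-side
    language-model channel (no source context) with a translation channel
    (which also sees an encoded source representation)."""

    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # LM channel: conditions only on the target prefix.
        self.lm_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Translation channel: also consumes a source context vector.
        self.mt_rnn = nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(2 * hid_dim, vocab_size)

    def forward(self, prev_words, src_context=None):
        emb = self.embed(prev_words)                      # (B, T, E)
        lm_h, _ = self.lm_rnn(emb)                        # LM channel states
        if src_context is None:
            # Monolingual batch: only the LM channel carries signal.
            mt_h = torch.zeros_like(lm_h)
        else:
            # src_context assumed shape (B, H); broadcast over time steps.
            ctx = src_context.unsqueeze(1).expand(-1, emb.size(1), -1)
            mt_h, _ = self.mt_rnn(torch.cat([emb, ctx], dim=-1))
        return self.out(torch.cat([lm_h, mt_h], dim=-1))  # next-word logits
```

Training would then alternate parallel batches (both channels active) with target-side monolingual batches (LM channel only), sharing the embedding and output layers across the two tasks.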

Cited by 71 publications (71 citation statements)
References 11 publications (12 reference statements)
“…One approach of using target monolingual corpora is to construct a recurrent neural network language model and combine the model with the decoder (Gülçehre et al., 2015; Sriram et al., 2017). Similarly, there is a method of training language models jointly with the translator using multi-task learning (Domhan and Hieber, 2017). Another approach of using monolingual corpora of the target language is to learn models using synthetic parallel sentences. The method of Sennrich et al. (2016a) generates synthetic parallel corpora through back-translation and learns models from such corpora.…”
Section: Related Work
confidence: 99%
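The back-translation recipe mentioned in this excerpt can be summarized in a few lines. The sketch below is a hedged illustration, not Sennrich et al.'s implementation; `reverse_model.translate` is a hypothetical API standing in for any target-to-source translation system.

```python
# Minimal sketch of back-translation: a reverse (target -> source) model
# translates target-side monolingual sentences to produce synthetic source
# sides, yielding extra (source, target) training pairs.

def back_translate(target_monolingual, reverse_model):
    """Build synthetic (source, target) pairs from target-side text."""
    synthetic_pairs = []
    for tgt_sentence in target_monolingual:
        synthetic_src = reverse_model.translate(tgt_sentence)  # tgt -> src
        synthetic_pairs.append((synthetic_src, tgt_sentence))
    return synthetic_pairs

# The forward model is then trained on real plus synthetic data, e.g.:
# train(forward_model, real_pairs + back_translate(mono_tgt, reverse_model))
```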
“…However, training with respect to the new loss is often computationally intensive and requires approximations. Alternatively, multi-task learning has been used to incorporate source-side (Zhang and Zong, 2016) and target-side (Domhan and Hieber, 2017) monolingual data. Another way of utilizing monolingual data in both the source and target languages is to warm-start Seq2Seq training from pre-trained encoder and decoder networks (Ramachandran et al., 2017; Skorokhodov et al., 2018).…”
Section: Other Approaches
confidence: 99%
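The warm-starting idea cited in this excerpt can be pictured with a short, assumption-laden sketch: weights from language models pre-trained on source- and target-side monolingual data initialize the encoder and decoder before fine-tuning on parallel data. File paths, attribute names, and the use of non-strict weight loading are illustrative guesses, not details from the cited papers.

```python
import torch

def warm_start(seq2seq, src_lm_path, tgt_lm_path):
    """Initialize encoder/decoder from pre-trained LM checkpoints."""
    src_lm_state = torch.load(src_lm_path)   # source-side LM weights
    tgt_lm_state = torch.load(tgt_lm_path)   # target-side LM weights
    # strict=False: only parameters with matching names and shapes are
    # copied, e.g. embeddings and recurrent layers shared with the LMs.
    seq2seq.encoder.load_state_dict(src_lm_state, strict=False)
    seq2seq.decoder.load_state_dict(tgt_lm_state, strict=False)
    return seq2seq  # fine-tune on parallel data from this initialization
```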
“…Exploiting monolingual data for NMT. Monolingual data play a key role in neural machine translation systems; previous work has considered training a separate language model on the target side (Jean et al., 2014; Gulcehre et al., 2015; Domhan and Hieber, 2017). Rather than using an explicit language model, Cheng et al. (2016) introduced an auto-encoder-based approach, in which the source-to-target and target-to-source translation models act as encoder and decoder, respectively.…”
Section: Related Work
confidence: 99%
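For concreteness, one common way to combine a separate target-side language model with the decoder, shallow fusion, interpolates the translation model's next-word log-probabilities with the language model's at each decoding step. The sketch below is a minimal illustration under that assumption; `lm_weight` and the tensor shapes are invented for the example, not taken from the cited papers.

```python
def fused_next_word_scores(mt_log_probs, lm_log_probs, lm_weight=0.2):
    """Shallow fusion: both inputs are (batch, vocab) log-probabilities
    over the next target word; lm_weight scales the LM's contribution."""
    return mt_log_probs + lm_weight * lm_log_probs

# At each decoding step, beam search ranks candidate words by these fused
# scores instead of the translation model's scores alone.
```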