Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1158

Using Target-side Monolingual Data for Neural Machine Translation through Multi-task Learning

Abstract: The performance of Neural Machine Translation (NMT) models relies heavily on the availability of sufficient amounts of parallel data, and an efficient and effective way of leveraging the vastly available amounts of monolingual data has yet to be found. We propose to modify the decoder in a neural sequence-to-sequence model to enable multi-task learning for two strongly related tasks: target-side language modeling and translation. The decoder predicts the next target word through two channels, a target-side lan…
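As a rough illustration of the two-channel prediction described in the abstract, the following minimal PyTorch sketch gives the decoder a language-model channel (conditioned only on the target prefix, so it can also train on monolingual batches) and a translation channel (conditioned additionally on source context). All layer names, shapes, and the concatenation-based fusion are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TwoChannelDecoder(nn.Module):
    """Hypothetical sketch: next-word prediction combines a target-side
    language-model channel (no source context) with a translation channel
    (which also sees an encoded source representation)."""

    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # LM channel: conditions only on the target prefix.
        self.lm_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Translation channel: also consumes a source context vector.
        self.mt_rnn = nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(2 * hid_dim, vocab_size)

    def forward(self, prev_words, src_context=None):
        emb = self.embed(prev_words)                      # (B, T, E)
        lm_h, _ = self.lm_rnn(emb)                        # LM channel states
        if src_context is None:
            # Monolingual batch: only the LM channel carries signal.
            mt_h = torch.zeros_like(lm_h)
        else:
            # src_context assumed shape (B, H); broadcast over time steps.
            ctx = src_context.unsqueeze(1).expand(-1, emb.size(1), -1)
            mt_h, _ = self.mt_rnn(torch.cat([emb, ctx], dim=-1))
        return self.out(torch.cat([lm_h, mt_h], dim=-1))  # next-word logits
```

Training would then alternate parallel batches (both channels active) with target-side monolingual batches (LM channel only), sharing the embedding and output layers across the two tasks.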

Cited by 71 publications (71 citation statements)
References 11 publications (12 reference statements)
“…One approach of using target monolingual corpora is to construct a recurrent neural network language model and combine the model with the decoder (Gülçehre et al., 2015; Sriram et al., 2017). Similarly, there is a method of training language models jointly with the translator using multi-task learning (Domhan and Hieber, 2017). Another approach of using monolingual corpora of the target language is to learn models using synthetic parallel sentences. The method of Sennrich et al. (2016a) generates synthetic parallel corpora through back-translation and learns models from such corpora.…”
Section: Related Work
confidence: 99%
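The back-translation recipe mentioned in this excerpt can be summarized in a few lines. The sketch below is a hedged illustration, not Sennrich et al.'s implementation; `reverse_model.translate` is a hypothetical API standing in for any target-to-source translation system.

```python
# Minimal sketch of back-translation: a reverse (target -> source) model
# translates target-side monolingual sentences to produce synthetic source
# sides, yielding extra (source, target) training pairs.

def back_translate(target_monolingual, reverse_model):
    """Build synthetic (source, target) pairs from target-side text."""
    synthetic_pairs = []
    for tgt_sentence in target_monolingual:
        synthetic_src = reverse_model.translate(tgt_sentence)  # tgt -> src
        synthetic_pairs.append((synthetic_src, tgt_sentence))
    return synthetic_pairs

# The forward model is then trained on real plus synthetic data, e.g.:
# train(forward_model, real_pairs + back_translate(mono_tgt, reverse_model))
```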
“…However, training with respect to the new loss is often computationally intensive and requires approximations. Alternatively, multi-task learning has been used to incorporate source-side (Zhang and Zong, 2016) and target-side (Domhan and Hieber, 2017) monolingual data. Another way of utilizing monolingual data in both the source and target languages is to warm-start Seq2Seq training from pre-trained encoder and decoder networks (Ramachandran et al., 2017; Skorokhodov et al., 2018).…”
Section: Other Approaches
confidence: 99%
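The warm-starting idea cited in this excerpt can be pictured with a short, assumption-laden sketch: weights from language models pre-trained on source- and target-side monolingual data initialize the encoder and decoder before fine-tuning on parallel data. File paths, attribute names, and the use of non-strict weight loading are illustrative guesses, not details from the cited papers.

```python
import torch

def warm_start(seq2seq, src_lm_path, tgt_lm_path):
    """Initialize encoder/decoder from pre-trained LM checkpoints."""
    src_lm_state = torch.load(src_lm_path)   # source-side LM weights
    tgt_lm_state = torch.load(tgt_lm_path)   # target-side LM weights
    # strict=False: only parameters with matching names and shapes are
    # copied, e.g. embeddings and recurrent layers shared with the LMs.
    seq2seq.encoder.load_state_dict(src_lm_state, strict=False)
    seq2seq.decoder.load_state_dict(tgt_lm_state, strict=False)
    return seq2seq  # fine-tune on parallel data from this initialization
```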
“…Exploiting monolingual data for NMT. Monolingual data play a key role in neural machine translation systems; previous work has considered training a separate language model on the target side (Jean et al., 2014; Gulcehre et al., 2015; Domhan and Hieber, 2017). Rather than using an explicit language model, Cheng et al. (2016) introduced an auto-encoder-based approach, in which the source-to-target and target-to-source translation models act as encoder and decoder, respectively.…”
Section: Related Work
confidence: 99%
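For concreteness, one common way to combine a separate target-side language model with the decoder, shallow fusion, interpolates the translation model's next-word log-probabilities with the language model's at each decoding step. The sketch below is a minimal illustration under that assumption; `lm_weight` and the tensor shapes are invented for the example, not taken from the cited papers.

```python
def fused_next_word_scores(mt_log_probs, lm_log_probs, lm_weight=0.2):
    """Shallow fusion: both inputs are (batch, vocab) log-probabilities
    over the next target word; lm_weight scales the LM's contribution."""
    return mt_log_probs + lm_weight * lm_log_probs

# At each decoding step, beam search ranks candidate words by these fused
# scores instead of the translation model's scores alone.
```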