Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 2015
DOI: 10.3115/v1/p15-1166
Multi-Task Learning for Multiple Language Translation

Abstract: In this paper, we investigate the problem of learning a machine translation model that can simultaneously translate sentences from one source language to multiple target languages. Our solution is inspired by the recently proposed neural machine translation model which generalizes machine translation as a sequence learning problem. We extend the neural machine translation to a multi-task learning framework which shares source language representation and separates the modeling of different target language translation…
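The abstract describes a one-to-many setup: a single encoder shared across all language pairs, with a separate decoder per target language. Below is a minimal sketch of that idea in PyTorch; it is not the authors' implementation, and the module names, dimensions, and GRU-based encoder/decoder choice are illustrative assumptions (any attention mechanism is omitted for brevity).

```python
# Sketch of one-to-many multi-task NMT: one shared source encoder,
# one decoder per target language. All names/sizes are assumptions.
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    def __init__(self, src_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(src_vocab, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src_ids):
        # src_ids: (batch, src_len) -> final hidden state (1, batch, hid_dim)
        _, hidden = self.rnn(self.embed(src_ids))
        return hidden

class Decoder(nn.Module):
    def __init__(self, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(tgt_vocab, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, tgt_ids, hidden):
        # Condition on the shared encoder state, predict next-token logits.
        output, _ = self.rnn(self.embed(tgt_ids), hidden)
        return self.out(output)  # (batch, tgt_len, tgt_vocab)

class MultiTargetNMT(nn.Module):
    def __init__(self, src_vocab, tgt_vocabs):
        super().__init__()
        self.encoder = SharedEncoder(src_vocab)          # shared across all pairs
        self.decoders = nn.ModuleDict({                  # one decoder per target language
            lang: Decoder(v) for lang, v in tgt_vocabs.items()
        })

    def forward(self, src_ids, tgt_ids, lang):
        return self.decoders[lang](tgt_ids, self.encoder(src_ids))

# Toy usage: an English encoder feeding French and Spanish decoders.
model = MultiTargetNMT(src_vocab=1000, tgt_vocabs={"fr": 1200, "es": 1100})
src = torch.randint(0, 1000, (4, 7))
tgt = torch.randint(0, 1200, (4, 9))
logits = model(src, tgt, lang="fr")                      # (4, 9, 1200)
loss = nn.functional.cross_entropy(logits.reshape(-1, 1200), tgt.reshape(-1))
```

During training, mini-batches from different bilingual corpora would update the shared encoder jointly while each decoder sees only its own target language.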

Cited by 509 publications (445 citation statements)
References 11 publications
“…Multi-task learning was shown to be effective for a variety of NLP tasks, such as POS tagging, chunking, named entity recognition (Collobert et al., 2011) or sentence compression (Klerke et al., 2016). It has also been used in encoder-decoder architectures, typically for machine translation (Dong et al., 2015; Luong et al., 2016), though so far not with attentional decoders.…”
Section: Related Work
confidence: 99%
“…Furthermore, their method requires training an additional NMT model from the target language to the source language, which may negatively influence the attention model in the decoder network. Dong et al. (2015) propose a multi-task learning method for translating one source language into multiple target languages in NMT, so that the encoder network can be shared when dealing with several sets of bilingual data. …, and Firat et al. (2016) further deal with more complicated cases (e.g.…”
Section: Related Work
confidence: 99%
“…In its simplest form, our model exploits a one-to-one NMT architecture: the source English sentence is translated into k candidate foreign sentences and then back-translated into English. Inspired by multi-way machine translation, which has shown performance gains over single-pair models (Zoph and Knight, 2016; Dong et al., 2015; Firat et al., 2016a), we also explore an alternative pivoting technique which uses multiple languages rather than a single one. Our model inherits advantages from NMT such as a small memory footprint and conceptually easy decoding (implemented as beam search).…”
Section: Related Work
confidence: 99%
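The last excerpt describes a round-trip pivoting scheme: translate the English source into k candidate foreign sentences with beam search, then back-translate them into English. The sketch below illustrates that general idea with off-the-shelf MarianMT checkpoints from the transformers library; the model names, the choice of German as pivot, and the decoding parameters are assumptions for demonstration, not the cited authors' multi-way system.

```python
# Round-trip pivoting sketch (illustrative, not the cited system):
# English -> k German candidates via beam search -> back to English.
from transformers import MarianMTModel, MarianTokenizer

def round_trip(sentence, k=3):
    fwd_name, bwd_name = "Helsinki-NLP/opus-mt-en-de", "Helsinki-NLP/opus-mt-de-en"
    fwd_tok = MarianTokenizer.from_pretrained(fwd_name)
    fwd = MarianMTModel.from_pretrained(fwd_name)
    bwd_tok = MarianTokenizer.from_pretrained(bwd_name)
    bwd = MarianMTModel.from_pretrained(bwd_name)

    # English -> k German candidates (beam search keeps k best hypotheses).
    enc = fwd_tok([sentence], return_tensors="pt", padding=True)
    de_ids = fwd.generate(**enc, num_beams=max(k, 4), num_return_sequences=k)
    de_cands = fwd_tok.batch_decode(de_ids, skip_special_tokens=True)

    # German candidates -> back-translated English paraphrases.
    enc_back = bwd_tok(de_cands, return_tensors="pt", padding=True)
    en_ids = bwd.generate(**enc_back, num_beams=4)
    return bwd_tok.batch_decode(en_ids, skip_special_tokens=True)

print(round_trip("The committee approved the proposal yesterday."))
```

Using several pivot languages instead of one, as the excerpt suggests, would amount to repeating the forward/backward step with additional language pairs and pooling the back-translated candidates.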