2016
DOI: 10.48550/arxiv.1611.04558
Preprint

Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation

Cited by 73 publications (146 citation statements)
References 0 publications
“…2 The obvious difficulty in creating such an interface is data scarcity in the languages in question. In order to overcome these barriers, we plan to take advantage of recent advances in NLP that allow for multilingual modeling (Täckström et al, 2012;Johnson et al, 2016) and multi-task learning (Caruana, 1997), which allow models to be trained with very little, or even no data in the target language (Neubig and Hu, 2018). We also plan to utilize active learning (Settles, 2009), which specifically asks the linguists to focus on particular examples to maximize the effect of linguists' limited time when working with field data.…”
Section: Overall Framework
confidence: 99%
“…Attention describes the tendency of visual processing to be confined largely to stimuli that are relevant to behavior (addressing the data efficiency). This topic has become an active research area in image captioning (Xu et al, 2015), image generation (Gregor et al, 2015), VQA (Xiong et al, 2016), machine translation (Johnson et al, 2016b), and speech recognition (Chorowski et al, 2015). Specifically, Gregor et al (2015) began the early work in small sample learning with the deep recurrent attentive writer (DRAW) neural network architecture for image generation, where attention helped the system to build up an image incrementally, attending to one portion of a "mental canvas" at a time.…”
Section: Knowledge-driven Small Sample Learning
confidence: 99%
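For context on the attention mechanism this statement surveys, the following is a minimal sketch of one common form, additive (Bahdanau-style) attention, as used in neural machine translation. Variable names, dimensions, and the helper function are illustrative assumptions, not taken from any of the cited papers.

```python
# Minimal sketch of an additive (Bahdanau-style) attention step:
# score each encoder state against the current decoder state, softmax the
# scores over source positions, and take a weighted sum as the context vector.
# All names and sizes below are illustrative.
import numpy as np

def additive_attention(decoder_state, encoder_states, W_dec, W_enc, v):
    """Return an attention-weighted context vector and the attention weights."""
    # scores[t] = v . tanh(W_dec @ s + W_enc @ h_t) for each encoder state h_t
    scores = np.array([
        v @ np.tanh(W_dec @ decoder_state + W_enc @ h_t)
        for h_t in encoder_states
    ])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()            # softmax over source positions
    context = weights @ encoder_states  # weighted sum of encoder states
    return context, weights

# Toy usage: 5 source positions, hidden size 4, attention size 3.
rng = np.random.default_rng(0)
enc = rng.normal(size=(5, 4))
dec = rng.normal(size=4)
W_dec, W_enc, v = rng.normal(size=(3, 4)), rng.normal(size=(3, 4)), rng.normal(size=3)
ctx, attn = additive_attention(dec, enc, W_dec, W_enc, v)
print(attn, ctx)
```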
“…The only difference from the standard encoder-decoder architecture with an attention mechanism (Bahdanau et al, 2015) is that in encoding, we concatenate u_{i−1} and u_{i−2}, and attach a_i to the top of the long sentence as a special word. The technique here is similar to that in zero-shot machine translation (Johnson et al, 2016). Formulation details are given in the Appendix.…”
Section: Supervised Learning
confidence: 99%
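The "special word" trick referred to here is the artificial target-language token from the cited paper (Johnson et al, 2016): a single model is trained on many language pairs by prepending a token naming the desired target language to each source sentence, which is what enables zero-shot translation. Below is a minimal sketch of that preprocessing step; the helper name and token format are illustrative assumptions, not the exact strings used in the paper.

```python
# Sketch of the target-language tagging trick from Johnson et al. (2016).
# Token format "<2xx>" and the helper name are illustrative.

def add_target_language_token(source_sentence: str, target_lang: str) -> str:
    """Prepend a target-language token so one model can serve many language pairs."""
    return f"<2{target_lang}> {source_sentence}"

# Training pairs for different directions share one model and one vocabulary.
examples = [
    (add_target_language_token("How are you?", "es"), "¿Cómo estás?"),
    (add_target_language_token("How are you?", "ja"), "お元気ですか"),
]

# Zero-shot translation: at test time the token can request a direction never
# observed in training (e.g. Spanish -> Japanese), relying on the shared encoder.
zero_shot_input = add_target_language_token("¿Cómo estás?", "ja")
print(zero_shot_input)  # "<2ja> ¿Cómo estás?"
```

The appeal of this technique, and presumably why the citing paper borrows it, is that it requires no architectural change: the target-language (or, in the citing paper, the a_i) signal is just another token that passes through the shared encoder.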