2017
DOI: 10.1162/tacl_a_00065

Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation

Abstract: We propose a simple solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages. Our solution requires no changes to the model architecture from a standard NMT system but instead introduces an artificial token at the beginning of the input sentence to specify the required target language. The rest of the model, which includes an encoder, decoder and attention module, remains unchanged and is shared across all languages. Using a shared wordpiece vocabulary, our approa…
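The preprocessing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `add_target_token` and the `<2xx>` token spelling are assumptions made for illustration. The only idea taken from the abstract is that an artificial token naming the target language is placed at the start of the source sentence, while the model itself is left unchanged.

```python
def add_target_token(source_sentence: str, target_lang: str) -> str:
    """Prepend an artificial token that names the desired target language.

    The "<2xx>" spelling is an illustrative assumption; any reserved symbol
    that cannot collide with ordinary vocabulary would serve the same role.
    """
    return f"<2{target_lang}> {source_sentence}"


# Hypothetical training inputs: the same shared model sees every language
# pair, and only the leading token tells it which language to produce.
examples = [
    ("Hello, how are you?", "es"),   # English source, Spanish target
    ("Hola, ¿cómo estás?", "en"),    # Spanish source, English target
]

tagged_inputs = [add_target_token(text, lang) for text, lang in examples]

for line in tagged_inputs:
    print(line)
```

Because the target language is carried entirely by the input text, requesting a pair that never appeared in training only requires a different token, which is the setting the title calls zero-shot translation.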

Citations: cited by 1,431 publications (1,765 citation statements)
References: 21 publications
“…Researchers have also developed interactive visual analysis tools for latent spaces [STN∗16, JSL∗17, HG18, LBT∗18, LNH∗18]. Some tools focus on a subset of tasks [STN∗16, LBT∗18] in word embeddings, which we extend and bring to a broader range of latent spaces.…”
Section: Related Work (mentioning)
confidence: 99%
“…[57][58][59] Deep learning techniques have recently shown promises in solving complex sequence-to-sequence translation problems in natural languages. [60] Training deep learners to infer sequences of user intentions based on sequences of machine events is an extremely interesting direction. • DLPD as a Cloud Service: The advent of cloud computing offers a new option for conducting data leak detection.…”
Section: Further Research Opportunities (mentioning)
confidence: 99%
“…Luong et al (2015a); Firat et al (2016a) proposed to use O(k) encoders/decoders that are then intermixed to translate between language pairs. Johnson et al (2016) proposed to use a single model and prepend special symbols to the source text to indicate the target language, which has later been extended to other text preprocessing approaches (Ha et al, 2017) as well as language-conditional parameter generation for encoders and decoders of a single model (Platanios et al, 2018). Johnson et al (2016) also show that a single multilingual system could potentially enable zero-shot translation, i.e., it can translate between language pairs not seen in training.…”
Section: Introduction (mentioning)
confidence: 99%