Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers) 2019
DOI: 10.18653/v1/w19-5205
Generalizing Back-Translation in Neural Machine Translation

Abstract: Back-translation, that is, data augmentation by translating target monolingual data, is a crucial component in modern neural machine translation (NMT). In this work, we reformulate back-translation in the scope of cross-entropy optimization of an NMT model, clarifying its underlying mathematical assumptions and approximations beyond its heuristic usage. Our formulation covers broader synthetic data generation schemes, including sampling from a target-to-source NMT model. With this formulation, we point out fundamental p…
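The abstract describes back-translation as pairing synthetic source sentences (produced by a target-to-source model) with real target monolingual data. A minimal sketch of this augmentation scheme, using a hypothetical toy stand-in for the reverse translation model (the function names and the toy data here are illustrative assumptions, not from the paper):

```python
def back_translate(target_mono, t2s_translate, authentic_bitext):
    """Augment authentic parallel data with synthetic pairs.

    t2s_translate: a target-to-source translation function, here a
    hypothetical stand-in for a trained reverse NMT model.
    """
    # Each synthetic source is paired with a *real* target sentence,
    # so the forward model still learns to produce clean target text.
    synthetic = [(t2s_translate(t), t) for t in target_mono]
    return authentic_bitext + synthetic

# Toy stand-in "model": reverse the word order to mimic a t2s system.
toy_t2s = lambda sent: " ".join(reversed(sent.split()))

bitext = [("das haus", "the house")]          # authentic pair
mono = ["the green house"]                    # target monolingual data
augmented = back_translate(mono, toy_t2s, bitext)
```

In practice `t2s_translate` would be a full target-to-source NMT decoder, and — as the paper's formulation emphasizes — the choice of generation scheme (beam search vs. sampling) changes the distribution of the synthetic source side.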


Cited by 27 publications (22 citation statements)
References 15 publications
“…Back-translation is simple and easy to achieve without modifying the architecture of the machine translation models. Back-translation has been studied in both SMT [111-113] and NMT [23, 80, 110, 114-116].…”
Section: Pivot Translation
confidence: 99%
“…Various studies have investigated back-translation to improve the backward model, to select the most suitable generation/decoding methods for producing the synthetic data, and to reduce the impact of a high ratio of synthetic to authentic bitext. The quality of models trained using back-translation depends on the quality of the backward model [16, 17, 22, 5, 19, 29, 52]. To improve the quality of the synthetic parallel data, [22] used iterative back-translation, iteratively using the back-translated data to improve both the backward and forward models.…”
Section: Leveraging Monolingual Data for NMT
confidence: 99%
“…While a separate target-to-source back-translation model is commonly used, how exactly the artificial source-side texts should be generated largely remains an open research question. Comparisons of sampling vs. search have already been made in several works (Imamura et al., 2018; Graça et al., 2019), with Wang et al. (2019) most recently introducing uncertainty-based confidence estimation as an alternative.…”
Section: Related Work
confidence: 99%
“…To build a competitive neural translation model, back-translation is a commonly used method (Bojar et al., 2018). Following recent studies, sampling from the back-translation model instead of using beam search is a popular alternative for obtaining the synthetic source side (Imamura et al., 2018; Graça et al., 2019).…”
Section: Smoothing in Back-translation
confidence: 99%
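The sampling-vs-beam-search distinction cited above comes down to how the next token is chosen when decoding the synthetic source. A toy sketch of the difference (the token distribution here is an invented illustration, not output of any real model):

```python
import random

# Toy next-token distribution from a hypothetical t2s decoding step.
probs = {"la": 0.6, "une": 0.3, "cette": 0.1}

def greedy_pick(dist):
    # Beam search with beam size 1 (greedy): always the modal token,
    # so the synthetic data concentrates on high-probability outputs.
    return max(dist, key=dist.get)

def sample_pick(dist, rng):
    # Sampling: draw from the full distribution, so lower-probability
    # tokens also appear in the synthetic data, adding diversity.
    tokens, weights = zip(*dist.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)
greedy_tokens = {greedy_pick(probs) for _ in range(100)}       # one token
sampled_tokens = {sample_pick(probs, rng) for _ in range(100)}  # several
```

Greedy decoding collapses to a single choice per context, while sampling spreads probability mass over alternatives — the smoothing effect these citations attribute to sampled back-translation.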