2017
DOI: 10.1007/978-3-319-64206-2_27
Neural Machine Translation for Morphologically Rich Languages with Improved Sub-word Units and Synthetic Data

Cited by 31 publications (34 citation statements)
References 14 publications
“…Similarly to the method by Pinnis et al (2017b) that allows training NMT models that are more robust to unknown and rarely occurring words, we supplemented the parallel training data with synthetic parallel training sentences. To create the synthetic corpus, we performed word alignment on the parallel corpus using fast-align (Dyer et al, 2013).…”
Section: Synthetic Data
confidence: 99%
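The approach quoted above pairs word alignment with rare-word replacement: aligned word pairs that occur rarely are substituted with identical placeholders on both sides, producing synthetic sentence pairs from which the model can learn to pass unknown tokens through. A minimal sketch of that idea follows; the function name, the `<unk-N>` placeholder format, and the frequency threshold are illustrative assumptions, not the authors' implementation (which consumes fast-align output):

```python
from collections import Counter

def make_synthetic_pairs(parallel_corpus, alignments, min_freq=2):
    """Create synthetic sentence pairs in which rarely occurring, aligned
    words are replaced by identical placeholders on both sides.

    parallel_corpus: list of (src_tokens, tgt_tokens) token lists
    alignments: per-sentence lists of (src_idx, tgt_idx) alignment links,
                e.g. as produced by a word aligner such as fast-align
    """
    src_freq = Counter(tok for src, _ in parallel_corpus for tok in src)

    synthetic = []
    for (src, tgt), links in zip(parallel_corpus, alignments):
        new_src, new_tgt = list(src), list(tgt)
        placeholder_id = 0
        replaced = False
        for s_i, t_i in links:
            if src_freq[src[s_i]] < min_freq:  # rare word -> placeholder
                ph = "<unk-{}>".format(placeholder_id)
                new_src[s_i] = ph
                new_tgt[t_i] = ph
                placeholder_id += 1
                replaced = True
        if replaced:  # keep only pairs that actually gained placeholders
            synthetic.append((new_src, new_tgt))
    return synthetic

# Toy corpus: "Floxat" occurs once, all other source words twice.
corpus = [
    (["das", "ist", "gut"], ["that", "is", "good"]),
    (["das", "Floxat", "ist", "gut"], ["the", "floxat", "is", "good"]),
]
links = [
    [(0, 0), (1, 1), (2, 2)],
    [(0, 0), (1, 1), (2, 2), (3, 3)],
]
synthetic = make_synthetic_pairs(corpus, links, min_freq=2)
```

Only the second pair survives, with the rare word replaced by the same placeholder on both sides, which is what lets the NMT system learn a copy-through behaviour for such tokens.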
“…Then, the pre-processed sentence is translated with the NMT system. Our NMT models have been trained to leave the unknown word place-holders untranslated, i.e., to pass them through to the target side (Pinnis et al, 2017b). The capability of the NMT system to pass the place-holders through unchanged is vital for the further steps to work.…”
Section: NMT Only Transl (For Comparison)
confidence: 99%
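The pass-through behaviour described above implies a pre-process/translate/post-process pipeline: unknown words are masked with placeholders before translation, the NMT system leaves the placeholders untranslated, and a post-processing step restores the original words. A minimal sketch under those assumptions (the function names and `<unk-N>` token format are illustrative, and the NMT call is stubbed out):

```python
def preprocess(tokens, vocab):
    """Replace out-of-vocabulary tokens with indexed placeholders and
    return the masked sentence plus a mapping for restoration."""
    mapping = {}
    out = []
    for tok in tokens:
        if tok in vocab:
            out.append(tok)
        else:
            ph = "<unk-{}>".format(len(mapping))
            mapping[ph] = tok
            out.append(ph)
    return out, mapping

def postprocess(translated_tokens, mapping):
    """Restore original words in place of placeholders that the NMT
    system passed through to the target side unchanged."""
    return [mapping.get(tok, tok) for tok in translated_tokens]

vocab = {"the", "is", "good"}
masked, mapping = preprocess(["the", "floxat", "is", "good"], vocab)
# Stand-in for the actual NMT system; a trained model would reorder and
# translate the in-vocabulary words while leaving placeholders intact.
translated = masked
restored = postprocess(translated, mapping)
```

As the excerpt notes, the whole scheme hinges on the model reliably copying placeholders to the target side; if a placeholder is dropped or altered, the restoration step cannot recover the original word.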
“…A problem with back-translation is that model predictions are inevitably erroneous. Translation errors can propagate to subsequent steps and impair the performance of back-translation, especially when the synthetic corpus is much larger than the authentic parallel corpus (Pinnis et al., 2017; Fadaee and Monz, 2018; Poncelas et al., 2018). Therefore, it is crucial to develop principled solutions that enable back-translation to better cope with this error propagation problem.…”
Section: Introduction
confidence: 99%
“…Nevertheless, more is not always better: Pinnis et al. (2017) report that using a moderate amount of back-translated data yields an improvement, whereas doubling that amount lowers the results, although they still remain better than using no back-translated data at all.…”
Section: Filtered Synthetic Training Data
confidence: 92%