Proceedings of the Third Conference on Machine Translation: Shared Task Papers 2018
DOI: 10.18653/v1/w18-6426
The RWTH Aachen University Supervised Machine Translation Systems for WMT 2018

Abstract: This paper describes the statistical machine translation systems developed at RWTH Aachen University for the German→English, English→Turkish and Chinese→English translation tasks of the EMNLP 2018 Third Conference on Machine Translation (WMT 2018). We use ensembles of neural machine translation systems based on the Transformer architecture. Our main focus is on the German→English task where we scored first with respect to all automatic metrics provided by the organizers. We identify data selection, fine-tuning…


Cited by 12 publications (18 citation statements)
References 12 publications

Citation statements:
“…We replaced non-overlapping tokens in the German source side with the closest Basque token in the cross-lingual word embedding space. The result is, however, worse than not replacing them; we noticed that this subword-by-subword translation produces many Basque phrases with wrong BPE merges (Kim et al., 2018).…”
Section: Synthetic Data Generation (citation type: mentioning, confidence: 90%)
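The replacement step quoted above can be made concrete. Below is a minimal sketch, not the cited authors' code: it assumes precomputed cross-lingual embeddings (a token→vector dict src_emb on the source side, a vocabulary matrix tgt_matrix on the target side; all names are illustrative) and maps each source token outside the shared vocabulary to its nearest target-side neighbor by cosine similarity. Because this operates on BPE subwords independently, the output can contain invalid merge sequences, which is exactly the failure mode the quote reports.

```python
import numpy as np

def nearest_target_token(vec, tgt_tokens, tgt_matrix):
    """Return the target token whose embedding is most cosine-similar to vec."""
    sims = tgt_matrix @ vec
    sims /= np.linalg.norm(tgt_matrix, axis=1) * np.linalg.norm(vec) + 1e-9
    return tgt_tokens[int(np.argmax(sims))]

def subword_by_subword(src_tokens, overlap, src_emb, tgt_tokens, tgt_matrix):
    """Keep tokens shared by both vocabularies (overlap, a set); map the
    rest to their nearest neighbor in the cross-lingual embedding space."""
    return [
        tok if tok in overlap
        else nearest_target_token(src_emb[tok], tgt_tokens, tgt_matrix)
        for tok in src_tokens
    ]
```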
“…We achieve this by modifying the source side of the parent training data, artificially changing its word order with random noise (Figure 3). The noise function includes (Hill et al., 2016; Kim et al., 2018):…”
Section: Artificial Noises (citation type: mentioning, confidence: 99%)
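The quote is cut off before the list of noise operations. A common instantiation in this line of work (following Hill et al., 2016, and the denoising literature) combines token dropout, filler-token insertion, and bounded local reordering; the sketch below uses illustrative probabilities, window size, and names rather than the cited papers' exact settings.

```python
import random

def add_source_noise(tokens, p_drop=0.1, p_ins=0.1, k=3, filler="<blank>"):
    """Apply token dropout, filler insertion, and local reordering."""
    # 1) token dropout: remove each token with probability p_drop
    out = [t for t in tokens if random.random() >= p_drop]
    # 2) insertion: prepend a filler token with probability p_ins
    noisy = []
    for t in out:
        if random.random() < p_ins:
            noisy.append(filler)
        noisy.append(t)
    # 3) local reordering: jitter positions so each token moves at most ~k slots
    keys = [i + random.uniform(0, k) for i in range(len(noisy))]
    return [t for _, t in sorted(zip(keys, noisy), key=lambda p: p[0])]
```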
“…Fine-tuning (Luong and Manning, 2015) is a domain adaptation technique that first trains a model until it converges on a training corpus A, and then continues training on a usually much smaller corpus B which is close to the target domain. Similarly to Schamper et al. (2018) and Koehn et al. (2018a), we fine-tune our models on former WMT test sets (2008–2016) to adapt them to the target domain of high-quality news translations. Due to the very small size of corpus B, much care has to be taken to avoid over-fitting.…”
Footnote 4: “We use Sergey Edunov's addnoise.py script available at https://gist.github.com/edunov/d67d09a38e75409b8408ed86489645dd”
Section: Fine-tuning with EWC and Checkpoint Averaging (citation type: mentioning, confidence: 99%)
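The section title names two countermeasures against over-fitting on the small corpus B: elastic weight consolidation (EWC), a quadratic penalty that keeps fine-tuned parameters close to values important for the original task, and checkpoint averaging. Below is a minimal sketch of checkpoint averaging only, using plain PyTorch state dicts; the file names are illustrative and all state-dict entries are assumed to be floating-point tensors.

```python
import torch

def average_checkpoints(paths):
    """Element-wise average of the parameters stored in several checkpoints.
    Assumes every state-dict entry is a floating-point tensor."""
    avg = None
    for path in paths:
        state = torch.load(path, map_location="cpu")
        if avg is None:
            avg = {name: t.clone().float() for name, t in state.items()}
        else:
            for name, t in state.items():
                avg[name] += t.float()
    return {name: t / len(paths) for name, t in avg.items()}

# Illustrative usage: load the average of the last three checkpoints.
# model.load_state_dict(average_checkpoints(["ck18.pt", "ck19.pt", "ck20.pt"]))
```

Averaging the last few checkpoints smooths out parameter noise from the final training steps, which is why it pairs naturally with fine-tuning on very small in-domain data.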