Proceedings of the Second Conference on Machine Translation 2017
DOI: 10.18653/v1/w17-4735

The RWTH Aachen University English-German and German-English Machine Translation System for WMT 2017

Abstract: This paper describes the statistical machine translation system developed at RWTH Aachen University for the English→German and German→English translation tasks of the EMNLP 2017 Second Conference on Machine Translation (WMT 2017). We use ensembles of attention-based neural machine translation systems for both directions, trained on the provided parallel and synthetic data. In addition, we create a phrasal system using joint translation and reordering models in decoding and neural models in …
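The abstract mentions ensembles of attention-based NMT systems. A common way to ensemble such models at decoding time is to average their per-step output distributions; the sketch below illustrates that idea only. The `decode_step` interface and the arithmetic averaging are assumptions for illustration, not the interface of the RWTH system (which was built on Blocks/Theano).

```python
import numpy as np

def ensemble_step(models, states, prev_token):
    """Average the next-token distributions of several NMT models.

    `models` exposing a decode_step(prev_token, state) -> (probs, new_state)
    method is a hypothetical interface; averaging log-probabilities
    (a geometric mean) is an equally common alternative.
    """
    probs, new_states = [], []
    for model, state in zip(models, states):
        p, s = model.decode_step(prev_token, state)
        probs.append(p)
        new_states.append(s)
    return np.mean(probs, axis=0), new_states
```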

Cited by 4 publications (4 citation statements)
References 24 publications (21 reference statements)
“…We use a variant of the attention weight / fertility feedback of (Tu et al., 2016), which is inverted in our case so that a multiplication is used instead of a division, for better numerical stability. Our model was derived from the models presented by (Peter et al., 2017) and (Bahdanau et al., 2014).…”
Section: Performance Comparison
confidence: 99%
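The inverted fertility trick described in this citation can be sketched as follows: instead of dividing accumulated attention by a predicted fertility (as in Tu et al., 2016), a value in (0, 1) is predicted directly and multiplied in, avoiding division by a small fertility. Names and shapes below are illustrative assumptions, not taken from the cited system.

```python
import numpy as np

def coverage_with_inverse_fertility(cum_attention, fertility_logits):
    """Coverage feedback with an inverse fertility term.

    cum_attention:    accumulated attention weights per source position.
    fertility_logits: unnormalised fertility predictions per source position.
    """
    inv_fertility = 1.0 / (1.0 + np.exp(-fertility_logits))  # sigmoid -> (0, 1)
    # Multiplicative normalisation instead of cum_attention / fertility.
    return inv_fertility * cum_attention
```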
“…We modified the RWTH Aachen translation system described in (Peter et al., 2017), based on the Blocks framework (van Merriënboer et al., 2015) and Theano (Theano Development Team, 2016), to also work as a recurrent language model. The training data is chosen to be equivalent to that used to train the count-based models.…”
Section: Neural Network Language Model
confidence: 99%
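Reusing a translation system as a recurrent language model amounts to running its decoder without an encoder, predicting each token from the previous ones. The minimal LSTM language model below sketches that idea; the class, sizes, and PyTorch framework are assumptions for illustration, since the cited system is built on Blocks/Theano.

```python
import torch.nn as nn

class RecurrentLM(nn.Module):
    """Minimal LSTM language model (illustrative sizes)."""

    def __init__(self, vocab_size, emb_dim=512, hidden_dim=1024):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, time) indices; logits predict the next token at each step.
        hidden, _ = self.rnn(self.embed(tokens))
        return self.proj(hidden)

# Training would use standard cross-entropy over shifted targets, e.g.:
# loss = nn.CrossEntropyLoss()(logits[:, :-1].reshape(-1, V), tokens[:, 1:].reshape(-1))
```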
“…The Transformer model was trained on the standard parallel WMT 2018 data sets (namely Europarl, CommonCrawl, NewsCommentary and Rapid, 5.9M sentence pairs in total) as well as the 4.2M sentence pairs of synthetic data created in (Sennrich et al., 2016a). Last year's submission was an ensemble of several carefully crafted models using an RNN encoder and decoder, trained on the same data plus 6.9M additional synthetic sentences (Peter et al., 2017). We try 20k and 50k merge operations for BPE and find that 50k performs better by 0.5% to 1.0% BLEU.…”
Section: German→English
confidence: 99%
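The 20k vs. 50k "merge operations" compared in this citation refer to how many byte-pair-encoding merges are learned on the training corpus; more merges yield longer subword units and a larger vocabulary. The sketch below is a didactic re-implementation of BPE merge learning in the style of Sennrich et al. (2016), not the subword tooling the cited systems actually used.

```python
import re
from collections import Counter

def learn_bpe(word_freqs, num_merges):
    """Learn BPE merges from a {word: corpus_count} dictionary.

    For the setups compared above, num_merges would be 20000 or 50000.
    """
    # Represent each word as space-separated symbols plus an end-of-word marker.
    vocab = {" ".join(word) + " </w>": freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace every occurrence of the most frequent symbol pair with its merge.
        pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(best)) + r"(?!\S)")
        vocab = {pattern.sub("".join(best), w): f for w, f in vocab.items()}
    return merges
```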