Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d15-1165

A Comparison between Count and Neural Network Models Based on Joint Translation and Reordering Sequences

Abstract: We propose a conversion of bilingual sentence pairs and the corresponding word alignments into novel linear sequences. These are uniquely defined joint translation and reordering (JTR) sequences, combining interdependent lexical and alignment dependencies at the word level in a single framework. They are constructed in a simple manner while capturing multiple alignments and empty words. JTR sequences can be used to train a variety of models. We investigate the performance of n-gram models with modified Kneser-Ney smoothing, as well as feed-forward and recurrent neural networks, trained on JTR sequences.
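The paper's central data structure is a linear sequence that interleaves lexical translation tokens with reordering tokens. As a rough illustration only, the following Python sketch shows one plausible way such a conversion could look; the token names (<JUMP_FWD_n>, <JUMP_BWD_n>, the "eps" empty word) and the traversal order are assumptions made here for illustration, not the paper's exact construction.

```python
# Hypothetical sketch of converting an aligned sentence pair into a single
# linear "joint translation and reordering" (JTR-style) sequence.
# Token names and the exact traversal order are assumptions; the paper
# defines its own unique construction.

def jtr_sequence(src, tgt, alignment):
    """src, tgt: lists of words; alignment: set of (src_idx, tgt_idx) pairs."""
    # Group source positions aligned to each target position.
    links = {j: sorted(i for i, jj in alignment if jj == j) for j in range(len(tgt))}
    seq, prev = [], -1
    for j, e in enumerate(tgt):
        srcs = links.get(j, [])
        if not srcs:                      # unaligned target word: empty source
            seq.append(("eps", e))
            continue
        for i in srcs:                    # multiple alignments -> several tokens
            jump = i - (prev + 1)
            if jump > 0:
                seq.append(("<JUMP_FWD_%d>" % jump,))
            elif jump < 0:
                seq.append(("<JUMP_BWD_%d>" % -jump,))
            seq.append((src[i], e))       # joint lexical token f|e
            prev = i
    return seq

# Toy example: German-English pair with one reordering.
src = ["ich", "habe", "das", "gesehen"]
tgt = ["i", "saw", "that"]
alignment = {(0, 0), (3, 1), (2, 2)}
print(jtr_sequence(src, tgt, alignment))
```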

Cited by 12 publications (13 citation statements). References 21 publications.
“…The seminal work on modeling context with source-language information can be traced to bilingual language models [20], [41]. The context of a word is often defined as the words appearing in a fixed-size window, including target- and source-side context, such as bilingual word pairs [11], minimum translation units [37], and bilingual word-based joint translation and reordering models [42]. However, nearly all of these works used traditional n-gram methods to model context and were therefore subject to the challenge of data sparsity caused by a mass of bilingual word or phrase pairs [42]. This caused difficulties in the estimation of higher-order n-gram models and lacked the NN's ability to generalize semantically [1].…”
Section: Discussion
Mentioning confidence: 99%
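The bilingual language models referenced in this statement rest on a simple device: each target word is fused with its aligned source word(s) into a single token, so that an ordinary n-gram model over these units also conditions on source-side context. A minimal sketch, assuming a hypothetical word|translation joining convention:

```python
# Hedged sketch of the "bilingual token" idea behind bilingual language
# models: each target word is fused with its aligned source word(s), so an
# ordinary n-gram model over these units sees source-side context too.
# The joining convention (e_j|f_i) is an assumption for illustration.

def bilingual_tokens(src, tgt, alignment):
    out = []
    for j, e in enumerate(tgt):
        srcs = [src[i] for i, jj in sorted(alignment) if jj == j]
        out.append(e + "|" + ("_".join(srcs) if srcs else "NULL"))
    return out

src = ["das", "haus", "ist", "klein"]
tgt = ["the", "house", "is", "small"]
alignment = {(0, 0), (1, 1), (2, 2), (3, 3)}
print(bilingual_tokens(src, tgt, alignment))
# ['the|das', 'house|haus', 'is|ist', 'small|klein']
```

The sparsity concern raised in the statement is visible here: the unit vocabulary grows with the number of distinct word pairs rather than with either language's vocabulary alone.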
“…The system combines the flexibility of word-level models with the search accuracy of phrase candidates. It incorporates the JTR model (Guta et al., 2015), a language model (LM), a word-class language model (wcLM) (Wuebker et al., 2013), phrasal translation probabilities, phrase-level conditional JTR probabilities, and additional lexical models for smoothing. The phrases are annotated with word alignments to allow for the application of word-level models.…”
Section: Phrasal Joint Translation and Reordering System
Mentioning confidence: 99%
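This statement lists several models combined in one system. Phrase-based decoders conventionally integrate such models as a weighted log-linear combination of their log-probability scores; the sketch below illustrates that general scheme, not the authors' actual decoder, and all feature names and weights are invented:

```python
# Illustrative sketch of log-linear model combination, the standard way a
# phrase-based system integrates a JTR model, LM, wcLM and phrasal
# probabilities. Feature names, scores and weights are hypothetical.

def loglinear_score(features, weights):
    """features: model name -> log-probability; weights: model name -> weight."""
    return sum(weights[name] * score for name, score in features.items())

features = {          # made-up log10 scores for one translation hypothesis
    "jtr":     -4.2,
    "lm":      -6.1,
    "wclm":    -3.0,
    "phrasal": -5.4,
}
weights = {"jtr": 1.0, "lm": 0.8, "wclm": 0.3, "phrasal": 0.7}
print(loglinear_score(features, weights))
```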
“…A 7-gram JTR joint model (Guta et al., 2015), which is responsible for estimating the translation and reordering probabilities, is trained on these sequences. It is estimated with interpolated modified Kneser-Ney smoothing (Chen and Goodman, 1998) using the KenLM toolkit (Heafield et al., 2013).…”
Section: JTR Model
Mentioning confidence: 99%
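Both steps in this statement map directly onto the KenLM toolkit: its lmplz tool estimates an n-gram model with interpolated modified Kneser-Ney smoothing, and the kenlm Python module scores new sequences. A minimal sketch, assuming the linearized JTR sequences have been written one per line to a hypothetical file jtr.txt:

```python
# Estimate a 7-gram model with KenLM's lmplz (interpolated modified
# Kneser-Ney smoothing), run once from the shell:
#
#   lmplz -o 7 < jtr.txt > jtr.arpa
#
# Then score linearized JTR sequences with the kenlm Python module.
import kenlm

model = kenlm.Model("jtr.arpa")
# score() returns a log10 probability; each JTR token counts as one "word".
print(model.score("ich|i <JUMP_FWD_2> gesehen|saw <JUMP_BWD_2> das|that"))
```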
“…The ETM overcomes this drawback by operating on single words. Guta et al. (2015) propose the conversion of bilingual sentence pairs and word alignments into joint translation and reordering (JTR) sequences. They investigate n-gram models with modified Kneser-Ney smoothing, feed-forward and recurrent neural networks trained on JTR sequences.…”
Section: Previous Work
Mentioning confidence: 99%