Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d15-1165

A Comparison between Count and Neural Network Models Based on Joint Translation and Reordering Sequences

Abstract: We propose a conversion of bilingual sentence pairs and the corresponding word alignments into novel linear sequences. These are uniquely defined joint translation and reordering (JTR) sequences, combining interdependent lexical and alignment dependencies at the word level in a single framework. They are constructed in a simple manner while capturing multiple alignments and empty words. JTR sequences can be used to train a variety of models. We investigate the performance of n-gram models with modified Kneser-Ney smoothing, as well as feed-forward and recurrent neural networks, trained on JTR sequences.
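The paper's central data structure is a linear sequence that interleaves lexical translation tokens with reordering tokens. As a rough illustration only, the following Python sketch shows one plausible way such a conversion could look; the token names (<JUMP_FWD_n>, <JUMP_BWD_n>, the "eps" empty word) and the traversal order are assumptions made here for illustration, not the paper's exact construction.

```python
# Hypothetical sketch of converting an aligned sentence pair into a single
# linear "joint translation and reordering" (JTR-style) sequence.
# Token names and the exact traversal order are assumptions; the paper
# defines its own unique construction.

def jtr_sequence(src, tgt, alignment):
    """src, tgt: lists of words; alignment: set of (src_idx, tgt_idx) pairs."""
    # Group source positions aligned to each target position.
    links = {j: sorted(i for i, jj in alignment if jj == j) for j in range(len(tgt))}
    seq, prev = [], -1
    for j, e in enumerate(tgt):
        srcs = links.get(j, [])
        if not srcs:                      # unaligned target word: empty source
            seq.append(("eps", e))
            continue
        for i in srcs:                    # multiple alignments -> several tokens
            jump = i - (prev + 1)
            if jump > 0:
                seq.append(("<JUMP_FWD_%d>" % jump,))
            elif jump < 0:
                seq.append(("<JUMP_BWD_%d>" % -jump,))
            seq.append((src[i], e))       # joint lexical token f|e
            prev = i
    return seq

# Toy example: German-English pair with one reordering.
src = ["ich", "habe", "das", "gesehen"]
tgt = ["i", "saw", "that"]
alignment = {(0, 0), (3, 1), (2, 2)}
print(jtr_sequence(src, tgt, alignment))
```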

Cited by 12 publications (13 citation statements). References 21 publications.
“…The seminal work on modeling context with source-language information can be traced to bilingual language models [20], [41]. The context of a word is often defined as the words appearing in a fixed-size window, including target- and source-side context, such as bilingual word pairs [11], minimum translation units [37], and bilingual word-based joint translation and reordering models [42]. However, nearly all of these works used traditional n-gram methods to model context and were therefore subject to the challenge of data sparsity caused by a mass of bilingual word or phrase pairs [42]. This caused difficulties in the estimation of higher-order n-gram models and lacked the NN's ability to generalize semantically [1].…”
Section: Discussion
Mentioning confidence: 99%
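The bilingual language models referenced in this statement rest on a simple device: each target word is fused with its aligned source word(s) into a single token, so that an ordinary n-gram model over these units also conditions on source-side context. A minimal sketch, assuming a hypothetical word|translation joining convention:

```python
# Hedged sketch of the "bilingual token" idea behind bilingual language
# models: each target word is fused with its aligned source word(s), so an
# ordinary n-gram model over these units sees source-side context too.
# The joining convention (e_j|f_i) is an assumption for illustration.

def bilingual_tokens(src, tgt, alignment):
    out = []
    for j, e in enumerate(tgt):
        srcs = [src[i] for i, jj in sorted(alignment) if jj == j]
        out.append(e + "|" + ("_".join(srcs) if srcs else "NULL"))
    return out

src = ["das", "haus", "ist", "klein"]
tgt = ["the", "house", "is", "small"]
alignment = {(0, 0), (1, 1), (2, 2), (3, 3)}
print(bilingual_tokens(src, tgt, alignment))
# ['the|das', 'house|haus', 'is|ist', 'small|klein']
```

The sparsity concern raised in the statement is visible here: the unit vocabulary grows with the number of distinct word pairs rather than with either language's vocabulary alone.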
“…The system combines the flexibility of word-level models with the search accuracy of phrase candidates. It incorporates the JTR model (Guta et al., 2015), a language model (LM), a word-class language model (wcLM) (Wuebker et al., 2013), phrasal translation probabilities, phrase-level conditional JTR probabilities, and additional lexical models for smoothing. The phrases are annotated with word alignments to allow for the application of word-level models.…”
Section: Phrasal Joint Translation and Reordering System
Mentioning confidence: 99%
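This statement lists several models combined in one system. Phrase-based decoders conventionally integrate such models as a weighted log-linear combination of their log-probability scores; the sketch below illustrates that general scheme, not the authors' actual decoder, and all feature names and weights are invented:

```python
# Illustrative sketch of log-linear model combination, the standard way a
# phrase-based system integrates a JTR model, LM, wcLM and phrasal
# probabilities. Feature names, scores and weights are hypothetical.

def loglinear_score(features, weights):
    """features: model name -> log-probability; weights: model name -> weight."""
    return sum(weights[name] * score for name, score in features.items())

features = {          # made-up log10 scores for one translation hypothesis
    "jtr":     -4.2,
    "lm":      -6.1,
    "wclm":    -3.0,
    "phrasal": -5.4,
}
weights = {"jtr": 1.0, "lm": 0.8, "wclm": 0.3, "phrasal": 0.7}
print(loglinear_score(features, weights))
```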
“…A 7-gram JTR joint model (Guta et al., 2015), which is responsible for estimating the translation and reordering probabilities, is trained on these sequences. It is estimated with interpolated modified Kneser-Ney smoothing (Chen and Goodman, 1998) using the KenLM toolkit (Heafield et al., 2013).…”
Section: JTR Model
Mentioning confidence: 99%
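Both steps in this statement map directly onto the KenLM toolkit: its lmplz tool estimates an n-gram model with interpolated modified Kneser-Ney smoothing, and the kenlm Python module scores new sequences. A minimal sketch, assuming the linearized JTR sequences have been written one per line to a hypothetical file jtr.txt:

```python
# Estimate a 7-gram model with KenLM's lmplz (interpolated modified
# Kneser-Ney smoothing), run once from the shell:
#
#   lmplz -o 7 < jtr.txt > jtr.arpa
#
# Then score linearized JTR sequences with the kenlm Python module.
import kenlm

model = kenlm.Model("jtr.arpa")
# score() returns a log10 probability; each JTR token counts as one "word".
print(model.score("ich|i <JUMP_FWD_2> gesehen|saw <JUMP_BWD_2> das|that"))
```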
“…The ETM overcomes this drawback by operating on single words. Guta et al. (2015) propose the conversion of bilingual sentence pairs and word alignments into joint translation and reordering (JTR) sequences. They investigate n-gram models with modified Kneser-Ney smoothing, feed-forward and recurrent neural networks trained on JTR sequences.…”
Section: Previous Work
Mentioning confidence: 99%