2007
DOI: 10.1007/s10590-006-9008-4

Dependency treelet translation: the convergence of statistical and example-based machine-translation?

Abstract: We describe a novel approach to MT that combines the strengths of the two leading corpus-based approaches: Phrasal SMT and EBMT. We use a syntactically informed decoder and reordering model based on the source dependency tree, in combination with conventional SMT models to incorporate the power of phrasal SMT with the linguistic generality available in a parser. We show that this approach significantly outperforms a leading string-based Phrasal SMT decoder and an EBMT system. We present results from two radica…

Citations: cited by 20 publications (21 citation statements)
References: 21 publications (32 reference statements)
“…So-called phrasal statistic machine translation systems, which model translations using no more than sequences of contiguous words, perform surprisingly well and require nothing but tokenization in both languages. In language pairs for which we have a source language parser, a parse of the input sentence is used to guide reordering and help select relevant non-contiguous units; this is the treelet system (Quirk and Menezes, 2006). Regardless of which system we use, however, target language models score the fluency of the output, and have a huge positive impact on translation quality.…”
Section: The Next Generation MSR MT Systems
Citation type: mentioning
confidence: 99%
“…In particular, syntax-based SMT is built implicitly around this assumption (Wu, 1997; Yamada and Knight, 2001). In Quirk and Menezes (2006), DCA is explicitly implemented by defining a translation model in terms of treelet pairs where target-side treelets are produced by projecting source dependencies via word alignments.…”
Section: Direct Correspondence Assumption and Syntactic Cohesion in SMT
Citation type: mentioning
confidence: 99%
“…We also consider a strategy whereby the word remains unconnected to any word in the sentence; see Figure 5(g). 6 R3a and R3b differ from the rules proposed in Quirk and Menezes (2006) dealing with the same situation, since we had to adapt it to the left-to-right parsing scenario. …”
Section: Dependency Graph Projection
Citation type: mentioning
confidence: 99%
“…Lexical collocation translations were augmented by abstracted templates containing variables (i.e., transduction rules) thereby simultaneously moving in two dimensions toward both compositional and schema-based approaches, as in Kitano and Higuchi (1991), Furuse and Iida (1992), Kaji et al (1992), or Matsumoto et al (1993), and subsequently furthered in work such as Cicekli and Güvenir (1996, 2003), Veale and Way (1997), Brown (1999, 2003), McTait and Trujillo (1999), Carl (2003), Yamamoto and Matsumoto (2003), Groves et al (2004), Way and Gough (2005), Cicekli (2005), and Groves and Way (2005b). At the same time, gradually increasing use of probabilities in similarity metrics and in scoring adaptation and composition of hypotheses has also moved EBMT [figure labels: pure EBMT (Nagao 1984, Sumita & Iida 1991, …, Denoual & Lepage 2005); statistical EBMT (Quirk & Menezes 2006, …); template-driven EBMT (Kaji et al 1992, Matsumoto et al 1993, …, Brown 1999, Groves & Way 2005b, …)]…”
Section: MT Model Space
Citation type: mentioning
confidence: 99%
“…Modern EBMT systems incorporate both; for example, Aramaki et al (2005), Langlais and Gotti (2006), Liu et al (2006), and Quirk and Menezes (2006) aim for probabilistic formulations of EBMT in terms of statistical inference.…”
Section: Fig. 4 Historical Trajectory of Development of EBMT Models
Citation type: mentioning
confidence: 99%