2016
DOI: 10.1162/coli_a_00241

Optimization for Statistical Machine Translation: A Survey

Abstract: In statistical machine translation (SMT), the optimization of the system parameters to maximize translation accuracy is now a fundamental part of virtually all modern systems. In this article, we survey 12 years of research on optimization for SMT, from the seminal work on discriminative models (Och and Ney 2002) and minimum error rate training (Och 2003), to the most recent advances. Starting with a brief introduction to the fundamentals of SMT systems, we follow by covering a wide variety of optimization alg…

Cited by 16 publications (13 citation statements)
References 82 publications
“…Estimating the parameters of an MT system from the rewards cannot be done with the usual MT optimization methods: as the reference is not known, it is impossible to score an n-best list as required by methods optimizing a classification criterion, such as MERT or MIRA (Neubig and Watanabe, 2016). Moreover, as only one translation hypothesis is scored, methods optimizing a ranking criterion, such as PRO, cannot be used either.…”
Section: Optimizing an MT System from Weak Feedback
confidence: 99%
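The statement above hinges on classification-criterion tuners needing to score an entire n-best list against a known reference. A minimal sketch of that scoring step, using toy data and a toy unigram-precision metric standing in for sentence-level BLEU (all sentences and scores here are hypothetical):

```python
# Minimal sketch (toy data): scoring an n-best list against a reference,
# as classification-criterion tuners like MERT or MIRA require.  With
# bandit-style feedback only one hypothesis receives a reward, so this
# whole-list scoring step is unavailable.

def unigram_precision(hypothesis, reference):
    """Toy sentence-level metric standing in for BLEU."""
    hyp_words = hypothesis.split()
    ref_words = set(reference.split())
    if not hyp_words:
        return 0.0
    return sum(w in ref_words for w in hyp_words) / len(hyp_words)

reference = "the cat sat on the mat"
n_best = [
    "the cat sat on a mat",
    "a cat is on the mat",
    "dog sat on the mat sleeping",
]

# Every hypothesis in the list is scored against the same reference.
scored = [(unigram_precision(h, reference), h) for h in n_best]
best_score, best_hyp = max(scored)
```

Without a reference, `unigram_precision` cannot be evaluated for any hypothesis, which is exactly the obstacle the quoted passage describes.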
“…Therefore, our work can be placed at the intersection of two research disciplines. The first is the optimization of decoder parameters for machine translation [5,6], where, for the majority of decoders, the objective function combines a set of translation features log-linearly to evaluate the translation hypotheses.…”
Section: Related Work
confidence: 99%
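The log-linear combination mentioned above can be sketched as follows; the feature names, feature values, and weights here are hypothetical placeholders for a decoder's language model, translation model, and word-penalty scores:

```python
# Minimal sketch (hypothetical features and weights): the log-linear model
# most SMT decoders use to rank translation hypotheses.  The model score is
# the weighted sum of the (log-space) feature values; tuning adjusts only
# the weights.

def loglinear_score(features, weights):
    """score(e | f) = sum_k w_k * h_k(e, f), with each h_k already in log space."""
    return sum(weights[k] * features[k] for k in weights)

weights = {"lm": 0.5, "tm": 0.3, "word_penalty": -0.2}        # hypothetical
hypothesis_features = {"lm": -12.4, "tm": -7.1, "word_penalty": 6.0}

score = loglinear_score(hypothesis_features, weights)
```

The decoder ranks competing hypotheses by this score, which is why the choice of weights directly determines translation quality.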
“…In the machine translation community, the algorithms proposed to optimise these weights are largely based on grid search [6], where the goal is to find the set of weights that minimises a loss function adapted to the translation process [5].…”
Section: Related Work
confidence: 99%
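A grid search of the kind described above can be sketched on toy data: each source sentence carries an n-best list of (feature vector, is-correct) pairs, and the loss counts how often the 1-best hypothesis under a candidate weight vector is wrong. All feature values and the grid are hypothetical:

```python
import itertools

# Minimal sketch (toy data): grid search over log-linear weights, picking
# the weight vector whose 1-best hypotheses minimise a loss (here, the
# number of sentences whose top-scoring hypothesis is not the correct one).

# One n-best list per source sentence: (feature_vector, is_correct) pairs.
nbests = [
    [((-1.0, -2.0), True),  ((-0.5, -4.0), False)],
    [((-3.0, -1.0), False), ((-2.0, -1.5), True)],
]

def loss(weights):
    """Count sentences whose 1-best hypothesis under `weights` is wrong."""
    errors = 0
    for nbest in nbests:
        _, correct = max(
            nbest,
            key=lambda h: sum(w * f for w, f in zip(weights, h[0])),
        )
        errors += not correct
    return errors

grid = [0.2, 0.5, 1.0]
best_weights = min(itertools.product(grid, grid), key=loss)
```

Exhaustive grids scale poorly with the number of features, which is one reason methods such as MERT replace the grid with an exact line search along chosen directions.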
“…Conventional MT systems, be they phrase-based, n-gram-based, syntax-based, or hierarchical, are typically trained in two steps: the first step (training) estimates the individual feature functions; the second (tuning) learns to combine these features so as to optimize translation quality, for instance using Minimum Error Rate Training (MERT) (Och, 2003). The limitations of MERT, notably its inability to train feature sets containing more than a dozen features, have long been reported, and more effective discriminative training procedures have been sought (see Neubig and Watanabe (2016) for a recent review).…”
Section: Related Work
confidence: 99%