“…Following Li et al (2016), we define our model in the well-known log-linear framework (Och and Ney, 2002). In our experiments, we use the following standard features: two translation probabilities p(g, c|t) and p(t|g, c), two lexical translation probabilities p lex (g, c|t) and p lex (t|g, c), a language model p(t), a rule penalty, a word penalty, and a distortion function as defined in Galley and Manning (2010).…”