Recent work has shown success in using neural network language models (NNLMs) as features in MT systems.Here, we present a novel formulation for a neural network joint model (NNJM), which augments the NNLM with a source context window. Our model is purely lexicalized and can be integrated into any MT decoder. We also present several variations of the NNJM which provide significant additive improvements.Although the model is quite simple, it yields strong empirical results. On the NIST OpenMT12 Arabic-English condition, the NNJM features produce a gain of +3.0 BLEU on top of a powerful, featurerich baseline which already includes a target-only NNLM. The NNJM features also produce a gain of +6.3 BLEU on top of a simpler baseline equivalent to Chiang's (2007) original Hiero implementation.Additionally, we describe two novel techniques for overcoming the historically high cost of using NNLM-style models in MT decoding. These techniques speed up NNJM computation by a factor of 10,000x, making the model as fast as a standard back-off LM.
In this paper, we report on a set of initial results for English-to-Arabic Statistical Machine Translation (SMT). We show that morphological decomposition of the Arabic source is beneficial, especially for smaller-size corpora, and investigate different recombination techniques. We also report on the use of Factored Translation Models for Englishto-Arabic translation.
Syntactic Reordering of the source language to better match the phrase structure of the target language has been shown to improve the performance of phrase-based Statistical Machine Translation. This paper applies syntactic reordering to English-to-Arabic translation. It introduces reordering rules, and motivates them linguistically. It also studies the effect of combining reordering with Arabic morphological segmentation, a preprocessing technique that has been shown to improve Arabic-English and EnglishArabic translation. We report on results in the news text domain, the UN text domain and in the spoken travel domain.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.