Link to this article: http://journals.cambridge.org/abstract_S1351324905003888

How to cite this article: ANDY WAY and NANO GOUGH (2005). Comparing example-based and statistical machine translation. Natural Language Engineering, 11, pp. 295-309

Abstract
In previous work (Gough and Way 2004), we showed that our Example-Based Machine Translation (EBMT) system improved with respect to both coverage and quality when seeded with increasing amounts of training data, so that it significantly outperformed the on-line MT system Logomedia according to a wide variety of automatic evaluation metrics. While it is perhaps unsurprising that system performance is correlated with the amount of training data, we address in this paper the question of whether a large-scale, robust EBMT system such as ours can outperform a Statistical Machine Translation (SMT) system. We obtained a large English-French translation memory from Sun Microsystems, from which we randomly extracted a test set of nearly 4K sentence pairs. The remaining data was split into three training sets of roughly 50K, 100K and 200K sentence pairs in order to measure the effect of increasing the size of the training data on the performance of the two systems. Our main observation is that, contrary to perceived wisdom in the field, there appears to be little substance to the claim that SMT systems are guaranteed to outperform EBMT systems when confronted with 'enough' training data. Our tests on a 4.8 million word bitext indicate that while SMT appears to outperform our system for French-English on a number of metrics, for English-French the performance of our EBMT system is superior to the baseline SMT model on all but one automatic evaluation metric.
We have developed an example-based machine translation (EBMT) system that uses the World Wide Web for two different purposes. First, we populate the system's memory with translations gathered from rule-based MT systems located on the Web. The source strings input to these systems were extracted automatically from an extremely small subset of the rule types in the Penn-II Treebank. In subsequent stages, the source-target translation pairs obtained are automatically transformed into a series of resources that render the translation process more successful. Despite the fact that the output from on-line MT systems is often faulty, we demonstrate in a number of experiments that, when used to seed the memories of an EBMT system, such translations can in fact prove useful in generating translations of high quality in a robust fashion. In addition, we demonstrate the relative gain of EBMT in comparison to on-line systems. Second, despite the perception that the documents available on the Web are of questionable quality, we demonstrate in contrast that such resources are extremely useful in automatically postediting translation candidates proposed by our system.
This paper presents an extended, harmonised account of our previous work on integrating controlled language data in an Example-Based Machine Translation system. Gough and Way (2003, MT Summit, pp. 133-140) focused on controlling the output text in a novel manner, while Gough and Way (2004a, 9th Workshop of the EAMT, pp. 73-81) sought to constrain the input strings according to controlled language specifications. Our original sub-sentential alignment algorithm could deal only with 1:1 matches, but subsequent refinements enabled n:m alignments to be captured. A direct consequence was that we were able to populate the system's databases with more than six times as many potentially useful fragments. Together with two simple novel improvements (correcting a small number of mistranslations in the lexicon, and allowing multiple translations in the lexicon), translation quality improves considerably. We provide detailed automatic and human evaluations of a number of experiments carried out to test the quality of the system. We observe that our system outperforms the rule-based on-line system Logomedia on a range of automatic evaluation metrics, and that the 'best' translation candidate is consistently highly ranked by our system. Finally, we note in a number of tests that the BLEU metric gives objectively different results than other automatic evaluation metrics and a manual evaluation. Despite these conflicting results, we observe a preference for controlling the source data rather than the target translations.
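The abstract above notes that moving from 1:1 to n:m sub-sentential alignments multiplied the number of usable fragments more than sixfold. The authors' own marker-based alignment algorithm is not reproduced here; as a minimal illustrative sketch of what n:m alignment buys, the following shows the standard consistency-based phrase-pair extraction from word alignments (a technique from phrase-based SMT, assumed here purely for illustration; the function name and toy data are invented):

```python
# Sketch: extract n:m sub-sentential alignments (phrase pairs) from a
# word-aligned sentence pair. This is the standard phrase-extraction idea,
# NOT the authors' marker-based EBMT method; the example data is invented.

def extract_phrase_pairs(src, tgt, alignment, max_len=4):
    """Return the set of consistent (source phrase, target phrase) pairs.

    alignment: set of (i, j) links meaning src[i] aligns to tgt[j].
    A span pair is 'consistent' if no alignment link crosses its boundary.
    """
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to the source span [i1, i2].
            js = [j for (i, j) in alignment if i1 <= i <= i2]
            if not js:
                continue
            j1, j2 = min(js), max(js)
            # Consistency check: every link touching the target span
            # must originate inside the source span.
            if all(i1 <= i <= i2 for (i, j) in alignment if j1 <= j <= j2):
                pairs.add((" ".join(src[i1:i2 + 1]),
                           " ".join(tgt[j1:j2 + 1])))
    return pairs

src = "the red car".split()
tgt = "la voiture rouge".split()
links = {(0, 0), (1, 2), (2, 1)}  # the-la, red-rouge, car-voiture
for s, t in sorted(extract_phrase_pairs(src, tgt, links)):
    print(f"{s!r} -> {t!r}")
```

On this toy pair the 1:1 links alone yield three lexical entries, while the consistency check additionally recovers the reordered n:m fragments 'red car' / 'voiture rouge' and the full sentence pair, illustrating how n:m alignment enlarges the fragment database.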