Link to this article: http://journals.cambridge.org/abstract_S1351324905003888

How to cite this article: ANDY WAY and NANO GOUGH (2005). Comparing example-based and statistical machine translation. Natural Language Engineering, 11, pp. 295-309

Abstract
In previous work (Gough and Way 2004), we showed that our Example-Based Machine Translation (EBMT) system improved with respect to both coverage and quality when seeded with increasing amounts of training data, so that it significantly outperformed the on-line MT system Logomedia according to a wide variety of automatic evaluation metrics. While it is perhaps unsurprising that system performance is correlated with the amount of training data, we address in this paper the question of whether a large-scale, robust EBMT system such as ours can outperform a Statistical Machine Translation (SMT) system. We obtained a large English-French translation memory from Sun Microsystems, from which we randomly extracted a test set of nearly 4K sentence pairs. The remaining data was split into three training sets of roughly 50K, 100K and 200K sentence pairs in order to measure the effect of increasing the size of the training data on the performance of the two systems. Our main observation is that, contrary to perceived wisdom in the field, there appears to be little substance to the claim that SMT systems are guaranteed to outperform EBMT systems when confronted with 'enough' training data. Our tests on a 4.8 million word bitext indicate that while SMT appears to outperform our system for French-English on a number of metrics, for English-French the performance of our EBMT system is superior to the baseline SMT model on all but one automatic evaluation metric.
We have developed an example-based machine translation (EBMT) system that uses the World Wide Web for two different purposes. First, we populate the system's memory with translations gathered from rule-based MT systems located on the Web. The source strings input to these systems were extracted automatically from an extremely small subset of the rule types in the Penn-II Treebank. In subsequent stages, the source-target translation pairs obtained are automatically transformed into a series of resources that render the translation process more successful. Despite the fact that the output from on-line MT systems is often faulty, we demonstrate in a number of experiments that, when used to seed the memories of an EBMT system, such translations can in fact prove useful in generating translations of high quality in a robust fashion. In addition, we demonstrate the relative gain of EBMT in comparison to on-line systems. Second, despite the perception that the documents available on the Web are of questionable quality, we demonstrate in contrast that such resources are extremely useful in automatically postediting translation candidates proposed by our system.
This paper presents an extended, harmonised account of our previous work on integrating controlled language data in an Example-Based Machine Translation system. Gough and Way (2003, MT Summit, pp. 133-140) focused on controlling the output text in a novel manner, while Gough and Way (2004a, 9th Workshop of the EAMT, pp. 73-81) sought to constrain the input strings according to controlled language specifications. Our original sub-sentential alignment algorithm could deal only with 1:1 matches, but subsequent refinements enabled n:m alignments to be captured. A direct consequence was that we were able to populate the system's databases with more than six times as many potentially useful fragments. Together with two simple novel improvements (correcting a small number of mistranslations in the lexicon, and allowing multiple translations in the lexicon), translation quality improves considerably. We provide detailed automatic and human evaluations of a number of experiments carried out to test the quality of the system. We observe that our system outperforms the rule-based on-line system Logomedia on a range of automatic evaluation metrics, and that the 'best' translation candidate is consistently highly ranked by our system. Finally, we note in a number of tests that the BLEU metric gives objectively different results than other automatic evaluation metrics and a manual evaluation. Despite these conflicting results, we observe a preference for controlling the source data rather than the target translations.
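The abstract above notes that moving from 1:1 to n:m sub-sentential alignments multiplied the number of usable fragments more than sixfold. The authors' own marker-based alignment algorithm is not reproduced here; as a minimal illustrative sketch of what n:m alignment buys, the following shows the standard consistency-based phrase-pair extraction from word alignments (a technique from phrase-based SMT, assumed here purely for illustration; the function name and toy data are invented):

```python
# Sketch: extract n:m sub-sentential alignments (phrase pairs) from a
# word-aligned sentence pair. This is the standard phrase-extraction idea,
# NOT the authors' marker-based EBMT method; the example data is invented.

def extract_phrase_pairs(src, tgt, alignment, max_len=4):
    """Return the set of consistent (source phrase, target phrase) pairs.

    alignment: set of (i, j) links meaning src[i] aligns to tgt[j].
    A span pair is 'consistent' if no alignment link crosses its boundary.
    """
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to the source span [i1, i2].
            js = [j for (i, j) in alignment if i1 <= i <= i2]
            if not js:
                continue
            j1, j2 = min(js), max(js)
            # Consistency check: every link touching the target span
            # must originate inside the source span.
            if all(i1 <= i <= i2 for (i, j) in alignment if j1 <= j <= j2):
                pairs.add((" ".join(src[i1:i2 + 1]),
                           " ".join(tgt[j1:j2 + 1])))
    return pairs

src = "the red car".split()
tgt = "la voiture rouge".split()
links = {(0, 0), (1, 2), (2, 1)}  # the-la, red-rouge, car-voiture
for s, t in sorted(extract_phrase_pairs(src, tgt, links)):
    print(f"{s!r} -> {t!r}")
```

On this toy pair the 1:1 links alone yield three lexical entries, while the consistency check additionally recovers the reordered n:m fragments 'red car' / 'voiture rouge' and the full sentence pair, illustrating how n:m alignment enlarges the fragment database.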