Abstract: Purpose of this paper – The purpose of this study is to evaluate the performance of freely available machine translation (MT) services in translating metadata records.
Design/methodology/approach – Randomly selected metadata records were translated from English into Chinese using the Google, Bing, and SYSTRAN machine translation (MT) systems. These translations were then evaluated using a five-point scale for both Fluency and Adequacy. Missing Count (words not translated) and Incorrect Count (words incorrectly translated) wer…
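The evaluation scheme described above (per-record Fluency and Adequacy on a 1–5 scale, plus Missing and Incorrect word counts) can be sketched as a small aggregation routine. This is an illustrative assumption about how such ratings might be tallied, not the study's actual instrument; the field names are hypothetical:

```python
from statistics import mean

def summarize(ratings):
    """Aggregate human MT evaluation ratings.

    Each rating is a dict with hypothetical keys:
    fluency/adequacy (1-5 scale), missing/incorrect (word counts).
    """
    return {
        "avg_fluency": mean(r["fluency"] for r in ratings),
        "avg_adequacy": mean(r["adequacy"] for r in ratings),
        "total_missing": sum(r["missing"] for r in ratings),
        "total_incorrect": sum(r["incorrect"] for r in ratings),
    }

# Two sample per-record ratings for one MT system
sample = [
    {"fluency": 4, "adequacy": 5, "missing": 0, "incorrect": 1},
    {"fluency": 3, "adequacy": 4, "missing": 2, "incorrect": 0},
]
print(summarize(sample))
```

Averaging the two quality scales separately, while summing the error counts, mirrors how the study reports Fluency and Adequacy as distinct dimensions rather than a single composite score.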
“…This study was one of the very few studies that conducted human evaluation of MT on metadata records. It confirmed the conclusion from the previous study (Chen et al, ) that online MT systems could produce non‐native yet sufficiently good translations that might help information users overcome language barriers in many ways. However, it significantly extended the MT evaluation in our previous study (Chen et al, ) to more languages and a larger sample size.…”
Section: Discussion (supporting)
confidence: 89%
“…It confirmed the conclusion from the previous study (Chen et al, ) that online MT systems could produce non‐native yet sufficiently good translations that might help information users overcome language barriers in many ways. However, it significantly extended the MT evaluation in our previous study (Chen et al, ) to more languages and a larger sample size. In particular, we conducted evaluations of different elements and found that some online MT systems, such as Google Translate and Bing Translator, produced translations with high levels of fluency and adequacy for certain metadata elements such as subject and creator.…”
Section: Discussion (supporting)
confidence: 89%
“…We doubled the Microsoft Bing translation results in the TM, which was justified by a previous study finding that Microsoft Bing slightly outperformed Google Translate and performed much better than Yahoo! in adequacy and fluency (Chen et al, ), as well as the fact that Microsoft Bing's translations ranked highest of the three in terms of BLEU scores for Chinese (Papineni, Roukos, Ward, & Zhu, ; Chen et al, ). The LM consisted of only the MT systems' output in Chinese or Spanish.…”
Section: Moses for MEMT (mentioning)
confidence: 99%
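The snippet above ranks systems partly by BLEU score (Papineni, Roukos, Ward, & Zhu). BLEU scores a candidate translation by clipped n-gram precision against a reference, multiplied by a brevity penalty. A minimal single-reference sketch, simplified to bigrams for illustration (full BLEU uses up to 4-grams, multiple references, and smoothing):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    """Simplified single-reference BLEU over token lists."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        precisions.append(overlap / max(sum(cand.values()), 1))
    if min(precisions) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions
    geo = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty discourages translations shorter than the reference
    bp = 1.0 if len(candidate) >= len(reference) else math.exp(
        1 - len(reference) / len(candidate))
    return bp * geo
```

Count clipping prevents a candidate from scoring well by repeating a common reference word; the brevity penalty prevents gaming the precision-only metric with very short outputs.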
“…While many online MT systems are available for use, they are not always sufficient for producing quality MT metadata records to facilitate MLIA in digital collections. Chen, Ding, Jiang, and Knudson () experimented with three online MT systems that translated metadata records and found that MT performance of these records could stand improvement.…”
One way to facilitate Multilingual Information Access (MLIA) for digital libraries is to generate multilingual metadata records by applying Machine Translation (MT) techniques. Current online MT services are available and affordable, but are not always effective for creating multilingual metadata records. In this study, we implemented three different MT strategies and evaluated their performance when translating English metadata records into Chinese and Spanish. These strategies included combining MT results from three online MT systems (Google, Bing, and Yahoo!) with and without additional linguistic resources, such as manually generated parallel corpora and metadata records in the two target languages obtained from international partners. The open-source statistical MT platform Moses was applied to design and implement the three translation strategies. Human evaluation of the MT results using adequacy and fluency demonstrated that two of the strategies produced higher-quality translations than individual online MT systems for both languages. In particular, adding small, manually generated parallel corpora of metadata records significantly improved translation performance. Our study suggests an effective and efficient MT approach for providing multilingual services for digital collections.
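The Moses-based strategies above combine full phrase tables and language models; as a much simpler illustration of the underlying idea of combining several systems' outputs, here is a hypothetical consensus-selection sketch that keeps the hypothesis most similar, on average, to the others. This is not the paper's actual method, only a minimal stand-in for system combination:

```python
def consensus_pick(hypotheses):
    """Return the translation hypothesis with the highest average
    token-level Jaccard similarity to the other systems' outputs."""
    def jaccard(a, b):
        sa, sb = set(a.split()), set(b.split())
        return len(sa & sb) / max(len(sa | sb), 1)

    scores = [
        sum(jaccard(h, o) for j, o in enumerate(hypotheses) if j != i)
        for i, h in enumerate(hypotheses)
    ]
    return hypotheses[scores.index(max(scores))]

# Hypothetical outputs from three MT systems for the same source
outputs = ["the cat sat", "the cat sits", "completely different output"]
print(consensus_pick(outputs))
```

The intuition matches the paper's premise: where independent systems agree, the shared wording is more likely correct, so the outlier translation is discarded.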
“…The evaluation of different MT services or open source systems for translation of scholarly articles would be useful for international scholars. There is some related research on evaluating MT systems for translation of metadata in digital libraries that would be worth discussing in this context (Chen et al, ; Reyes Ayala et al, ). This gap is partially justified by the complexity of those topics and the limited research on the multilingual aspects of scholarly communication.…”
Machine translation (MT) has become ubiquitous in business and consumer applications in recent years. The quality of automatically translated text has substantially improved, although it is still not as accurate and reliable as translation done by professional translators. Casual users are willing to accept some imperfections just to get the gist of information or to communicate while traveling in a foreign country. Using MT for the translation of scholarly articles, where accuracy in reporting scientific findings and quality of writing are critical, is more challenging. The authors of Machine Translation and Global Research make a convincing argument that MT can play an important role in scholarly communication, not only by expanding access to research articles written in other languages but also by leveling the playing field for scholars for whom English is not a first language. While acknowledging the limitations of MT in generating publication-quality translations, the authors explore the ways this technology can be effective in the discovery and assimilation of scientific information for global research. Furthermore, they offer practical guidelines for translation-friendly academic writing and propose a framework for machine translation literacy instruction.

Scholarly communication is a global phenomenon, with English as the dominant language of scientific publications and conference presentations. An analysis of language coverage for journals indexed in two major citation databases, Web of Science and Scopus, demonstrates that English is overwhelmingly represented in both databases and across disciplines (Mongeon & Paul-Hus, 2016). The dominance of English in scholarly discourse is advantageous for native speakers of English. However, scholars from non-English-speaking countries face many obstacles in disseminating their findings and having their manuscripts accepted for publication in leading international journals.
Research indicates that some language bias is present in the peer-review process (Lee et al., 2013).
We demonstrate HeMT, a multilingual Web system for human evaluation of machine-translated metadata records. It allows human evaluators to examine and assess machine translation results for sample metadata records in Chinese, English, and Spanish. This paper describes the design principles, users, and functions of the system. It also presents the research design of small-scale usability testing that will examine not only the appearance of the Website but also the accuracy of its content and its cultural appropriateness.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.