This paper describes Global Tone Communication Co., Ltd.'s submission to the WMT18 shared news translation task. We participated in the English-to-Chinese direction and achieved the best BLEU score (43.8) among all participants. The submitted system focuses on data cleaning and on training techniques for building a competitive model for this task. Unlike other participants' systems, ours relies mainly on data filtering to obtain the best BLEU score. We filter not only the provided sentences but also the back-translated sentences. The data filtering techniques we apply include filtering by rules, by language models, and by translation models. We also conduct several experiments to validate the effectiveness of the training techniques. According to our experiments, the Annealing Adam optimizer and ensemble decoding are the most effective techniques for model training.
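As an illustration of what rule-based filtering of parallel data can look like, the following is a minimal sketch and not the submission's actual rule set. The function name `rule_filter`, the specific rules (empty lines, copied source, duplicates, length caps, length ratio), and all thresholds are assumptions made for this example; the abstract does not list the rules that were used.

```python
from typing import Iterable, Iterator, Tuple


def rule_filter(
    pairs: Iterable[Tuple[str, str]],
    max_src_tokens: int = 100,  # assumed cap on English whitespace tokens
    max_tgt_chars: int = 200,   # assumed cap on Chinese characters
    max_ratio: float = 3.0,     # assumed bound on the length ratio
) -> Iterator[Tuple[str, str]]:
    """Yield (English, Chinese) pairs that pass a few simple heuristics."""
    seen = set()
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        # Rule 1: drop pairs with an empty side.
        if not src or not tgt:
            continue
        # Rule 2: drop pairs where the target is a copy of the source.
        if src == tgt:
            continue
        # Rule 3: drop exact duplicate pairs.
        if (src, tgt) in seen:
            continue
        # English length in whitespace tokens, Chinese length in
        # non-space characters (both are at least 1 at this point).
        n_src = len(src.split())
        n_tgt = sum(1 for ch in tgt if not ch.isspace())
        # Rule 4: drop overly long sentences.
        if n_src > max_src_tokens or n_tgt > max_tgt_chars:
            continue
        # Rule 5: drop pairs whose lengths are badly mismatched.
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue
        seen.add((src, tgt))
        yield src, tgt
```

Calling `list(rule_filter(pairs))` on an iterable of sentence pairs returns the pairs that survive. The same function could be applied to back-translated pairs, with thresholds tuned on held-out data.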
This software was developed with Visual C++ 6.0 and Qt and is designed around the Unicode character set. It contributes to cooperation and exchange with ASEAN and solves the problems of system compatibility and garbled character output found in current national-language software development. This development model is simple to use, runs stably, offers a flexible interface, and makes unified processing (backup, printing) of the vocabulary and voice databases convenient for users; it also provides technical guidance for the development of text translation software for other national languages. The Thai Wen Chinese Translation Electronic Dictionary is an important innovation in the field of Dai information technology and provides convenience for the "Belt and Road Initiative" policy. It is basic support for starting research on the representation and extraction of cultural information elements in minority languages. Its main functions are Thai query, translation, reading, and so on. The Thai Wen Chinese-English Translation Electronic Dictionary is designed to provide common functions such as Thai-Chinese bilingual translation, Thai reading, and English display. It also supports custom add, modify, and delete operations on the thesaurus, and it implements good human-computer interaction.
This paper describes Global Tone Communication Co., Ltd.'s submission to the WMT19 shared news translation task. We participate in six directions: English to Gujarati, Lithuanian, and Finnish, and Gujarati, Lithuanian, and Finnish to English. We achieve the best BLEU scores among all participants in the English-to-Gujarati and Lithuanian-to-English directions (28.2 and 36.3, respectively). The submitted systems mainly rely on back-translation, knowledge distillation, and reranking to build a competitive model for this task. We also apply language models to filter monolingual data, back-translated data, and parallel data. The data filtering techniques we apply include filtering by rules and by language models. In addition, we conduct several experiments to validate different knowledge distillation techniques and right-to-left (R2L) reranking.
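The idea behind R2L reranking can be sketched as follows: each hypothesis in an n-best list from a left-to-right (L2R) model is rescored by a right-to-left model, and the two scores are combined before picking the best candidate. This is only a sketch under stated assumptions. The function `rerank`, the linear interpolation, and the default `weight` are hypothetical, and `r2l_score` stands in for a real R2L model that is assumed to handle target reversal internally; the abstract does not specify how the scores were combined.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    nbest: Sequence[Tuple[str, float]],
    r2l_score: Callable[[str], float],
    weight: float = 0.5,  # assumed interpolation weight for the R2L score
) -> List[Tuple[str, float]]:
    """Return hypotheses sorted by a combined L2R/R2L score, best first.

    nbest holds (hypothesis, L2R log-probability) pairs; r2l_score is a
    placeholder for a model returning an R2L log-probability.
    """
    rescored = [
        (hyp, (1.0 - weight) * l2r + weight * r2l_score(hyp))
        for hyp, l2r in nbest
    ]
    # Higher (less negative) log-probability is better.
    return sorted(rescored, key=lambda item: item[1], reverse=True)
```

In practice the weight would be tuned on a development set, and length normalization of both scores is a common addition.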