Although English to Indian language translation is in demand, translators lack adequate tools because of the complexities involved. Research on machine translation systems is ongoing, but these systems have not yet matured into usable products. The major reasons are the word-level, phrase-level, and semantic-level translation complexities that arise from differences in culture and language use, and the lack of good-quality translation memories (TMs). Translation memories are widely used by the translation industry because they are easy to use and make it possible to build glossaries, terminologies, and phrasal dictionaries for a domain. They also help maintain a consistent writing style across a series of documents, and they offer a better human-machine interface for human translators. Machine Translation (MT) systems, on the other hand, produce better translations for simpler sentences when tuned properly. Complex and compound sentences are difficult to translate by machine alone and require human post-editing. This paper proposes integrating the two methods to combine their respective advantages, giving translators a better support system that lets them work faster without compromising quality and helps them keep writing styles intact. The concept was validated in a small experiment using a widely used TM tool (Wordfast) and the English to Indian Language rule-based machine translation system provided by IIT Kanpur.
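One way to picture such an integration is a lookup that tries the translation memory first and falls back to machine translation only when no sufficiently close match exists. The sketch below is an illustration under stated assumptions, not the system described in the abstract: the names `translate_segment`, `similarity`, and `mt_translate`, the 0.85 fuzzy-match threshold, and the use of Python's `difflib` for matching are all hypothetical choices, and neither Wordfast nor the IIT Kanpur system is actually called.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1], ignoring case (assumed metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def translate_segment(segment, tm, mt_translate, threshold=0.85):
    """Return (translation, origin, best_tm_score) for one source segment.

    tm           -- dict mapping source segments to stored translations
    mt_translate -- callable standing in for an MT engine (hypothetical)
    threshold    -- minimum fuzzy-match score to reuse a TM entry (assumed)
    """
    best_source, best_score = None, 0.0
    for source in tm:
        score = similarity(segment, source)
        if score > best_score:
            best_source, best_score = source, score

    if best_source is not None and best_score >= threshold:
        # Reuse the human-approved translation from the memory.
        return tm[best_source], "TM", best_score

    # No close match: use machine output, which still needs post-editing.
    return mt_translate(segment), "MT", best_score
```

Returning the origin label alongside the translation lets a translator see which segments came from the memory and which are raw machine output that needs post-editing.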
Algorithms for morphological analyzers have evolved mainly around individual words. Because writing styles are changing as languages influence one another, more advanced morphological analyzers are needed for NLP systems such as Machine Translation, Knowledge Extraction, and Information Retrieval. Word-level morphological analyzers typically adhere to the grammar of the language and to a knowledge set covering gender, number, and person (GNP) and the dictionary. Some algorithms also use phrasal dictionaries. However, the influence of languages on each other changes GNP as well as the grammatical and phrasal usage of words, and general morphological algorithms cannot handle such usage of words or phrases. A new generation of morphological analyzers is therefore needed to handle cross-lingual influence. This paper proposes a methodology for an English morphological analyzer that interprets phrases and groups of words to derive knowledge in Hindi for the tourism domain. The methodology, although general, is oriented towards Machine Translation. It is based on creating a knowledge base for morphological analyzers using formulations of finite-state transducers (FST) and recursive transition networks (RTN). Using this methodology, ten categories of phrasal structures in sentences have been identified which, when used in the morphological analyzer (MA) of a rule-based machine translation (RBMT) system, would improve the functional efficiency of MT in producing correct translations.
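To make phrase-level analysis with a finite-state device concrete, the following sketch compiles a small phrase list into a trie-shaped transducer and scans a sentence with longest-match semantics, so that a group of words is emitted as one unit with a label. It is a simplified illustration only: the function names `build_fst` and `analyze`, the sample phrases, and the labels are hypothetical, and they are not the ten categories or the knowledge base referred to in the abstract. Recursive transition networks are not modelled here.

```python
def build_fst(phrases):
    """Compile {phrase: output_label} into a trie-shaped transducer.

    Each state is a dict with outgoing transitions ("next") keyed by
    lowercase token, and an "output" label if the state is final.
    """
    start = {"next": {}, "output": None}
    for phrase, output in phrases.items():
        state = start
        for token in phrase.lower().split():
            state = state["next"].setdefault(
                token, {"next": {}, "output": None}
            )
        state["output"] = output
    return start


def analyze(tokens, fst):
    """Scan tokens left to right, emitting the longest matching phrase.

    Returns a list of (text, label) pairs; label is None for tokens
    that are not part of any known phrase.
    """
    result, i = [], 0
    while i < len(tokens):
        state, j = fst, i
        match_end, match_out = None, None
        # Follow transitions as far as possible, remembering the last
        # final state reached (longest match).
        while j < len(tokens) and tokens[j].lower() in state["next"]:
            state = state["next"][tokens[j].lower()]
            j += 1
            if state["output"] is not None:
                match_end, match_out = j, state["output"]
        if match_end is None:
            result.append((tokens[i], None))
            i += 1
        else:
            result.append((" ".join(tokens[i:match_end]), match_out))
            i = match_end
    return result
```

With hypothetical entries such as `{"in front of": "COMPOUND_PREPOSITION", "hill station": "NOUN_PHRASE"}`, the sentence "The hotel is in front of the hill station" is segmented so that "in front of" and "hill station" each reach later stages as a single unit instead of as separate words.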