Finite-state transducers (FSTs) can implement inter-dialectal transliteration very efficiently. We illustrate this on the Hindi-Urdu language pair. FSTs can also be used for translation between surface-close languages. We introduce UIT (universal intermediate transcription) for the same pair, based on their common phonetic repository, in such a way that it can be extended to other languages such as Arabic, Chinese, English, and French. We describe a transliteration model based on FSTs and UIT, and evaluate it on Hindi and Urdu corpora.
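The idea of transliterating through a shared intermediate transcription can be sketched in a few lines. The following is only a toy illustration, not the model described above: the ASCII symbols standing in for UIT, the seven-consonant alphabet, and the helper names `make_fst`, `run_fst`, and `hindi_to_urdu` are all assumptions made for this example. Each transducer has a single state and maps one input symbol to one output symbol. The Hindi-to-Urdu direction is obtained by running Hindi-to-UIT and then the inverse of Urdu-to-UIT.

```python
# Toy character tables. The UIT codes are hypothetical ASCII stand-ins,
# not the actual UIT encoding.
HINDI_TO_UIT = {"क": "k", "ब": "b", "म": "m", "न": "n", "र": "r", "ल": "l", "स": "s"}
URDU_TO_UIT = {"ک": "k", "ب": "b", "م": "m", "ن": "n", "ر": "r", "ل": "l", "س": "s"}


def make_fst(mapping):
    """Build a one-state transducer: (state, input) -> (output, next state)."""
    return {(0, src): (dst, 0) for src, dst in mapping.items()}


def invert(mapping):
    """Swap input and output symbols (valid here because the table is one-to-one)."""
    return {dst: src for src, dst in mapping.items()}


def run_fst(fst, text, start=0):
    """Feed text through the transducer one symbol at a time, collecting output."""
    state, out = start, []
    for ch in text:
        if (state, ch) not in fst:
            raise ValueError(f"no transition for {ch!r} in state {state}")
        emitted, state = fst[(state, ch)]
        out.append(emitted)
    return "".join(out)


hindi_to_uit = make_fst(HINDI_TO_UIT)
uit_to_urdu = make_fst(invert(URDU_TO_UIT))


def hindi_to_urdu(text):
    """Transliterate by going Hindi -> UIT -> Urdu."""
    return run_fst(uit_to_urdu, run_fst(hindi_to_uit, text))
```

Calling `hindi_to_urdu("कम")` passes through the intermediate string `"km"` before producing Urdu script. A realistic system would need more states and context-dependent rules, for instance to handle vowel signs and conjuncts in Devanagari and the short vowels that Urdu script usually leaves unwritten.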
After 3 years of specifying the UNL (Universal Networking Language) language and prototyping deconverters from more than 12 languages and enconverters for about 4, the UNL project has opened to the community by publishing the specifications (v2.0) of the UNL language, which is intended to encode the meaning of NL utterances as semantic hypergraphs and to be used as a "pivot" representation in multilingual information and communication systems. A UNL document is an HTML document with special tags that delimit the utterances and their renderings in UNL and in all natural languages currently handled. UNL can be viewed as the future "HTML of linguistic content". It is only an interface format, which allows both the reuse of existing NLP components and the development of original tools in a variety of possible applications, from automatic rough enconversion for information retrieval and information-gathering translation to partially interactive enconversion or deconversion for higher quality. We illustrate these points by describing a UNL-French deconverter organized as a specific "localizer" followed by a classical MT transfer and an existing generator.