This paper presents a new approach for resolving lexical ambiguities in one language using statistical data on lexical relations in another language. This approach exploits the differences between mappings of words to senses in different languages. We concentrate on the problem of target word selection in machine translation, for which the approach is directly applicable, and employ a statistical model for the selection mechanism. The model was evaluated using two sets of Hebrew and German examples and was found to be very useful for disambiguation.
This paper describes the treatment of nominal compounds in a transfer based machine translation system; it presents a new approach for resolving ambiguities in compound segmentation and constituent structure selection, using a combination of linguistic rules and statistical data. An introduction to the general as well as to the German-English-specific problems of [...] corpus based techniques.
This paper deals with multiword lexemes (MWLs), focussing on two types of verbal MWLs: verbal idioms and support verb constructions. We discuss the characteristic properties of MWLs, namely nonstandard compositionality, restricted substitutability of components, and restricted morpho-syntactic flexibility, and we show how these properties may cause serious problems during the analysis, generation, and transfer steps of machine translation systems. In order to cope with these problems, MT lexicons need to provide detailed descriptions of MWL properties. We list the types of information which we consider the necessary minimum for a successful processing of MWLs, and report on some feasibility studies aimed at the automatic extraction of German verbal multiword lexemes from text corpora and machine-readable dictionaries.
W-6900 Heidelberg, schwall at dhdibm1.bitnet
This paper describes the lexical database tool LOLA (Linguistic-Oriented lexical database Approach), which has been developed for the construction and maintenance of lexicons for the machine translation system LMT. First, the requirements such a tool should meet are discussed; then LMT, the lexical information it requires, and some issues concerning vocabulary acquisition are presented. Afterwards the architecture and the components of the LOLA system are described, and it is shown how we tried to meet the requirements worked out earlier. Although LOLA was originally designed and implemented for the German-English LMT prototype, it aimed from the beginning at a representation of lexical data that can be reused for other LMT or MT prototypes or even other NLP applications. A special point of discussion will therefore be the adaptability of the tool and its components, as well as the reusability of the lexical data stored in the database for lexicon development for LMT or for other applications.