The ability to identify the intended meanings of words in context is a central research topic in natural language processing. Many solutions exist for word sense disambiguation (WSD) in languages such as English or French, but research on Arabic WSD remains limited. The main bottleneck is the lack of resources. In this article, we show that it is possible to build a WSD system for Arabic by using the Arabic WordNet and its connections to the English Princeton WordNet. Because the Arabic WordNet does not contain definitions for its synsets, we construct a dictionary that maps the Princeton WordNet definitions to the Arabic WordNet. We also create an Arabic evaluation corpus and gold standard. We then use this dictionary and evaluation corpus to run and evaluate an adapted Ant Colony algorithm on Arabic text; the definition mapping allows the algorithm to use the Lesk similarity measure. The algorithm achieves a performance of approximately 80%, compared to a random baseline of 78.9%.
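As a rough illustration of the Lesk idea mentioned above, the following sketch scores each candidate sense by the number of words its definition shares with the surrounding context. It is a simplified Lesk overlap only, not the adapted Ant Colony algorithm, and the function names, sense identifiers and toy English definitions are hypothetical stand-ins for the mapped definitions.

```python
from typing import Dict


def lesk_overlap(definition: str, context: str) -> int:
    """Count the distinct words shared by a sense definition and the context."""
    return len(set(definition.lower().split()) & set(context.lower().split()))


def disambiguate(senses: Dict[str, str], context: str) -> str:
    """Return the sense id whose definition overlaps most with the context."""
    return max(senses, key=lambda sense_id: lesk_overlap(senses[sense_id], context))


# Toy, invented definitions (assumption: not taken from any WordNet).
senses = {
    "bank_river": "sloping land beside a body of water",
    "bank_money": "a financial institution that accepts deposits of money",
}
context = "he put his money in a deposits account at the institution"
best = disambiguate(senses, context)
print(best)
```

In practice a stop-word filter would usually be applied before counting overlaps, since function words such as "a" otherwise contribute to every score.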
Opinion identification is a set of techniques within natural language processing, particularly in the area of information retrieval. It consists in developing systems able to extract and explore the opinions present in corpora. The large volume of Arabic newspaper text available in electronic format calls for a dedicated exploration technique. In this paper we present an opinion identification system based on the model of Aila Rosà [1], which represents an opinion as an object composed of four elements: predicate, source, topic and content. Two properties, polarity and intensity, inspired by the work of Plantié Mathieu [2], are added to this model to establish relationships between the opinions present in a text according to their degrees of intensity and polarity. We present the general architecture of the system, which uses several techniques: an XML representation of opinions, semantic expansion of opinions as explained by Nicolas B [3], and a statistical representation of opinions as an occurrence matrix, which facilitates computing the similarity between opinions in the classification phase.
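A minimal sketch of the opinion object and the occurrence-matrix representation might look as follows. The field encodings (polarity as +1/-1, intensity on a 0 to 1 scale), the whitespace tokenisation and the choice of cosine similarity are all assumptions made for illustration; the abstract does not specify them, and the sample opinions are invented.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt
from typing import List


@dataclass
class Opinion:
    predicate: str
    source: str
    topic: str
    content: str
    polarity: int     # assumed encoding: +1 positive, -1 negative
    intensity: float  # assumed scale: 0.0 (weak) to 1.0 (strong)


def occurrence_vector(opinion: Opinion, vocabulary: List[str]) -> List[int]:
    """One row of the occurrence matrix: term counts over the opinion content."""
    counts = Counter(opinion.content.lower().split())
    return [counts[term] for term in vocabulary]


def cosine(u: List[int], v: List[int]) -> float:
    """Cosine similarity between two rows (assumed measure, not from the source)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Invented sample opinions, for illustration only.
opinions = [
    Opinion("says", "minister", "budget", "the budget is good", +1, 0.5),
    Opinion("claims", "analyst", "budget", "the budget is bad", -1, 0.8),
]
vocabulary = sorted({w for o in opinions for w in o.content.lower().split()})
matrix = [occurrence_vector(o, vocabulary) for o in opinions]
similarity = cosine(matrix[0], matrix[1])
```

Note that the two sample opinions are lexically close but opposite in polarity, which is one reason to keep polarity and intensity as explicit properties rather than relying on term occurrences alone.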