After several decades of heavy research activity on English stemmers, Arabic morphological analysis techniques have become a popular area of research. Arabic is a Semitic language; it exhibits a very systematic but complex morphological structure based on root-pattern schemes. As a consequence, a survey of such techniques has become increasingly necessary. The aim of this paper is to summarize and organize the information available in the literature, in an attempt to motivate researchers to examine these techniques and develop more advanced ones. This paper introduces, classifies, and surveys Arabic morphological analysis techniques. Conclusions, open areas, and future directions are provided at the end.
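To make the notion of a root-pattern scheme concrete, the following minimal sketch interleaves a triliteral root with a few vowel patterns. It is only an illustration under stated assumptions: the function name `apply_pattern`, the Latin transliteration, and the convention of writing the root slots as the digits 1, 2, 3 are all hypothetical choices made here, not notation taken from the surveyed literature.

```python
def apply_pattern(root: str, pattern: str) -> str:
    """Fill the slots 1, 2, 3 of a pattern with the consonants of a root.

    `root` is written as hyphen-separated consonants (e.g. "k-t-b");
    `pattern` uses digits for root slots and letters for fixed material.
    Both conventions are assumptions made for this sketch.
    """
    consonants = root.split("-")
    return "".join(
        consonants[int(ch) - 1] if ch.isdigit() else ch
        for ch in pattern
    )


# The root k-t-b (related to writing) combined with three common patterns.
forms = {p: apply_pattern("k-t-b", p) for p in ("1a2a3a", "1aa2i3", "ma12uu3")}
print(forms)
```

A morphological analyzer has to perform the inverse operation: given a surface word, recover the root and the pattern, which is considerably harder because affixes and weak consonants obscure the slots.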
The Micro-AIRS System, a microcomputer system for Arabic Information Retrieval, was designed as an experimental system to investigate indexing and retrieval processes for Arabic bibliographic data. A series of experiments were performed using 29 queries against a base of 355 Arabic bibliographic records, covering computer and information science from the bibliographic databank at King Abdulaziz City for Science and Technology. These experiments revealed that using roots and using stems as index terms gives better retrieval results than using words. The root performs as well as or better than the stem at low recall levels and definitely better at high recall levels. Several different binary similarity coefficients were tried: the cosine, Dice, and Jaccard coefficients. All three led to exactly the same document rankings for every query. The experiments were run on an IBM/AT-compatible microcomputer. Micro-AIRS is written in Turbo C, Version 2.0.
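The three binary similarity coefficients named above can be sketched as follows, treating a query and a document as sets of index terms. This is a minimal illustration, not the Micro-AIRS implementation (which was written in Turbo C): the function names, the `rank` helper, and the transliterated toy index terms are all hypothetical. Dice and Jaccard are monotonically related (Dice = 2J / (1 + J)), so they always produce the same ranking; the binary cosine can in principle order documents differently when document lengths vary, so the identical rankings reported above are an empirical result for that collection.

```python
import math


def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B| for binary (set-valued) term vectors."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a: set, b: set) -> float:
    """2|A ∩ B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a: set, b: set) -> float:
    """|A ∩ B| / sqrt(|A| * |B|), the cosine of two binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def rank(query: set, docs: dict, coeff) -> list:
    """Return document names ordered by decreasing similarity to the query."""
    return sorted(docs, key=lambda name: coeff(query, docs[name]), reverse=True)


# Hypothetical index terms (transliterated roots), for illustration only.
query = {"ktb", "Elm"}
docs = {
    "d1": {"ktb", "Elm", "drs"},
    "d2": {"ktb", "Hsb"},
    "d3": {"drs", "Hsb", "nZm"},
}

for coeff in (cosine, dice, jaccard):
    print(coeff.__name__, rank(query, docs, coeff))
```

On this toy data all three coefficients order the documents the same way, mirroring the behaviour the experiments observed.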