In this paper, we address the problem of large-coverage dictionaries of the Arabic language that are usable both for direct human reading and for automatic Natural Language Processing. For these purposes, we propose a normalized and implemented model, based on the Lexical Markup Framework (LMF, ISO 24613) and the Data Category Registry (DCR, ISO 12620), which allows stable and well-defined interoperability of lexical resources through a unification of linguistic concepts. Starting from the features of the Arabic language, and because a large range of details and refinements need to be described specifically for Arabic, we follow a fine-grained structuring strategy. Besides its richness in morphological, syntactic and semantic knowledge, our model includes all the Arabic morphological patterns needed to generate the inflected forms from a given lemma, and it highlights the syntactic-semantic relations. In addition, an appropriate codification has been designed for managing all types of relationships among lexical entries and their related knowledge. According to this model, a dictionary named El Madar has been built and is now publicly available online. The data are managed by a user-friendly Web-based lexicographical workstation. This work has not been done in isolation, but is the result of a collaborative effort by an international team, mainly within the ISO network, over a period of eight years.
... is to merge them in order to obtain a new, richer resource. More generally, exchange remains a difficult (and expensive) issue when nothing has been planned for this purpose. To meet this challenge, several projects were conducted such as ACQUILEX (Bogurev et al. These projects led to the emergence of the LMF (Lexical Markup Framework) ISO standard for lexical structure modeling (ISO 24613) (Francopoulo 2003; Francopoulo and George 2008), in association with the ISO Data Categories Registry (DCR) following ISO 12620 (Ide and Romary 2004). These standards were designed by a group of sixty ISO experts coming from different cultures, languages and continents. Numerous developments followed in different parts of the world. Unfortunately, the Arabic language did not immediately benefit from the emergence of these standards, although it is spoken by more than 300 million people around the world and is the official language of more than twenty countries. The language still relies on different printed dictionaries based on incompatible lexicographical schools. Only a few works have tried to apply LMF to the Arabic language, and they followed earlier revisions of the standard. Some developments were made in morphology (Khemakhem, Gargouri and Abdelwahed 2006; Romary, Salmon-Alt and Francopoulo 2004; Salmon-Alt, Akrout and Romary 2005), and some studies were conducted in syntax (Loukil, Haddar and Ben Hamadou 2008). However, these works were developed during the drafting of the LMF standard and were not updated after its validation by ISO. Obviously, the situation of the Arabic lexica...