2014
DOI: 10.1177/0165551514526348

Extracting the roots of Arabic words without removing affixes

Abstract: Most research in Arabic roots extraction focuses on removing affixes from Arabic words. This process adds processing overhead and may remove non-affix letters, which leads to the extraction of incorrect roots. This paper advises a new approach to dealing with this issue by introducing a new algorithm for extracting Arabic words’ roots. The proposed algorithm, which is called the Word Substring Stemming Algorithm, does not remove affixes during the extraction process. Rather, it is based on producing the set of…

Cited by 16 publications (13 citation statements)
References 10 publications
“…The stemming process involves the extraction of a word root to enhance the classifier accuracy by merging many word forms into one root form [37]. The Arabic language has a composite morphology structure that makes root extraction more complicated and limits the stemming to removing prefixes and suffixes [38].…”
Section: B Preprocessing; citation type: mentioning
confidence: 99%
“…However, there are several algorithms can simplify extracting roots. These algorithms follow some rules for removing prefixes and suffixes to produce proper stemming, such as the AlKabi [39], Ghawanmeh [40], Hmeidi [41], Khoja [42] and WSS-Based algorithms [37]. The Light10 stemmer [43], which is claimed to be the best available stemmer, works by solely removing the initial letter (و), prefix (لل, فال, كال, بال, وال, ال), and suffix (يه, ون, ات, ان, ها, ي, ة, ه, ية, يه), and this may not result in an accurate root extraction.…”
Section: B Preprocessing; citation type: mentioning
confidence: 99%
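
The affix-stripping idea described in the statement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Light10 implementation: the affix lists are taken from the quoted statement, while the function name `light_stem`, the minimum-length guards, and the choice to strip at most one prefix and one suffix are hypothetical simplifications.

```python
# Illustrative light stemmer using the affix lists quoted in the statement above.
# Assumptions (not from the source): the length guards and the
# "one prefix, one suffix" policy are simplifications chosen for this sketch.

INITIAL_LETTER = "و"
PREFIXES = ("لل", "فال", "كال", "بال", "وال", "ال")
SUFFIXES = ("يه", "ون", "ات", "ان", "ها", "ي", "ة", "ه", "ية")

MIN_AFTER_INITIAL = 3  # hypothetical guard: letters that must remain after dropping the initial letter
MIN_STEM = 2           # hypothetical guard: letters that must remain after dropping an affix


def light_stem(word: str) -> str:
    """Strip the initial letter, then one prefix, then one suffix, if the guards allow it."""
    # Step 1: drop the leading conjunction letter when enough of the word remains.
    if word.startswith(INITIAL_LETTER) and len(word) - 1 >= MIN_AFTER_INITIAL:
        word = word[1:]

    # Step 2: drop the longest matching prefix.
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM:
            word = word[len(prefix):]
            break

    # Step 3: drop the longest matching suffix.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            word = word[:-len(suffix)]
            break

    return word


if __name__ == "__main__":
    for w in ("والكتاب", "المعلمون", "كتابها"):
        print(w, "->", light_stem(w))
```

For a word such as المعلمون the sketch returns the stem معلم rather than the three-letter root علم, which is consistent with the statement's remark that stripping affixes alone may not yield an accurate root.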
“…The approach adopted can be summarized in three stages. In the first stage, a Region CNN (RCNN) [29] is used to map image objects to Arabic root words by the aid of a transducer based algorithm for Arabic root extraction [30]. After that, stage two uses a word based RNN with LSTM memory cell to generate the most appropriate words for an image in Modern Standard Arabic (MSA).…”
Section: B Image Caption For Arabic Language; citation type: mentioning
confidence: 99%
“…To achieve this, at any given time when English labels of objects were used in training of the convolution neural network, Arabic root-words of the object were also given as input in the training phase. (Yaseen and Hmeidi, 2014; Yousef et al, 2014) proposed the well-known transducer based algorithm for Arabic root extraction which is used to extract root-words from an Arabic word in the training stage. Given the Arabic influence on root-words and the limited 4 verb prefixes, 12 noun prefixes and 20 common suffixes, the approach is optimized for initial training.…”
Section: Image Fragments To Root-words Using Dnn; citation type: mentioning
confidence: 99%