Arabic script is one of the most widely used alphabetic writing systems in the world, and it has 28 basic letters. The alphabet was first used to write texts in Arabic, most notably the Qur'an, the holy book of Islam. Diacritics on Arabic words and letters are not optional extras; they are an essential part of the language. Changing a diacritic can change both the syntax and the semantics of a word, turning it into a different word. Existing research, however, concentrates on the main foreground image and treats diacritics as noise or secondary images. Such approaches are not suitable for Arabic handwriting, because removing the diacritics from the image discards useful features. To extract the diacritics, we use a region-based segmentation technique. Each image is measured by its region properties: we first find the connected components in the binary image, and then determine the best area range in which to select regions for that image. The proposed region-based technique was tested on nine images with different handwriting styles, and it successfully extracted the secondary foreground images (diacritics) from each image.
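The region-based idea described above (label the connected components of a binary image, then keep only the regions whose area falls within a chosen range) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `label_components` and `extract_by_area`, the use of 8-connectivity, the toy image, and the area range of 1 to 3 pixels are all assumptions made for illustration. The abstract states that the area range is determined separately for each image, so no fixed threshold should be read into this example.

```python
from collections import deque


def label_components(binary):
    """Return the 8-connected foreground regions as lists of (row, col) pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def extract_by_area(binary, min_area, max_area):
    """Keep only the regions whose pixel count lies in [min_area, max_area]."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for pixels in label_components(binary):
        if min_area <= len(pixels) <= max_area:
            for y, x in pixels:
                out[y][x] = 1
    return out


# Toy binary image (hypothetical): one large stroke standing in for a letter
# body, plus two small marks standing in for diacritics.
image = [
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
]

# Assumed area range for this toy image only.
diacritics = extract_by_area(image, min_area=1, max_area=3)
for row in diacritics:
    print("".join("#" if v else "." for v in row))
```

In this toy case the large stroke is dropped and the two small marks are kept. On real handwriting the range would need tuning per image, since small letter fragments and large diacritics can overlap in area.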
In modern standard Arabic and dialectal Arabic texts, diacritics and short vowels are generally absent. Exceptions are made for scripts aimed at beginning learners of Arabic, religious texts, and significant political texts. Text without diacritics is considered ambiguous, because many words that differ only in their diacritic marks appear identical. In this paper we present a framework for segmenting diacritics from Arabic handwritten documents using a region-based segmentation technique. Because Arabic handwriting and the Mushaf Al-Quran contain many diacritical marks, the diacritics must be properly extracted from Arabic handwritten documents to avoid losing useful features. The proposed framework is devised specifically to segment diacritics from Arabic handwritten images, so it does not include feature extraction, feature selection, or classification. We present the methodology used to fulfil the objectives of this paper, and we explain the pre-processing phases, with particular attention to the segmentation phase for diacritics, which is the focus of this article. Lastly, we describe the proposed region-based segmentation technique that supports our development throughout the experimental process.