Printed Arabic Script Recognition: A Survey

Alghamdi, Mansoor; Teahan, William J.

doi:10.14569/ijacsa.2018.090953

Cited by 10 publications

(4 citation statements)

References 103 publications


“…Alghamdi and Teahan [39] discussed the most commonly used datasets for training and evaluation of OCR systems for printed Arabic script, including the IFN/ENIT Arabic handwritten dataset, the "Handwriting Arabic Corpus" (HAC) dataset, and the RIMES dataset containing a large collection of printed and handwritten documents. The authors provide an overview of the available datasets and emphasize the importance of high-quality datasets for improving the accuracy of OCR systems.…”

Section: Dataset (mentioning)

confidence: 99%

A Survey of OCR in Arabic Language: Applications, Techniques, and Challenges

et al. 2023

Optical character recognition (OCR) is the process of extracting handwritten or printed text from a scanned or printed image and converting it to a machine-readable form for further data processing, such as searching or editing. Automatic text extraction using OCR helps to digitize documents for improved productivity and accessibility and for preservation of historical documents. This paper provides a survey of the current state-of-the-art applications, techniques, and challenges in Arabic OCR. We present the existing methods for each step of the complete OCR process to identify the best-performing approach for improved results. This paper follows the keyword-search method for reviewing the articles related to Arabic OCR, including the backward and forward citations of each article. In addition to state-of-the-art techniques, this paper identifies research gaps and presents future directions for Arabic OCR.


“…Additionally, natural languages that use the Arabic writing system extends the base alphabets by adding special diacritics over some characters to better adapt the writing system to the phonemes of the designated language. A thorough discussing about these challenges can be found in [1]. All these characteristics make the recognition of Arabic text a challenging task, especially for the models that depend on segmenting characters prior to the recognition process [2].…”

Section: Challenges Related To Arabic Text Recognition (mentioning)

confidence: 99%

“…Further, the convolution process in the model employed zero padding so that it can preserve the size of the input image throughout the convolution process. The pooling process in the initial two layers used a sliding window of size (2x2) while the remaining three layers used a window of size (1,2).…”

Section: Proposed Model (mentioning)

confidence: 99%

A Hybrid Deep Learning Model for Arabic Text Recognition

Fasha, Hammo, Obeid et al. 2020, IJACSA

Arabic text recognition is a challenging task because of the cursive nature of the Arabic writing system, its joint writing scheme, the large number of ligatures, and many other challenges. Deep Learning (DL) models have achieved significant progress in numerous domains, including computer vision and sequence modelling. This paper presents a model that can recognize Arabic text printed using multiple font types, including fonts that mimic Arabic handwritten scripts. The proposed model employs a hybrid DL network that can recognize Arabic printed text without the need for character segmentation. The model was tested on a custom dataset comprising over two million word samples generated using 18 different Arabic font types. The objective of the testing process was to assess the model's capability in recognizing a diverse set of Arabic fonts representing varied cursive styles. The model achieved good results in recognizing characters and words, and it also achieved promising results in recognizing characters when tested on unseen data. The prepared model, the custom datasets, and the toolkit for generating similar datasets are made publicly available; these tools can be used to prepare models for recognizing other font types as well as to further extend and enhance the performance of the proposed model.
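The citation statement for this paper's "Proposed Model" section says that zero padding preserves the image size through each convolution, and that pooling uses a (2x2) window in the first two layers and a (1,2) window in the remaining three. The sketch below only traces how the feature-map size would shrink under that description. It is not the authors' code. The 32 x 128 input size, the (height, width) reading of each window, and a stride equal to the window size are all assumptions made for illustration.

```python
def pooled_shapes(height, width, windows):
    """Trace feature-map size through conv ('same' padding) + pooling stages."""
    shapes = []
    for win_h, win_w in windows:
        # A zero-padded ("same") convolution leaves height and width unchanged,
        # so only the pooling step shrinks the map.
        # Assumption: pooling stride equals the window size (non-overlapping).
        height //= win_h
        width //= win_w
        shapes.append((height, width))
    return shapes


# Two (2, 2) windows followed by three (1, 2) windows, as in the quoted text.
# Assumption: each tuple is (height, width); the source does not say which
# axis the "1" applies to.
WINDOWS = [(2, 2), (2, 2), (1, 2), (1, 2), (1, 2)]

# Hypothetical input size, chosen only to make the arithmetic easy to follow.
shapes = pooled_shapes(32, 128, WINDOWS)
print(shapes)
```

Under these assumptions the later layers stop halving one axis while continuing to halve the other, which keeps resolution along the unpooled axis.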

“…As stated in [17], many works were introduced that utilizes fuzzy logic within Arabic OCR applications. In [18], some of these approaches, features are modeled by fuzzy linguistic variables, and fuzzy rules are then used for classification.…”

Section: Related Work (mentioning)

confidence: 99%

An Enhanced Offline Printed Arabic OCR Model Based on Bio-Inspired Fuzzy Classifier

Darwish, Elzoghaly 2020, IEEE Access

In recent years, there has been concentrated research on Arabic Optical Character Recognition (OCR), especially the recognition of scanned, offline, machine-printed documents. However, Arabic OCR results remain unsatisfactory, and the field is still a developing research area. Finding the best feature extraction techniques and selecting an appropriate classification algorithm lead to high recognition accuracy and low computational overhead. This paper presents a new Arabic OCR model that integrates a Genetic Algorithm (GA) and the Fuzzy K-Nearest Neighbor classifier (F-KNN) in a unified framework to enhance identification accuracy. GA is utilized as a feature selection algorithm that has better convergence and spread of solutions with a candid variation preservation mechanism. The F-KNN algorithm is more appropriate for classifying ambiguous or uncertain data objects, in the sense that every object belongs to all classes with different degrees of membership. The suggested model semantically fuses bio-inspired feature vectors with the fuzzy KNN classifier to build an accurate membership function for each class. Experimental results compared to other approaches revealed the effectiveness of the suggested model and demonstrated that the feature selection approach increased identification accuracy.
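To illustrate the idea that an F-KNN classifier assigns each object a degree of membership in each class, here is a minimal sketch of the widely used inverse-distance-weighted fuzzy k-nearest-neighbor rule with crisp training labels. It is an assumption that this formulation resembles the paper's classifier. The GA feature-selection stage is omitted, and the two-dimensional feature vectors and the labels "alef" and "ba" are hypothetical toy data.

```python
import math
from collections import defaultdict


def fuzzy_knn(train, query, k=3, m=2.0, eps=1e-9):
    """Return {label: membership} for `query`; memberships sum to 1.

    train: list of (feature_vector, label) pairs with crisp labels.
    m:     fuzzifier (> 1); larger values flatten the distance weighting.
    """
    # Keep the k training points closest to the query (Euclidean distance).
    nearest = sorted((math.dist(x, query), label) for x, label in train)[:k]
    weights = defaultdict(float)
    for dist, label in nearest:
        # Inverse-distance weight; eps avoids division by zero on exact matches.
        weights[label] += 1.0 / (dist ** (2.0 / (m - 1.0)) + eps)
    total = sum(weights.values())
    # Classes with no neighbor among the k nearest are simply absent
    # (their membership is zero).
    return {label: w / total for label, w in weights.items()}


# Hypothetical toy data: 2-D feature vectors for two character classes.
train = [
    ((0.0, 0.0), "alef"),
    ((0.0, 1.0), "alef"),
    ((3.0, 3.0), "ba"),
    ((4.0, 3.0), "ba"),
]
memberships = fuzzy_knn(train, (1.0, 1.0), k=3)
print(memberships)
```

Unlike a crisp k-NN vote, the result keeps a graded score per class, so an ambiguous sample near a class boundary shows up as comparable memberships rather than a hard label.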
