The recognition of Arabic script and its derivatives such as Urdu, Persian, Pashto etc. is a difficult task due to complexity of this script. Particularly, Urdu text recognition is more difficult due to its Nasta’liq writing style. Nasta’liq writing style inherits complex calligraphic nature, which presents major issues to recognition of Urdu text owing to diagonality in writing, high cursiveness, context sensitivity and overlapping of characters. Therefore, the work done for recognition of Arabic script cannot be directly applied to Urdu recognition. We present Multi-dimensional Long Short Term Memory (MDLSTM) Recurrent Neural Networks with an output layer designed for sequence labeling for recognition of printed Urdu text-lines written in the Nasta’liq writing style. Experiments show that MDLSTM attained a recognition accuracy of 98% for the unconstrained Urdu Nasta’liq printed text, which significantly outperforms the state-of-the-art techniques.
This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT data-set consists of complex patterns of handwritten Arabic text-lines. This paper contributes mainly in three aspects i.e., (1) pre-processing, (2) deep learning based approach, and (3) data-augmentation. The pre-processing step includes pruning of white extra spaces plus de-skewing the skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflammation. The data-augmentation with a deep learning approach proves to achieve better and promising improvement in results by gaining 80.02% Character Recognition (CR) over 75.08% as baseline.
Optical Character Recognition (OCR) of cursive scripts like Pashto and Urdu is difficult due the presence of complex ligatures and connected writing styles. In this paper, we evaluate and compare different approaches for the recognition of such complex ligatures. The approaches include Hidden Markov Model (HMM), Long Short Term Memory (LSTM) network and Scale Invariant Feature Transform (SIFT). Current state of the art in cursive script assumes constant scale without any rotation, while real world data contain rotation and scale variations. This research aims to evaluate the performance of sequence classifiers like HMM and LSTM and compare their performance with descriptor based classifier like SIFT. In addition, we also assess the performance of these methods against the scale and rotation variations in cursive script ligatures. Moreover, we introduce a database of 480,000 images containing 1000 unique ligatures or sub-words of Pashto. In this database, each ligature has 40 scale and 12 rotation variations. The evaluation results show a significantly improved performance of LSTM over HMM and traditional feature extraction technique such as SIFT.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.