Urdu Nasta’liq text recognition using implicit segmentation based on multi-dimensional long short term memory neural networks

Naz, Saeeda; Umar, Arif Iqbal; Ahmed, Riaz; Razzak, Muhammad Imran; Rashid, Sheikh Faisal; Shafait, Faisal

doi:10.1186/s40064-016-3442-4

Cited by 44 publications

(23 citation statements)

References 28 publications


“…In some cases, the main body and dots are separately recognized [127] to reduce the total number of unique classes which can be very high (Urdu, for example, has more than 26,000 unique ligatures [128]). The implicit segmentation-based recognition techniques mostly employ different variants of LSTMs [121,129,130] with a connectionist temporal classification (CTC) output layer to recognize characters. A significant proportion of studies targeting recognition of Urdu text employ the publicly available UPTI [131] and CLE [132] datasets.…”

Section: Text Recognition (mentioning)

confidence: 99%

Detection and recognition of cursive text from video frames

Mirza, Zeshan, Atif, et al. 2020

J Image Video Proc.

Textual content appearing in videos represents an interesting index for semantic retrieval of videos (from archives), generation of alerts (live streams), as well as high-level applications like opinion mining and content summarization. The key components of such systems are detection and recognition of textual content, which are also the subject of our study. This paper presents a comprehensive framework for detection and recognition of textual content in video frames. More specifically, we target cursive scripts, taking Urdu text as a case study. Detection of textual regions in video frames is carried out by fine-tuning deep neural network-based object detectors for the specific case of text detection. The script of the detected textual content is identified using convolutional neural networks (CNNs), while for recognition we propose UrduNet, a combination of CNNs and long short-term memory (LSTM) networks. A benchmark dataset containing cursive text with more than 13,000 video frames is also developed. A comprehensive series of experiments is carried out, reporting an F-measure of 88.3% for detection and a recognition rate of 87%.

“…It is suitable for context learning applications. It has been applied on various research tasks in document image analysis specifically relevant to cursive script as reported in [3]-[5] and [9]. The proposed system is investigated by multidimensional LSTM networks because it maintains contextual information and temporarily correlates the new sequences with previous one.…”

Section: B. MDLSTM Network Training for Arabic Scene Text (mentioning)

confidence: 99%

“…The field of text analysis in camera captured images constitute a considerable challenge to address by research community. The work presented in recent years, mostly converged on correct detection of text area in presence of other objects in an image [1], [3], [5]. The scene text can be categorized as a typical OCR problem after text detection and segmentation.…”

Section: Introduction (mentioning)

confidence: 99%

A Novel Dataset for English-Arabic Scene Text Recognition (EASTR)-42K and Its Evaluation Using Invariant Feature Extraction on Detected Extremal Regions

Ahmed, Naz, Razzak, et al. 2019

IEEE Access

Self Cite

The recognition of text in natural scene images is a practical yet challenging task due to the large variations in backgrounds, textures, fonts, and illumination. English as a secondary language is extensively used in Gulf countries along with Arabic script. Therefore, this paper introduces the English-Arabic scene text recognition 42K scene text image dataset. The dataset includes text images appearing in English and Arabic scripts while maintaining the prime focus on Arabic script. The dataset can be employed for the evaluation of text segmentation and recognition tasks. To provide insight to other researchers, experiments have been carried out on the segmentation and classification of Arabic as well as English text, reporting error rates of 5.99% and 2.48%, respectively. This paper presents a novel technique that uses an adapted maximally stable extremal region (MSER) technique and extracts scale-invariant features from the MSER-detected region. To select discriminant and comprehensive features, the size of the invariant features is restricted and only those specific features that exist in the extremal region are considered. The adapted MDLSTM network is presented to tackle the complexities of cursive scene text. The research on Arabic scene text is in its infancy, thus this paper presents benchmark work in the field of text analysis.

Index terms: Cursive script, invariant, extremal, MDLSTM.

“…The multi-dimensional long short term memory recurrent neural network (MDLSTM RNN) with connectionist temporal classification (CTC) as output layer gives 96.40% Urdu character recognition accuracy tested on 1600 text line images UPTI [27]. The character recognition accuracy is further improved by using a MDLSTM RNN with a matured output layer for sequence labeling giving 98% character recognition accuracy tested on UPTI dataset [28]. The hand crafted features are extracted using Convolutional Neural Networks (CNN) which are fed to MDLSTM for Urdu characters training and recognition.…”

Section: Literature Review (mentioning)

confidence: 99%

“…In Figure 6, AIEN character has different shapes at isolated and final positions. The same character labels of different shapes of a character as presented in [26]-[28] may generate confusions during recognition. However, it is the strength of the sequence learning approach which performs well for the recognition of Nastalique text lines having same labels of multiple contextual character shapes.…”

Section: Literature Review (mentioning)

confidence: 99%

Improving Urdu Recognition Using Character-Based Artistic Features of Nastalique Calligraphy

Akram, Hussain 2019

IEEE Access

The state-of-the-art Urdu recognition approaches for Nastalique use features along with the sequence of characters' labels for classification and recognition. In Arabic-like cursive script, the characters are joined together to form a ligature. The conventional methods process the connected stroke of ligatures as a sequence of characters. However, connected stroke of a ligature image has a sequence of pairs of characters and their joiners, instead of a sequence of characters. The character has a distinctive shape that clearly distinguishes it from other characters. The joiner preserves the connecting stroke shape of a character with the next character. In this paper, an implicit Urdu character recognition technique is presented for the Nastalique writing style that is based on recognition of characters and joiners. The detailed analysis of the Nastalique calligraphy is carried out to extract the artistic features of characters and their joiners. The presented technique is tested on Dataset-1 of 1446 ligature classes covering 3 309 762 ligature instances and 91 129 unique Urdu words. In addition, the system is also tested on 1600 text lines of UPTI dataset called Dataset-2. The character recognition accuracies are 95.58% and 98.37% on Dataset-1 and Dataset-2, respectively. The results reveal that the system outperforms the state-of-the-art hidden Markov models and deep learning-based Urdu recognition techniques.
