2022
DOI: 10.3390/app122211760

Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory

Abstract: Optical Character Recognition (OCR) is a system for converting images that contain text into editable text, and it is applied to various languages such as English, Arabic, and Persian. While these languages have similarities, their fundamental differences can create unique challenges. In Persian, continuity between characters, the existence of semicircles, dots, oblique, and left-to-right characters such as English words in the context are some of the most important challenges in designing Persian OCR systems. Our pr…

Cited by 3 publications (1 citation statement)
References 49 publications
“…In [11], the authors proposed an Urdu Nastaliq Handwritten Dataset (UNHD), which is written by 500 writers on A4-size paper and is available on request (https://www.kaggle.com/datasets/drsaadbinahmed/unhd-dataset, accessed on 28 March 2023). Khosrobeigi et al [36] also presented a Persian language dataset; this dataset is collected from different Persian-language new websites, and the description of the dataset is shown in Table 2; this dataset is split into 80% for training and 20% for testing purpose. There are some datasets available that are used for handwritten text recognition of Urdu, and, as we know, Urdu and Arabic use the same vocabulary and alphabet as well.…”
Section: Dataset | Type: mentioning
confidence: 99%