2009 10th International Conference on Document Analysis and Recognition 2009
DOI: 10.1109/icdar.2009.62

An Open Source Tesseract Based Optical Character Recognizer for Bangla Script

Citations: cited by 25 publications (12 citation statements)
References: 2 publications
“…17,18 Although the accuracy of the character recognition depends on the image conditions, some studies using Tesseract OCR have reported 70% or higher accuracy for grayscale images. 19,20 Because the text recorded in this study contains only numeric characters, some confusing alphabetical characters and symbols (i.e., o, l, I, and B) were automatically replaced by numeric characters to avoid the recognition failure. The values were collected every 200 ms and the median of five values was collected every second to eliminate any errors due to image lag.…”
Section: C Beam Linearity and Consistency (mentioning)
confidence: 99%
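The post-processing described in the statement above (replacing confusable letters with digits, then taking the median of five readings per second) can be sketched as follows. This is only an illustration, not the citing authors' code: the function names are hypothetical, and the letter-to-digit mapping is an assumption, since the statement names the confusable characters (o, l, I, B) but not their replacements.

```python
from statistics import median
from typing import List, Optional

# Assumed mapping: the quoted statement lists o, l, I, and B as confusable
# characters but does not say which digit replaces each one.
CONFUSABLE_TO_DIGIT = str.maketrans({"o": "0", "l": "1", "I": "1", "B": "8"})


def normalize_reading(raw: str) -> Optional[float]:
    """Replace confusable letters with digits and parse the result as a number.

    Returns None when the cleaned text still cannot be parsed.
    """
    cleaned = raw.strip().translate(CONFUSABLE_TO_DIGIT)
    try:
        return float(cleaned)
    except ValueError:
        return None


def median_per_second(raw_samples: List[str], samples_per_second: int = 5) -> List[float]:
    """Group OCR readings into one-second windows and keep the median of each.

    With one reading every 200 ms, five readings make up one second.
    Unparseable readings are skipped; incomplete trailing windows are dropped.
    """
    medians = []
    last_start = len(raw_samples) - samples_per_second
    for start in range(0, last_start + 1, samples_per_second):
        window = raw_samples[start:start + samples_per_second]
        values = [v for v in map(normalize_reading, window) if v is not None]
        if values:
            medians.append(median(values))
    return medians
```

Taking the median rather than the mean means a single misread value within a one-second window does not shift the recorded result.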
“…Future works could also analyze the impact of an automatic correction method based on machine learning. As proposed in [Hasnat et al 2009], some correction methods can be implemented to correct spelling mistakes based on information that has a specific format and predefined rules, such as, date, hour, total amount and etc.…”
Section: Discussion (mentioning)
confidence: 99%
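As a rough illustration of the format-based checks mentioned in the statement above, the sketch below flags OCR fields that violate a predefined rule. It is not the method of Hasnat et al. or of the citing paper: the field names, the date and hour formats, and the amount pattern are all assumptions chosen for the example.

```python
import re
from datetime import datetime
from typing import Dict, List


def _matches_time_format(text: str, fmt: str) -> bool:
    """Return True when text parses under the given strptime format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


# Hypothetical rules: one validity check per field type with a fixed format.
FIELD_RULES = {
    "date": lambda t: _matches_time_format(t, "%d/%m/%Y"),
    "hour": lambda t: _matches_time_format(t, "%H:%M"),
    "total_amount": lambda t: re.fullmatch(r"\d+\.\d{2}", t) is not None,
}


def fields_needing_correction(fields: Dict[str, str]) -> List[str]:
    """List the fields whose OCR text breaks the rule for that field type.

    Fields without a rule are ignored.
    """
    return [
        name
        for name, text in fields.items()
        if name in FIELD_RULES and not FIELD_RULES[name](text.strip())
    ]
```

A field flagged this way could then be passed to whatever correction step is in use; the check itself only detects that the recognized text does not fit the expected format.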
“…Hasnat et al [Hasnat et al 2009] designed an OCR process software for the Bengali language in combination with the Tesseract library, which was called BanglaOCR. This paper focused mainly on Tesseract training and post-processing techniques.…”
Section: Related Work (mentioning)
confidence: 99%