2021
DOI: 10.18287/2412-6179-co-759

Optimal affine image normalization approach for optical character recognition

Abstract: Optical character recognition (OCR) in images captured from arbitrary angles requires preliminary normalization, i.e. a geometric transformation resulting in an image as if it was captured at an angle suitable for OCR. In most cases, a surface containing characters can be considered flat, and a pinhole model can be adopted for a camera. Thus, in theory, the normalization should be projective. Usually, the camera optical axis is approximately perpendicular to the document surface, so the projective normalizatio…

Cited by 5 publications (3 citation statements)
References 26 publications
“…According to the traditional pattern recognition method, the original data must be preprocessed and manually selected and extracted before classification, and it is often the most difficult to determine the best classification features of the original data. If you ignore this step and directly input each pixel of the original image as a feature point, when the input image is too large, the data to be processed is unprecedentedly huge, which will lead to dimensional disaster, which is obviously not feasible in reality [9][10]. Moreover, various characters in natural scenes, especially HCs, have relatively large changes.…”
Section: CNN Theory (mentioning)
confidence: 99%
“…Firstly, the invoice image in the dataset is gray-scaled, which preserves the brightness and contrast of the whole image on the basis of retaining the current image information [13].…”
Section: Image Preprocessing (mentioning)
confidence: 99%
“…The complex Network structure formed by the sequential linking of multiple Artificial neurons that can realize specific prediction function is called Artificial Neural Network (ANN) [13]. The core idea of artificial neural network is to set and organize a linear model with nonlinear activation functions (such as Sigmoid, TANh, Relu function, etc.)…”
Section: Figure 1 Schematic Diagram Of Artificial Neuron (mentioning)
confidence: 99%