2022
DOI: 10.1007/978-3-031-06555-2_11

Importance of Textlines in Historical Document Classification

Cited by 6 publications (2 citation statements)
References 8 publications
“…Page classification. Page classification of historical documents usually consists in associating each page with a class that describes the period of the document, its place of origin, the script used or the author of the document [39,24]. In our case, the classification task would help to discard pages without any act (cover, blank page, index.…”
Section: Step-by-step Workflow (mentioning)
confidence: 99%
“…Although Tesseract's accuracy varies across different datasets, the accuracy of the OCR engine can be significantly improved through image preprocessing techniques [5][6][7]. For instance, studies have shown that Tesseract OCR achieves an F1 score of 0.163 on the Brno Mobile OCR Dataset [8], but through pre-processing, the F1 score can increase up to 0.729 [9]. To evaluate the impact of pre-processing on Tesseract's accuracy, we conducted a preliminary analysis using 560 images of phone screen menus captured in an indoor setup with a mounted camera.…”
Section: Introduction (mentioning)
confidence: 99%