2014
DOI: 10.1007/978-0-85729-859-1

Handbook of Document Image Processing and Recognition

Abstract: Tables and forms are a very common way to organize information in structured documents. Their recognition is fundamental for the recognition of the documents. Indeed, the physical organization of a table or a form gives a lot of information concerning the logical meaning of the content. This chapter presents the different tasks that are related to the recognition of tables and forms and the associated well-known methods and remaining…

B. Coüasnon

Cited by 81 publications
(12 citation statements)
References 57 publications
“…Example confusion table showing the desired output (GT), the actual output (OCR), the absolute number of occurrences (CNT), and the corresponding percentage with respect to all errors (PERC). Given are the five most frequent errors including substitutions (4), deletions (2,5), and insertions (1,3). output where the line-based OCR results are concatenated in reading order and stored as a text file in two variants, one for each individual page and one for the entire book.…”
Section: Results Generation (mentioning)
confidence: 99%
“…Books printed before 1501. 1. Preprocessing: First of all, the input images have to be prepared for further processing.…”
Section: Steps of a Typical OCR Workflow (mentioning)
confidence: 99%
“…While Optical Character Recognition (OCR) is regularly considered to be a solved problem [1], gathering the textual content of historical printings using OCR can still be a very challenging and cumbersome task, due to various reasons. Among the problems that need to be addressed for early printings is the often intricate layout containing images, artistic border elements and ornaments, marginal notes, and swash capitals at section beginnings whose positioning is often highly irregular.…”
Section: Introduction (mentioning)
confidence: 99%
“…Contents of the collections are one of the key elements of usefulness of the collections, but also presentation of the contents for the user is important [2,3]. According to Dengel and Shafait [4] "availability of logical structure facilitates navigation and advanced search inside the document as well as enables better presentation of the document in a possibly restructured format." Possibility to use article structure will also improve further analysis stages of the content, such as topic modeling or any other kind of content analysis.…”
Section: Introduction (mentioning)
confidence: 99%
“…Several digitized historical newspaper collections have implemented article extraction on their pages. Good examples are for example Italian La Stampa 2 , British Newspaper Archive 3 , and Australian Trove 4 . The historical digital newspaper archive environment of the NLF is based on commercial docWorks 5 software.…”
Section: Introduction (mentioning)
confidence: 99%