Handbook of Document Image Processing and Recognition 2014
DOI: 10.1007/978-0-85729-859-1_6

Analysis of the Logical Layout of Documents

Cited by 14 publications (8 citation statements)
References 42 publications
“…In addition, Dengel and Shafait [1] pointed out that understanding books was researched in a different way, by analysing page sections and generating a table of contents to make the digitised books searchable. They also found that all the research data sets they were aware of were not publicly available.…”
Section: Related Work (mentioning)
confidence: 99%
“…Dengel and Shafait [1] offered a review of the state of the art, which included six main approaches for logical labelling. Many of them require the existence of additional information like OCR results or document domain knowledge, for example, knowledge about the layout of business letters or invoices.…”
Section: Related Work (mentioning)
confidence: 99%
“…Some of these have been addressed by the software developed for the Alberti Magni e-corpus project. In particular, after preparing the scholar's texts in a suitable XML-tagged form, a system built on top of sgrep for search and Dragoman for display can address many of those needs. Alternative XML-aware search engines (such as BaseX [20], eXist [21], Wumpus [3], or XQEngine [16]) could equally well have been used in this project, simplifying some solutions but requiring more effort to address other concerns.…”
Section: Retrospective and Further Work (mentioning)
confidence: 99%
“…The most difficult part of starting a project with a new corpus is to convert the text into XML that reflects its logical structure, an extremely challenging task when physical layout must be interpreted [8], but also quite challenging when the input is plain text with embedded font information. In most of the text, each feature to be tagged can be recognized fairly easily, but unexpected difficulties arise when the features overlap in unanticipated ways.…”
Section: Retrospective and Further Work (mentioning)
confidence: 99%