2009
DOI: 10.1007/978-3-642-03761-0_17

Book Layout Analysis: TOC Structure Extraction Engine

Cited by 14 publications (7 citation statements)
References 1 publication
“…Recent algorithms have explored TOC extraction by parsing TOC pages and extract the hierarchical structure of sections and subsections. Most methods in this area have been developed in the context of the INEX [20] and ICDAR competitions [21][22][23] which, as we have mentioned before, focus on long and old digitised historical books, as opposed to short scientific articles with previous methods. To the best of our knowledge, the only work led outside these competitions on the topic of TOC page parsing is [24,25], who apply a rule-based approach to PDF document layout analysis.…”
Section: Toc Extraction Methods (mentioning)
confidence: 99%
“…Most of the approaches of the state of the art rely on the detection of ToC pages within the book, and their detailed analysis for listing all ToC entries and linking them to the corresponding pages. To extract ToC entries and link them to the right page, the most effective technique to date remains the one developed by Dresevic et al [13] which consists in recognizing ToC pages and then processing them so as to extract all ToC entries using a supervised method relying on pattern occurrences from an external training set. However, it is worth noting that the approach of Gander et al [16] performs better for the sole ToC entry extraction (not taking page-linking into account).…”
Section: Approaches Presented (mentioning)
confidence: 99%
“…The state-of-the-art approach belongs to this type and is developed by Dresevic et al [5](MDCS). It also recognises TOC pages and assign each physical page with a logical page number.…”
Section: Related Work (mentioning)
confidence: 99%