2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR) 2018
DOI: 10.1109/icfhr-2018.2018.00011

dhSegment: A Generic Deep-Learning Approach for Document Segmentation

Abstract: In recent years there have been multiple successful attempts tackling document processing problems separately by designing task specific hand-tuned strategies. We argue that the diversity of historical document processing tasks prohibits to solve them one at a time and shows a need for designing generic approaches in order to handle the variability of historical series. In this paper, we address multiple tasks simultaneously such as page extraction, baseline extraction, layout analysis or multiple typologies o…

Cited by 136 publications (168 citation statements)
References: 18 publications

“…In the same idea, [12] also investigated such deep architectures to classify identity documents. [13] goes even further by trying to segment the full layout of a document image into paragraphs, titles, ornaments, images etc. These models focus on extracting strong visual features from the images to classify the documents based on their layout, geometry, colors and shape.…”
Section: Related Work (mentioning)
confidence: 99%
“…For industrial-grade applications dealing with user-generated content, such a data augmentation is necessary to alleviate overfitting and reduce the gap between train and actual data. Preprocessing page segmentation and layout analysis tools, such as dhSegment [13] can also bring significant improvements by renormalizing image orientation and cropping the document before sending it to the classifier. Moreover, as we have seen, the post-OCR word embeddings include lots of noisy or completely wrong words that generate OOV errors.…”
Section: Limitations (mentioning)
confidence: 99%
“…The learning problem is how to adjust the HMM parameters ($a_{ij}$, $b_i(x)$, $c_{jg}$, $\mu_{jg}$ and $\Sigma_{jg}$), so that a given set of observations (called training set) is generated by the model with maximum likelihood. The Baum-Welch algorithm [20] (also known as Forward-Backward algorithm), is used to find these unknown parameters. It is an expectation-maximization (EM) algorithm.…”
Section: The Learning Problem and The Baum-welch Algorithm (mentioning)
confidence: 99%
“…Throughout the years several surveys have been performed [8,13,14,19,20,24] to compile this work and classify the underlying strategies used to tackle this task. I do not intend to provide a survey as complex and detailed as the aforementioned articles.…”
Section: State Of The (mentioning)
confidence: 99%