Indiscapes: Instance Segmentation Networks for Layout Parsing of Historical Indic Manuscripts

Prusty, Abhishek; Aitha, Sowmya; Trivedi, Abhishek; Sarvadevabhatla, Ravi Kiran

doi:10.1109/icdar.2019.00164

Cited by 24 publications

(23 citation statements)

References 36 publications

(41 reference statements)

“…These networks can handle different layouts of printed documents, but require many training examples, more than 1,000 documents in these studies. To the best of our knowledge, the only attempt at applying object detection networks on historical documents was done by Prusty et al. [55]. They have trained Mask R-CNN on 120 to 350 documents to find instances of different page objects, such as text-lines and page boundaries, in historical Indic manuscripts.…”

Section: Neural Network-based Strategies (mentioning)

confidence: 99%

Combination of deep neural networks and logical rules for record segmentation in historical handwritten registers using few examples

Tarride, Lemaitre, Coüasnon, et al. 2021. IJDAR

This work focuses on the layout analysis of historical handwritten registers, in which local religious ceremonies were recorded. The aim of this work is to delimit each record in these registers. To this end, two approaches are proposed. Firstly, object detection networks are explored, as three state-of-the-art architectures are compared. Further experiments are then conducted on Mask R-CNN, as it yields the best performance. Secondly, we introduce and investigate Deep Syntax, a hybrid system that takes advantage of recurrent patterns to delimit each record, by combining U-shaped networks and logical rules. Finally, these two approaches are evaluated on 3708 French records (16th-18th centuries), as well as on the Esposalles public database, containing 253 Spanish records (17th century). While both systems perform well on homogeneous documents, we observe a significant drop in performance with Mask R-CNN on heterogeneous documents, especially when trained on a non-representative subset. By contrast, Deep Syntax relies on steady patterns, and is therefore able to process a wider range of documents with less training data. Not only does Deep Syntax produce 15% more match configurations and reduce the ZoneMap surface error metric by 30% when both systems are trained on 120 images, but it also outperforms Mask R-CNN when trained on a database three times smaller. As Deep Syntax generalizes better, we believe it can be used in the context of massive document processing, as collecting and annotating a sufficiently large and representative set of training data is not always achievable.

“…Extant copies of these early manuscripts written in Greek or Latin and usually dating from the 4th century to the 8th century AD, are classified according to their use of either all upper case or all lower case letters. Several researchers addressed the analysis and recognition of these documents; even considering only those published in the ICDAR 2019 proceedings we can count nine papers ([12, 13, 14, 15, 16, 17, 18, 19, 20]).…”

Section: Historical Documents (mentioning)

confidence: 99%

“…Different text lines are located in Indic historical documents in Reference [14] by using a deep model based on a Mask R-CNN with a ResNet-50 backbone. The different task of instance segmentation (that separates individual objects in the page, e.g., each text line) with respect to semantic segmentation (that aims at identifying pixels belonging to a given object type, e.g., text lines) is taken into account and discussed in the paper.…”

Section: Addressed Problems (mentioning)

confidence: 99%

“…Models that comply with the above mentioned features have also been used in the field of historical documents processing and understanding, as summarized in the following. Prusty et al. [14] use FCN for instance segmentation of text lines and other areas in historical documents and a similar approach is used in Reference [76]. Individual Japanese characters are segmented with FCN in Reference [20].…”

Section: Neural Architectures and Their Applications (mentioning)

confidence: 99%

“…Applications of object detection models for historical documents range from keyword spotting in early printed works [21] by using Faster R-CNN, to the location of different text lines in Indic historical documents considering a Mask R-CNN architecture [14].…”

Section: Neural Architectures and Their Applications (mentioning)

confidence: 99%

Deep Learning for Historical Document Analysis and Recognition—A Survey

Lombardi, Marinai. 2020. J. Imaging

Nowadays, deep learning methods are employed in a broad range of research fields. The analysis and recognition of historical documents, as we survey in this work, is not an exception. Our study analyzes the papers published in the last few years on this topic from different perspectives: we first provide a pragmatic definition of historical documents from the point of view of the research in the area, then we look at the various sub-tasks addressed in this research. Guided by these tasks, we go through the different input-output relations that are expected from the used deep learning approaches and therefore we accordingly describe the most used models. We also discuss research datasets published in the field and their applications. This analysis shows that the latest research is a leap forward since it is not the simple use of recently proposed algorithms to previous problems, but novel tasks and novel applications of state-of-the-art methods are now considered. Rather than just providing a conclusive picture of the current research in the topic, we lastly suggest some potential future trends that can represent a stimulus for innovative research directions.

A Robust Method for Text, Line, and Word Segmentation for Historical Arabic Manuscripts

Elharrouss, Al‐Maadeed, Alja’am, et al. 2020. Data Analytics for Cultural Heritage
