Labeling, Cutting, Grouping: An Efficient Text Line Segmentation Method for Medieval Manuscripts

Alberti, Michele; Vögtlin, Lars; Pondenkandath, Vinaychandran; Seuret, Mathias; Ingold, Rolf; Liwicki, Marcus

doi:10.1109/icdar.2019.00194

Cited by 29 publications

(34 citation statements)

References 26 publications

“…The authors demonstrated its genericity by successfully solving five semantic segmentation tasks on historical documents: page extraction, text line extraction, structure detection, decoration detection and photo detection. Albertini et al [3] have used the DeepDIVA framework [2] to obtain high quality semantic segmentation before extracting text-lines. Alaasam et al [1] have used siamese networks at the patch level for semantic segmentation of challenging historical Arabic manuscripts.…”

Section: Neural Network-based Strategies (mentioning)

confidence: 99%

Combination of deep neural networks and logical rules for record segmentation in historical handwritten registers using few examples

Tarride, Lemaitre, Coüasnon, et al. 2021. IJDAR

This work focuses on the layout analysis of historical handwritten registers in which local religious ceremonies were recorded. The aim is to delimit each record in these registers. To this end, two approaches are proposed. First, object detection networks are explored by comparing three state-of-the-art architectures; further experiments are then conducted on Mask R-CNN, as it yields the best performance. Second, we introduce and investigate Deep Syntax, a hybrid system that takes advantage of recurrent patterns to delimit each record by combining U-shaped networks and logical rules. Finally, these two approaches are evaluated on 3708 French records (16th-18th centuries), as well as on the Esposalles public database, containing 253 Spanish records (17th century). While both systems perform well on homogeneous documents, we observe a significant drop in performance with Mask R-CNN on heterogeneous documents, especially when trained on a non-representative subset. By contrast, Deep Syntax relies on steady patterns and is therefore able to process a wider range of documents with less training data. Not only does Deep Syntax produce 15% more match configurations and reduce the ZoneMap surface error metric by 30% when both systems are trained on 120 images, but it also outperforms Mask R-CNN when trained on a database three times smaller. As Deep Syntax generalizes better, we believe it can be used in the context of massive document processing, as collecting and annotating a sufficiently large and representative set of training data is not always achievable.

“…After semantic segmentation, we further segment the pixels classified as main text into individual text columns using seam carving, a well-established technique for text line segmentation in historical document images [13]. In this work, we use the recently introduced seam carving method proposed by Alberti et al [7], which has achieved a strong performance on several medieval manuscript datasets. The result of text column segmentation are tight polygons around the foreground pixels of the individual text columns.…”

Section: Text Column Segmentation (mentioning)

confidence: 99%

“…Finally, the CCs are clustered with respect to the number of seams to their right in order to form text columns and tight polygons are computed around all main text pixels. For more details on the seam carving method, we refer to [7].…”

Section: Text Column Segmentation (mentioning)

confidence: 99%

Layout Analysis and Text Column Segmentation for Historical Vietnamese Steles

Scius-Bertrand, Voegtlin, Alberti, et al. 2019. Proceedings of the 5th International Workshop on Historical Document Imaging and Processing

Self Cite

Stone engravings on historical Vietnamese steles allow historians to study the life of common people in the villages. Only recently has a large number of images of such engravings become available. To support historians, automatic document analysis systems are needed for reading the ancient Chu Nôm characters, which are written in columns from top to bottom. In this paper, we study the problem of layout analysis, which is the first step of automatic reading. Semantic segmentation is applied at the pixel level to find the title, main text, label, and reference number on the page using deep convolutional neural networks. Afterwards, seam carving is used to segment the text columns within the main text. We present baseline results for one hundred exemplary pages, discuss error cases, and outline lines of future research.
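The citation statement from this paper says that connected components (CCs) are clustered by the number of seams to their right in order to form text columns. The following is a minimal sketch of that grouping step only, not the authors' implementation. It assumes each CC is reduced to a centroid and each seam is stored as one column index per image row; the function name `group_by_seams_to_right` and the toy data are hypothetical.

```python
from collections import defaultdict


def group_by_seams_to_right(centroids, seams):
    """Group connected components by how many seams lie to their right.

    centroids: list of (row, col) centroids, one per connected component
               (assumed representation, not taken from the paper).
    seams:     list of seams; each seam is a list giving the seam's column
               index at every image row (assumed representation).
    Returns a dict mapping seam count -> list of component indices.
    """
    groups = defaultdict(list)
    for index, (row, col) in enumerate(centroids):
        # A seam is "to the right" if, at the centroid's row, it sits at a
        # larger column index than the centroid itself.
        count = sum(1 for seam in seams if seam[row] > col)
        groups[count].append(index)
    return dict(groups)


# Hypothetical toy page with 4 rows and two roughly vertical seams.
seams = [[10, 10, 11, 10], [20, 21, 20, 20]]
centroids = [(0, 25), (2, 24), (1, 15), (3, 5)]
columns = group_by_seams_to_right(centroids, seams)
```

Components that share a seam count fall between the same pair of seams, so each group corresponds to one candidate text column. Computing the seams themselves and the tight polygons around the foreground pixels is outside the scope of this sketch.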

“…We can now process a document with both machine-printed text and handwritten text and then recognize them separately [4,5]. Similar applications can be found in the archiving and processing of historical documents [6,7]. In the field of education, related technologies for examination paper autoscoring have emerged, which greatly reduce burden for teachers and students.…”

Section: Introduction (mentioning)

confidence: 99%

Separating Chinese Character from Noisy Background Using GAN

Huang, Lin, Chen, et al. 2021. Wireless Communications and Mobile Computing

Separating printed or handwritten characters from a noisy background is valuable for many applications, including test paper autoscoring. The complex structure of Chinese characters makes this difficult, because fine details and overall structure are easily lost in the reconstructed characters. This paper proposes a method for separating Chinese characters based on a generative adversarial network (GAN). We used ESRGAN as the basic network structure and applied dilated convolution and a novel loss function that improve the quality of reconstructed characters. Four popular Chinese fonts (Hei, Song, Kai, and Imitation Song) were tested on a real data collection, and the proposed design was compared with other semantic segmentation approaches. The experimental results showed that the proposed method effectively separates Chinese characters from a noisy background. In particular, our method achieves better results in terms of Intersection over Union (IoU) and optical character recognition (OCR) accuracy.
