2009 XXII Brazilian Symposium on Computer Graphics and Image Processing
DOI: 10.1109/sibgrapi.2009.40
Automatic Discrimination between Printed and Handwritten Text in Documents

Abstract: Recognition techniques for printed and handwritten text in scanned documents are significantly different. In this paper we address the problem of identifying each type. We can list at least four steps: digitization, preprocessing, feature extraction, and decision or classification. A new aspect of our approach is the use of data mining techniques in the decision step. A new set of features extracted from each word is proposed as well. Classification rules are mined and used to discern printed text from handwri…
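The abstract outlines a four-step pipeline ending in a rule-based decision stage. As a rough illustration only, the Python sketch below computes a few hypothetical word-level features and applies a toy IF-THEN rule; the feature names, the feature set, and the thresholds are assumptions made for this sketch, not the authors' actual features or mined rules.

import numpy as np

def word_features(word_img):
    """Compute simple features from a binary word image (1 = ink, 0 = background).
    This is a hypothetical feature set, not the paper's proposed one."""
    h, w = word_img.shape
    ink = word_img.sum()
    density = ink / (h * w)                    # ink fraction of the bounding box
    aspect = w / h                             # bounding-box aspect ratio
    row_profile = word_img.sum(axis=1)         # ink per row
    profile_var = row_profile.var() / max(row_profile.mean(), 1e-6)
    return {"density": density, "aspect": aspect, "profile_var": profile_var}

def classify(features):
    """Toy IF-THEN rule in the spirit of mined classification rules;
    the thresholds here are invented for illustration."""
    if features["profile_var"] < 0.5 and features["density"] > 0.15:
        return "printed"
    return "handwritten"

Printed words tend to show a more regular row profile (low variance around a stable x-height), which is the intuition the toy rule encodes.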

Cited by 28 publications (12 citation statements)
References 21 publications
“…The noise in the data resulted in binarisation failure and missing data in Figure 7-d, while the overlapping line created an occluded segment in Figure 7-c. On the other hand, although the handwritten segments shown in Figure 7-e to -h closely resemble machine-printed text, they are correctly classified as handwritten, since none of the gallery characters produces a matching score higher than the threshold T. Because of this high similarity to machine-printed samples, geometric features such as area and rectangularity, used in [6], [7], [10], [12], fail to create separable classes, and such samples would wrongly be classified as machine-printed. Table 1 compares our approach with the results provided by Zagoris et al. [3], [4], in which 15% of the samples in the PRImA-NHM dataset are utilised for training and the remaining 85% for testing.…”
Section: Gallery Creation and the HMC Results
confidence: 99%
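The contrast drawn above, between purely geometric features and gallery matching against a threshold T, can be sketched as follows; rectangularity follows its usual definition (ink area over bounding-box area), while the gallery, the match_score function, and the default threshold value are placeholders, not the cited method's actual implementation.

import numpy as np

def rectangularity(component):
    """Ratio of ink area to bounding-box area for a binary component."""
    ys, xs = np.nonzero(component)
    bbox_area = (ys.max() - ys.min() + 1) * (xs.max() - xs.min() + 1)
    return component.sum() / bbox_area

def is_printed(segment, gallery, match_score, T=0.8):
    """Label a segment as printed only if some gallery (machine-printed)
    character matches it with a score of at least T; otherwise handwritten.
    match_score and T are illustrative placeholders."""
    return any(match_score(segment, g) >= T for g in gallery)

Under this scheme, a handwritten segment that merely resembles print still scores below T against every gallery character, whereas a single scalar such as rectangularity cannot separate the two classes once their value ranges overlap.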
“…The features consist of geometrical features, statistical moments, and contour histograms. In [6], eleven features are extracted from the regions within the bounding boxes, mainly based on the ratio of statistical and geometrical measurements of the segmented words to the geometrical size (width or height) of their bounding boxes.…”
Section: Literature Review
confidence: 99%
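The normalisation idea attributed to [6], ratios of word measurements to the bounding-box width or height, can be illustrated as below; the three ratios are examples chosen for this sketch, not the actual eleven features.

import numpy as np

def ratio_features(word_img):
    """Example scale-normalised features for a binary word image;
    these are illustrative ratios, not the eleven features of [6]."""
    h, w = word_img.shape                # bounding-box height and width
    ys, xs = np.nonzero(word_img)        # ink pixel coordinates
    return {
        "ink_height_ratio": (ys.max() - ys.min() + 1) / h,
        "centroid_x_ratio": xs.mean() / w,
        "ink_area_ratio": len(xs) / (h * w),
    }

Dividing by the bounding-box dimensions makes the features independent of word size, so words scanned at different scales or resolutions remain comparable.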
“…These methods primarily use the horizontal projection profile, which is obtained by summing pixel values along the horizontal axis. Text lines are then located by finding the profile's local maxima and minima [5], [6]. Each local maximum corresponds to a text line, while each local minimum indicates interline spacing.…”
Section: Introduction
confidence: 99%
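A minimal sketch of this projection-profile idea, assuming a binary page image (1 = ink) that begins and ends with background rows; the smoothing window and gap threshold are assumptions, not values taken from [5] or [6].

import numpy as np

def projection_profile(page_img, win=5):
    """Row sums of a binary page image, lightly smoothed."""
    profile = page_img.sum(axis=1)                    # sum along each row
    kernel = np.ones(win) / win
    return np.convolve(profile, kernel, mode="same")

def line_boundaries(profile, min_gap=0.05):
    """Treat rows where the smoothed profile falls below a fraction of its
    peak as interline spacing (local minima); the runs in between are lines."""
    is_gap = profile < min_gap * profile.max()
    starts = np.flatnonzero(is_gap[:-1] & ~is_gap[1:]) + 1   # gap -> text
    ends = np.flatnonzero(~is_gap[:-1] & is_gap[1:]) + 1     # text -> gap
    return list(zip(starts, ends))

Each (start, end) pair brackets one local maximum of the profile, i.e. one text line; skewed pages need deskewing first, since skew flattens the profile's peaks.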