2011 International Conference on Document Analysis and Recognition 2011
DOI: 10.1109/icdar.2011.60

A Semi-supervised Ensemble Learning Approach for Character Labeling with Minimal Human Effort

Abstract: One of the major issues in handwritten character recognition is the efficient creation of ground truth to train and test the different recognizers. The manual labeling of the data by a human expert is a tedious and costly procedure. In this paper we propose an efficient and low-cost semiautomatic labeling system for character datasets. First, the data is represented at different abstraction levels and then clustered in an unsupervised manner. The different clusters are labeled by the human experts and f…
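
The cluster-then-label idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract is truncated, so the ensemble step across several representations is left out and only a single feature space is shown. The choice of k-means with a farthest-point initialization, the function names (sq_dist, kmeans, label_by_cluster) and the ask_expert callback are all assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain k-means; returns (centroids, cluster index per point).

    Assumption: deterministic farthest-point initialization, chosen only
    to keep the sketch reproducible.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign


def label_by_cluster(points, k, ask_expert):
    """Query the (hypothetical) expert once per cluster, then propagate.

    ask_expert(i) returns the label of sample i; it stands in for the
    human annotator mentioned in the abstract.
    """
    centroids, assign = kmeans(points, k)
    labels = [None] * len(points)
    queries = 0
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if not members:
            continue
        # Show the expert the member closest to the cluster centre.
        rep = min(members, key=lambda i: sq_dist(points[i], centroids[c]))
        label = ask_expert(rep)
        queries += 1
        for i in members:
            labels[i] = label
    return labels, queries


# Toy data: two well-separated groups standing in for two character classes.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
truth = ["a", "a", "a", "b", "b", "b"]
labels, queries = label_by_cluster(points, 2, lambda i: truth[i])
```

On this toy data the expert is asked about two samples instead of six. On real character data the saving depends on how pure the clusters are, which this sketch does not measure.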

Cited by 26 publications (19 citation statements)
References 9 publications
“…For this experiment, we used the same dataset as described in [9], [10] but with a different composition. We did not work on a class-wise manner but rather document-wise.…”
Section: A Dataset (mentioning)
confidence: 99%
“…In our previous work, particular research on the Lampung handwritten character recognition has been addressed for semiautomatic labeling [9] and recognition [10]. In the first work, we manually assigned labels to only 0.5% of the training data, the rest of the labels were inferred automatically by the proposed method.…”
Section: Related Work (mentioning)
confidence: 99%
“…Our work focus therefore on grouping the handwritten scripts into several clusters, and then labeling them manually. A similar offline handwriting annotation system Vajda et al (2011) proposes the idea to label a large number of isolated characters; clustering them into several clusters of characters, and labeling the clusters in order to reduce the human effort. This work shows that over 80% symbol labeling workload have been saved.…”
Section: Reducing Annotation Workload (mentioning)
confidence: 99%
“…Our preliminary work (Vajda et al, 2011; Richarz et al, 2014), proposed an analogous scheme, but using much less feature spaces, and an unsupervised clustering mechanism, which relied only on k-means. In this paper, we extended the number of feature spaces considered for unsupervised clustering, and the clustering methods.…”
Section: Related Work (mentioning)
confidence: 99%
“…In addition, they are evaluated at two levels: the clustering method performance, and the effect of this performance on the classification of the test data set using k-nn. Instead of limiting the input features to the pixel values of the raw images in gray level (Vajda et al, 2011), more sophisticated and lower dimensionality features such as profiles, local binary patterns (Pietikäinen et al, 2011), and Radon transform (Miciak, 2010; Cecotti and Vajda, 2013) were considered to better exploit the advantage of the original method (Vajda et al, 2011). Currently, each image is projected in five different feature spaces.…”
Section: Introduction (mentioning)
confidence: 99%