2019
DOI: 10.1016/j.ins.2019.01.024

Semi-supervised feature learning for improving writer identification

Abstract: Data augmentation is usually used by supervised learning approaches for offline writer identification, but such approaches require extra training data and potentially lead to overfitting errors. In this study, a semi-supervised feature learning pipeline was proposed to improve the performance of writer identification by training with extra unlabeled data and the original labeled data simultaneously. Specifically, we proposed a weighted label smoothing regularization (WLSR) method for data augmentation, which a…
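As a rough illustration of the idea in the truncated abstract (regularizing training with smoothed targets on extra unlabeled data), here is a minimal PyTorch sketch. The function name `wlsr_loss`, the `epsilon` value, and the uniform target for unlabeled rows are illustrative assumptions; the paper's actual WLSR weighting is not visible in the excerpt.

```python
import torch
import torch.nn.functional as F

def wlsr_loss(logits, targets, is_unlabeled, epsilon=0.1):
    """Cross-entropy with smoothed targets; unlabeled rows are pushed
    toward the uniform distribution (a hypothetical reading of WLSR).

    logits:       (N, C) model outputs
    targets:      (N,) class indices (ignored for unlabeled rows)
    is_unlabeled: (N,) boolean mask marking the extra unlabeled data
    """
    c = logits.size(1)
    log_probs = F.log_softmax(logits, dim=1)

    # Labeled rows: standard label smoothing around the one-hot target.
    soft = torch.full_like(log_probs, epsilon / (c - 1))
    soft.scatter_(1, targets.clamp(min=0).unsqueeze(1), 1.0 - epsilon)

    # Unlabeled rows (assumption): a fully smoothed uniform target,
    # which penalizes over-confident predictions on the extra data.
    soft[is_unlabeled] = 1.0 / c

    return -(soft * log_probs).sum(dim=1).mean()
```

A call such as `wlsr_loss(model(x), y, mask)` would treat rows where `mask` is true as unlabeled; those rows can carry a dummy label such as -1.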

Cited by 43 publications (23 citation statements)
References: 47 publications
“…CNNs can learn deep, abstract, high-level features from a small amount of text, such as word images or text blocks containing several characters. Therefore, most methods extract deep local features from character images and their sub-regions [10] or from image patches [11]. These local features are aggregated to compute a global feature for each handwritten page for writer identification [10], [11].…”
Section: Introduction
confidence: 99%
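The patch-then-aggregate pipeline described above can be sketched as follows. The ResNet-18 backbone, 64-pixel patches, and mean pooling are stand-in assumptions for illustration, not the exact choices of the cited methods [10], [11].

```python
import torch
from torchvision import models

# Stand-in local feature extractor: ResNet-18 without its classifier head.
backbone = models.resnet18(weights=None)
backbone.fc = torch.nn.Identity()
backbone.eval()

def page_descriptor(page, patch=64, stride=64):
    """Split a handwritten page into patches, extract a deep local
    feature per patch, and mean-pool them into one global page feature.

    page: (1, H, W) grayscale tensor
    """
    patches = page.unfold(1, patch, stride).unfold(2, patch, stride)
    patches = patches.reshape(1, -1, patch, patch).transpose(0, 1)
    patches = patches.repeat(1, 3, 1, 1)      # grayscale -> 3 channels
    with torch.no_grad():
        local_feats = backbone(patches)       # (n_patches, 512)
    return local_feats.mean(dim=0)            # simple aggregation
```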
“…It uses a small amount of labeled process data to train the initial identification model, and then improves identification performance using a large amount of unlabeled process data. Semi-supervised learning has been applied in various fields, such as writer identification [10], sentiment classification [11], medical image analysis [12], and traffic flow [13]. Active learning is a sampling strategy that selects high-entropy unlabeled data.…”
Section: Background and Significance
confidence: 99%
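The entropy-based selection step mentioned at the end of this excerpt can be sketched in a few lines; the function below is a generic illustration, not code from the cited works.

```python
import numpy as np

def select_high_entropy(probs, k):
    """Pick the k unlabeled samples the model is least certain about.

    probs: (N, C) predicted class probabilities for the unlabeled pool
    k:     number of samples to send for labeling
    """
    eps = 1e-12                                    # avoid log(0)
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    return np.argsort(entropy)[-k:]                # most uncertain indices
```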
“…The process includes 22 continuous measured variables (Table 2) and 20 preset fault modes (Table 3). [Table 2 lists the CMVs, including: (3) E feed, (4) A and C feed, (5) recycle flow, (6) reactor feed, (7) reactor pressure, (8) reactor level, (9) reactor temperature, (10) purge flow, (11) separator temperature, (14) separator underflow, (15) stripper level, (16) stripper pressure, (17) stripper underflow, (18) stripper temperature, (19) stripper steam flow, (20) compressor work, (21) reactor cooling water outlet temperature, and (22) condenser cooling water outlet temperature.] Figure 4 shows that the cumulative variance contribution of the first 12 principal components reaches 83.15% (more than 80%), so the first 12 principal components can reflect the information of all variables. Figure 5 shows the eigenvalues of the first 12 principal components as well.…”
Section: Process Description
confidence: 99%
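The retained-components criterion in this excerpt (keep the smallest number of principal components whose cumulative variance contribution exceeds 80%) can be reproduced with scikit-learn; the random matrix below is placeholder data standing in for the 22 measured variables.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# X: process measurements, one row per sample, one column per CMV.
X = np.random.randn(500, 22)                   # placeholder data

pca = PCA().fit(StandardScaler().fit_transform(X))
cumvar = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative variance passes 80%;
# the cited study retains 12 components (83.15%) by this criterion.
n_keep = int(np.searchsorted(cumvar, 0.80)) + 1
print(n_keep, cumvar[n_keep - 1])
```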
“…These local feature descriptors are subsequently aggregated to form a global embedding and then used for comparison. A semi-supervised learning scheme is suggested by Chen et al. [25], which makes use of additional unlabeled data. In comparison, Christlein et al. [3] proposed an unsupervised learning scheme to compute deep activation features that are then encoded using VLAD [26].…”
Section: B. Historical Document Image Classification
confidence: 99%
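The VLAD encoding mentioned for Christlein et al. [3] aggregates local descriptors as residuals against a learned codebook. A minimal NumPy/scikit-learn sketch follows, with the codebook size and normalization choices assumed for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Fit the codebook once on descriptors pooled over training documents:
# codebook = KMeans(n_clusters=64).fit(training_descs)

def vlad(local_descs, kmeans):
    """VLAD: accumulate residuals of local descriptors to their nearest
    codebook center, then flatten and normalize.

    local_descs: (N, D) deep activation features from one document
    kmeans:      fitted sklearn KMeans (the codebook)
    """
    centers = kmeans.cluster_centers_          # (K, D)
    assign = kmeans.predict(local_descs)       # nearest center per descriptor
    K, D = centers.shape
    v = np.zeros((K, D))
    for k in range(K):
        members = local_descs[assign == k]
        if len(members):
            v[k] = (members - centers[k]).sum(axis=0)
    v = v.reshape(-1)
    v = np.sign(v) * np.sqrt(np.abs(v))        # power normalization
    return v / (np.linalg.norm(v) + 1e-12)     # L2 normalization
```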