2019 International Conference on Document Analysis and Recognition (ICDAR)
DOI: 10.1109/icdar.2019.00190

Improving Text Recognition using Optical and Language Model Writer Adaptation

Abstract: State-of-the-art methods for handwritten text recognition are based on deep learning approaches and language modeling, which require large data sets during training. In practice, some applications process mono-writer documents and would therefore benefit from being trained on examples from that writer. However, it is not common to have numerous examples coming from just one writer. In this paper, we propose an approach to adapt both the optical model and the language model to a particular writer.

Cited by 14 publications (11 citation statements)
References 27 publications (37 reference statements)
“…In the HTR problem with a reduced training set, TL was applied by Soullard et al. in [7]. The main idea behind TL is initializing the parameters of a model with those learned beforehand on a huge dataset, denoted as the source.…”
Section: B. Transfer Learning
confidence: 99%
“…Hence, with TL we start from a model learned on a different task, avoiding learning the whole set of parameters from scratch, which prevents overfitting and favors convergence. In [7], they proposed a method that applies TL to both the optical and the language model. In this and other similar previous proposals on TL, the authors applied data augmentation (DA) in both the training and test steps.…”
Section: B. Transfer Learning
confidence: 99%
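To make the transfer-learning recipe these statements describe concrete, here is a minimal PyTorch sketch: initialize a model from weights learned on a large source dataset, then fine-tune it on the small writer-specific target set. The tiny model, checkpoint handling, and random tensors are illustrative assumptions, not the actual code of [7].

```python
# Minimal sketch of transfer learning for a small single-writer set:
# (1) initialize from source-dataset weights, (2) fine-tune on the target.
import torch
import torch.nn as nn

class TinyOpticalModel(nn.Module):
    """Stand-in for a CRNN-style optical model (illustrative only)."""
    def __init__(self, num_classes=80):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU())
        self.head = nn.Linear(8, num_classes)

    def forward(self, x):
        h = self.features(x).mean(dim=(2, 3))  # global average pooling
        return self.head(h)

model = TinyOpticalModel()

# 1) Initialize from the source model. In practice this would be
#    model.load_state_dict(torch.load("source_checkpoint.pt"));
#    here we reuse fresh weights so the sketch runs standalone.
source_state = model.state_dict()
model.load_state_dict(source_state)

# 2) Fine-tune on the small writer-specific target set.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
target_images = torch.randn(4, 1, 32, 128)   # placeholder target batch
target_labels = torch.randint(0, 80, (4,))   # placeholder labels

for _ in range(3):  # a few adaptation epochs
    optimizer.zero_grad()
    loss = criterion(model(target_images), target_labels)
    loss.backward()
    optimizer.step()
```

A low learning rate during fine-tuning is the usual design choice here: it adapts the source knowledge to the new writer instead of overwriting it, which is what prevents overfitting on the few target examples.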
“…For big historical datasets, in [5], the authors demonstrated the benefits of carefully designed data augmentation. Another strategy is to apply transfer learning [12,13,25,15,1], i.e., pretraining the HTR model on a big HTR dataset and fine-tuning it on the small training set of the dataset of interest. For HTR on small single-writer historical datasets, pretraining plus fine-tuning has been proven to be a more effective strategy than data augmentation [1].…”
Section: Related Work
confidence: 99%
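As a counterpart to the pretraining-plus-fine-tuning sketch above, here is a brief example of the data-augmentation strategy this statement contrasts it with, using torchvision transforms. The specific distortions and parameter values are assumptions for illustration, not the carefully designed pipeline of [5].

```python
# Illustrative on-the-fly augmentation for handwritten line images:
# mild geometric and photometric distortions that mimic natural
# handwriting variation without changing the transcription.
import torchvision.transforms as T

augment = T.Compose([
    T.RandomAffine(degrees=2, translate=(0.02, 0.02), shear=5,
                   fill=255),                    # slant and position jitter
    T.RandomPerspective(distortion_scale=0.1, p=0.5, fill=255),
    T.ColorJitter(brightness=0.2, contrast=0.2),  # ink/paper variation
    T.ToTensor(),
])

# Applied to each PIL line image at training time, e.g.:
# tensor = augment(pil_line_image)
```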
“…In the HTR problem with a reduced training set, TL was applied by Soullard et al. in [7]. The main idea behind TL is initializing the parameters of a model with those learned on a huge dataset, denoted as the source.…”
Section: Related Work
confidence: 99%
“…Also, classification and indexing of the transcribed text can be easily automated. Handwritten text recognition (HTR) tasks on historical datasets have been addressed by many authors in the last few years [1][2][3][4][5][6][7][8][9][10]. In HTR, transcribing each author can be considered a different task, since the distribution of both the model input and output varies from writer to writer.…”
Section: Introduction
confidence: 99%