2018 13th IAPR International Workshop on Document Analysis Systems (DAS)
DOI: 10.1109/das.2018.9

Encoding CNN Activations for Writer Recognition

Abstract: The encoding of local features is an essential part of writer identification and writer retrieval. While CNN activations have already been used as local features in related work, the encoding of these features has attracted little attention so far. In this work, we compare the established VLAD encoding with triangulation embedding. We further investigate generalized max pooling as an alternative to sum pooling, as well as the impact of decorrelation and Exemplar SVMs. With these techniques, we set new standards on t…
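The abstract refers to VLAD encoding of local CNN activation features. As a rough illustration only (not the authors' implementation), the following minimal Python sketch aggregates local descriptors into a single VLAD vector using a k-means vocabulary; the descriptor dimensionality, vocabulary size, and normalization scheme are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def vlad_encode(descriptors, kmeans):
    """Aggregate local descriptors (N x D) into one VLAD vector (K*D,).

    For each descriptor, accumulate its residual to the nearest cluster
    center, then apply power- and L2-normalization (common practice,
    not necessarily the exact variant used in the paper).
    """
    centers = kmeans.cluster_centers_          # (K, D)
    assignments = kmeans.predict(descriptors)  # (N,)
    K, D = centers.shape
    v = np.zeros((K, D), dtype=np.float64)
    for k in range(K):
        mask = assignments == k
        if np.any(mask):
            v[k] = (descriptors[mask] - centers[k]).sum(axis=0)
    v = v.ravel()
    v = np.sign(v) * np.sqrt(np.abs(v))        # power normalization
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

# Usage sketch: fit a small vocabulary on training descriptors,
# then encode one document's local CNN activations (random stand-ins here).
rng = np.random.default_rng(0)
train_desc = rng.standard_normal((5000, 64))
doc_desc = rng.standard_normal((300, 64))
kmeans = KMeans(n_clusters=100, n_init=4, random_state=0).fit(train_desc)
global_descriptor = vlad_encode(doc_desc, kmeans)  # shape (100*64,)
```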

Cited by 44 publications (46 citation statements)
References 31 publications
“…Likewise, the ‘Nuremberg’ system is based on exploiting CNN activations to characterise the writer and primarily builds on the technique reported in Ref. [54]. Our proposed feature combination outperforms both of these techniques, indicating that ConvNets may not be able to learn robust feature representations from the relatively limited amount of training data per class that is available in most practical writer identification applications.…”
Section: Results (mentioning; confidence: 99%)
“…They computed LeNet- and ResNet-based features and reported Top-1 accuracies of 99.5% on CVL and 99.6% on ICDAR13 and KHATT. In [18], LeNet and ResNet CNN models were used to learn activation features, which were then employed as local features and encoded with VLAD.…”
Section: Related Work (mentioning; confidence: 99%)
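To make the idea of CNN activations as local features concrete, here is a hedged sketch that extracts one activation vector per image patch with a small toy network; the architecture, patch size, and patch sampling are placeholders, not the LeNet/ResNet models trained in [18].

```python
import torch
import torch.nn as nn

class PatchNet(nn.Module):
    """Toy stand-in for a small patch CNN whose penultimate activations
    serve as local descriptors (the cited work trains real models on
    writer/script data)."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.embed = nn.Linear(32 * 4 * 4, feat_dim)

    def forward(self, x):                  # x: (B, 1, 28, 28) patches
        h = self.features(x).flatten(1)
        return self.embed(h)               # (B, feat_dim) local descriptors

net = PatchNet().eval()
patches = torch.rand(300, 1, 28, 28)       # patches sampled from a page
with torch.no_grad():
    local_descriptors = net(patches)       # one activation vector per patch
```

These per-patch vectors are exactly the kind of local descriptors that would then be fed to an encoding such as VLAD (sketched above).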
“…Manual features are language dependent. Features learned automatically by deep neural networks have outperformed handcrafted features [18]–[21]. Such model-based features are extracted by deep learning models directly from raw image data.…”
Section: Introduction (mentioning; confidence: 99%)
“…In comparison, Christlein et al. [3] proposed to use an unsupervised learning scheme to compute deep activation features that are eventually encoded using VLAD [26]. In subsequent work [27], they show that generalized max pooling (GMP) consistently improves the encoding. He et al. [28] employ auxiliary tasks to improve writer identification on single-word images.…”
Section: Historical Document Image Classification (mentioning; confidence: 99%)
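For reference, generalized max pooling (GMP) replaces plain summation of local descriptors with a pooled vector whose dot product with every descriptor is approximately constant, which balances the influence of frequent and rare descriptors. The sketch below follows the standard ridge-regression formulation of Murray and Perronnin; the regularization value and dimensions are illustrative and not taken from [27].

```python
import numpy as np

def generalized_max_pooling(X, lam=1.0):
    """Generalized max pooling of local descriptors X (N x D).

    Finds a single vector phi whose dot product with every descriptor
    is close to 1, solved as ridge regression:
        phi = (X^T X + lam * I)^{-1} X^T 1
    """
    N, D = X.shape
    A = X.T @ X + lam * np.eye(D)
    b = X.T @ np.ones(N)
    return np.linalg.solve(A, b)

# Usage sketch: pool a set of local descriptors with GMP instead of
# summing them (lam is a hyperparameter; the value here is illustrative).
rng = np.random.default_rng(0)
descriptors = rng.standard_normal((300, 64))
pooled = generalized_max_pooling(descriptors, lam=1.0)
```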