Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks

Das, Apurba; Roy, Saikat; Bhattacharya, Ujjwal; Parui, Susanta Kumar

doi:10.1109/icpr.2018.8545630

Cited by 76 publications

(43 citation statements)

References 34 publications


“…Multiple networks were trained on specific sections of the documents [21] to learn region-based high dimensional features later compressed via Principal Component Analysis (PCA). The use of multiple Deep Learning models was also exploited by Das et al by using an ensemble as a meta-classifier [16]. A VGG-16 [41] stack of networks using 5 different classifiers has been proposed, one of them trained on the full document and the others specifically over the header, footer, left body and right body.…”

Section: Related Work (mentioning)

confidence: 99%


Improving Accuracy and Speeding Up Document Image Classification Through Parallel Systems

Ferrando, Domínguez, Torres, et al. 2020

Lecture Notes in Computer Science


This paper presents a study showing the benefits of the EfficientNet models compared with heavier Convolutional Neural Networks (CNNs) in the Document Classification task, an essential problem in the digitalization process of institutions. We show in the RVL-CDIP dataset that we can improve previous results with a much lighter model and present its transfer learning capabilities on a smaller in-domain dataset such as Tobacco3482. Moreover, we present an ensemble pipeline which is able to boost solely image input by combining image model predictions with the ones generated by BERT model on extracted text by OCR. We also show that the batch size can be effectively increased without hindering its accuracy so that the training process can be sped up by parallelizing throughout multiple GPUs, decreasing the computational time needed. Lastly, we expose the training performance differences between PyTorch and TensorFlow Deep Learning frameworks.
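The region-based stacking described in the first citation statement (one classifier on the full document plus one each on the header, footer, left body and right body, combined by a meta-classifier) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from Das et al.: the crop boundaries (top third, bottom third, halves of the middle band), the 16-class output size, and `toy_base_model`, a hypothetical stand-in for a trained region network.

```python
import numpy as np

REGIONS = ("holistic", "header", "footer", "left_body", "right_body")

def split_regions(image):
    """Return the five region crops. The boundaries are illustrative assumptions."""
    h, w = image.shape[:2]
    top, bottom = h // 3, h - h // 3
    return {
        "holistic": image,
        "header": image[:top],
        "footer": image[bottom:],
        "left_body": image[top:bottom, : w // 2],
        "right_body": image[top:bottom, w // 2 :],
    }

def toy_base_model(crop, num_classes):
    """Hypothetical stand-in for a trained region CNN: returns a softmax vector."""
    logits = np.linspace(0.0, 1.0, num_classes) * crop.mean()
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def stack_predictions(region_probs):
    """Concatenate per-region class probabilities into one meta-feature vector."""
    return np.concatenate([region_probs[name] for name in REGIONS], axis=-1)

# A random array stands in for a grayscale document image.
image = np.random.default_rng(0).random((300, 200))
crops = split_regions(image)
probs = {name: toy_base_model(crop, 16) for name, crop in crops.items()}
meta_features = stack_predictions(probs)  # input to a separate meta-classifier
```

In stacked generalization the meta-classifier is trained on vectors like `meta_features`, computed on data held out from the base models' training, so that it learns how much to trust each region rather than memorizing the base models' training fit.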



“…Then, transfer learning was demonstrated to work effectively [1,21] by using a network pre-trained on ImageNet [17]. And latest models have become increasingly heavier (greater number of parameters) [2,16,46] as shown in Table 1, with the speed and computational resources drawback this entails.…”

Section: Introduction (mentioning)

confidence: 99%

Improving Accuracy and Speeding Up Document Image Classification Through Parallel Systems

Ferrando, Domínguez, Torres, et al. 2020

Lecture Notes in Computer Science


“…Deep Learning models and applications have been used in tasks such as image classification, [21][22][23] document analysis and text recognition, [24][25][26] natural language processing, [27][28][29] and video analysis [30][31][32] in industries ranging from automated driving to medical devices as shown in Figure 3. In References 35-37, the authors investigated the use of visual information to detect and interpret road signs using hierarchical classifier structures that combine Support Vector Machines (SVM) for image verification and Convolutional Neural Networks (CNN) for final recognition.…”

Section: Applications Of Deep Learning (mentioning)

confidence: 99%

Deep learning support for intelligent transportation systems

Ibáñez, Contreras-Castillo, Zeadally 2020

Trans Emerging Tel Tech


Intelligent Transportation Systems (ITS) help improve the ever-increasing vehicular flow and traffic efficiency in urban traffic to reduce the number of accidents. The massive amounts of data generated by all the digital devices connected to the transportation network enable the creation of datasets to perform an in-depth analysis of the data using deep learning methods. Such methods can help with predicting traffic performance, automating traffic light management, detecting lanes, and identifying objects near vehicles to increase the safety and efficiency of ITS. We discuss some of the challenges that need to be solved to achieve seamless integration between ITS and deep learning methods to address issues such as (1) improving traffic flow/transportation logistics, (2) predicting best routes for the transportation of goods, (3) optimal fuel consumption, (4) intelligent environmental conditions perception, (5) traffic speed management, and accident prevention.


“…Each layerwise style loss is multiplied by the predefined loss coefficient; if the coefficient is different from 0, we refer to the corresponding layer as an active layer: There are in total five blocks, the first two blocks have two Conv layers, each followed by ReLU and MaxPool layers, the last three have three Conv layers, each followed by ReLU and MaxPool layers. Image taken from (Das et al, 2018).…”

Section: Network Of Steel (mentioning)

confidence: 99%

Network of Steel: Neural Font Style Transfer from Heavy Metal to Corporate Logos

Ter-Sarkisov 2020

Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods


We introduce a method for transferring style from the logos of heavy metal bands onto corporate logos using a VGG16 network. We establish the contribution of different layers and loss coefficients to the learning of style, minimization of artefacts and maintenance of readability of corporate logos. We find layers and loss coefficients that produce a good tradeoff between heavy metal style and corporate logo readability. This is the first step both towards sparse font style transfer and corporate logo decoration using generative networks. Heavy metal and corporate logos are very different artistically, in the way they emphasize emotions and readability, therefore training a model to fuse the two is an interesting problem.


Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks

Abstract: In this article, a region-based Deep Convolutional
