A Fast Fully Octave Convolutional Neural Network for Document Image Segmentation

Neves, Ricardo Batista das; Vercosa, Luiz Felipe; Macêdo, David; Bezerra, Byron Leite Dantas; Zanchettin, Cleber

doi:10.1109/ijcnn48605.2020.9206711

Cited by 15 publications

(16 citation statements)

References 12 publications


“…The main drawback of this method is the small dataset used, without enough variability, compared to a real life operational scenario. In [33], a method based on UNet was proposed to detect document edges and text regions in Brazilian ID Card images, with a Fully Octave Convolutional Neural Network, which replaces the Convolutional Layers by Octave Convolutional Layers, reducing the redundancy of feature maps and obtaining a lighter model. In the datasets developed, the first one is named CDPhotoDataset with 20,000 images, obtaining an IoU of 0.9916; the second one is named DTDDataset, with 800 real Brazilian documents and after data augmentation a total of 10,000 images, obtaining an IoU of 0.9599.…”

Section: Related Work (mentioning)

confidence: 99%
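The statement above says that swapping convolutional layers for octave convolutional layers yields a lighter model. A minimal sketch of where the saving comes from, assuming the standard octave convolution formulation: a fraction `alpha` of the channels is held at half spatial resolution, so three of the four convolution paths run on a quarter of the pixels. The function names, the layer size, and `alpha = 0.5` are illustrative assumptions, not values taken from the cited paper, and the cost of pooling and upsampling is ignored.

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates of a plain k x k convolution on an h x w map."""
    return h * w * c_in * c_out * k * k


def octave_conv_macs(h, w, c_in, c_out, k, alpha):
    """Multiply-accumulates of an octave convolution (hypothetical helper).

    alpha is the fraction of channels stored at half resolution
    (the low-frequency branch). Pooling and upsampling are not counted.
    """
    c_in_low = int(alpha * c_in)
    c_out_low = int(alpha * c_out)
    c_in_high = c_in - c_in_low
    c_out_high = c_out - c_out_low
    h2, w2 = h // 2, w // 2

    high_to_high = conv_macs(h, w, c_in_high, c_out_high, k)
    # The remaining three paths are all evaluated at half resolution.
    high_to_low = conv_macs(h2, w2, c_in_high, c_out_low, k)
    low_to_high = conv_macs(h2, w2, c_in_low, c_out_high, k)
    low_to_low = conv_macs(h2, w2, c_in_low, c_out_low, k)
    return high_to_high + high_to_low + low_to_high + low_to_low


# Illustrative layer: 64 x 64 map, 64 -> 64 channels, 3 x 3 kernel.
plain = conv_macs(64, 64, 64, 64, 3)
octave = octave_conv_macs(64, 64, 64, 64, 3, alpha=0.5)
ratio = octave / plain
print(f"octave / plain cost: {ratio:.4f}")
```

Under these assumptions the weight count is unchanged; the reduction is in arithmetic and in feature-map memory, which is one reading of "reducing the redundancy of feature maps".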

Towards an Efficient Semantic Segmentation Method of ID Cards for Verification Systems

Lara, Valenzuela, Schulz et al. 2021

Preprint


Removing the background in ID Card images is a real challenge for remote verification systems because many of the re-digitalised images present cluttered backgrounds, poor illumination conditions, distortion and occlusions. The background in ID Card images confuses the classifiers and the text extraction. Due to the lack of available images for research, this field represents an open problem in computer vision today. This work proposes a method for removing the background using semantic segmentation of ID Cards. In the end, images captured in the wild from real operation, using a manually labelled dataset consisting of 45,007 images, with five types of ID Cards from three countries (Chile, Argentina and Mexico), including typical presentation attack scenarios, were used. This method can help to improve the following stages in a regular identity verification or document tampering detection system. Two Deep Learning approaches were explored, based on MobileUNet and DenseNet10. The best results were obtained using MobileUNet, with 6.5 million parameters. A Chilean ID Card's mean Intersection Over Union (IoU) was 0.9926 on a private test dataset of 4,988 images. The best results for the fused multi-country dataset of ID Card images from Chile, Argentina and Mexico reached an IoU of 0.9911. The proposed methods are lightweight enough to be used in real-time operation on mobile devices.
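The abstract above reports results as mean Intersection over Union (IoU). A minimal sketch of how that metric is commonly computed for binary segmentation masks, assuming masks are nested lists of 0/1 values of equal size; the function names and the convention of scoring two empty masks as 1.0 are assumptions made here, not details from the cited work.

```python
def iou(pred, target):
    """Intersection over Union of two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over an iterable of (pred, target) mask pairs."""
    scores = [iou(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = iou(pred, target)  # 2 shared pixels out of 4 covered pixels
```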


“…1st Challenge - Document Boundary Segmentation: The objective of this challenge is to develop boundary detection algorithms for different kinds of documents [21]. The entrants should develop an algorithm that takes as input an image containing a document, and return a new image of the same size with the background in black pixels and the region occupied by the document in white pixels.…”

Section: Challenge Tasks (mentioning)

confidence: 99%

“…2nd Challenge - Zone Text Segmentation: This challenge encourages the development of algorithms for automatic text detection in ID documents [21]. The entrants have to develop an algorithm capable of detecting text patterns in the provided set of images; that is, to process an image of a document (without background), and return a new image of the same size with non-interest regions in black pixels and regions of interest (text regions) in white pixels.…”

Section: Challenge Tasks (mentioning)

confidence: 99%
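Both challenge statements above describe the same output contract: an image of the input's size with regions of interest in white and everything else in black. A minimal sketch of that final step, assuming a segmentation model has already produced a per-pixel score in [0, 1]; the helper name, the 0.5 threshold, and the 0/255 encoding are illustrative assumptions rather than competition rules.

```python
def to_binary_mask(scores, threshold=0.5):
    """Turn per-pixel scores into a black/white mask of the same size.

    scores: nested list of floats in [0, 1], one per pixel (assumed input).
    Returns 255 (white) where score >= threshold, else 0 (black).
    """
    return [[255 if s >= threshold else 0 for s in row] for row in scores]


scores = [[0.9, 0.2],
          [0.5, 0.1]]
mask = to_binary_mask(scores)
```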

ICDAR 2021 Competition on Components Segmentation Task of Document Photos

Lopes, Neves, Bezerra et al. 2021

Lecture Notes in Computer Science


This paper describes the short-term competition on "Components Segmentation Task of Document Photos" that was prepared in the context of the "16th International Conference on Document Analysis and Recognition" (ICDAR 2021). This competition aims to bring together researchers working in the field of identification document image processing and provide them with a suitable benchmark to compare their techniques on the component segmentation task of document images. Three challenge tasks were proposed entailing different segmentation assignments to be performed on a provided dataset. The collected data are from several types of Brazilian ID documents, whose personal information was conveniently replaced. There were 16 participants whose results obtained for some or all the three tasks show different rates for the adopted metrics, like "Dice Similarity Coefficient" ranging from 0.06 to 0.99. Different Deep Learning models were applied by the entrants with diverse strategies to achieve the best results in each of the tasks. Obtained results show that the currently applied methods for solving one of the proposed tasks (document boundary detection) are already well established. However, for the other two challenge tasks (text zone and handwritten sign detection) research and development of more robust approaches are still required to achieve acceptable results.
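The abstract above names the Dice Similarity Coefficient as one of the adopted metrics. A minimal sketch of its usual definition for binary masks, 2|P ∩ T| / (|P| + |T|), assuming masks are nested lists of 0/1 values; the function name and the handling of two empty masks are assumptions made here, and the competition's own evaluation code may differ.

```python
def dice(pred, target):
    """Dice Similarity Coefficient of two binary masks (nested lists of 0/1)."""
    intersection = 0
    pred_size = 0
    target_size = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            pred_size += 1 if p else 0
            target_size += 1 if t else 0
            if p and t:
                intersection += 1
    total = pred_size + target_size
    # Assumed convention: two empty masks count as a perfect match.
    return 2 * intersection / total if total else 1.0


pred = [[1, 1, 0, 0]]
target = [[0, 1, 1, 0]]
score = dice(pred, target)  # one shared pixel, two pixels in each mask
```

For the same pair of masks, Dice and IoU are related by Dice = 2·IoU / (1 + IoU), so the two metrics rank results identically even though their values differ.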


“…The processing of images of identification documents has received much attention in the literature. Researchers have presented approaches for identification documents classification [4], automatic handwritten signature segmentation [5], document boundary detection and document text detection [3]. As shown in Figure 1, the proposed algorithm is divided into six main steps, which are detailed below:…”

Section: Related Work (mentioning)

confidence: 99%

“…Once the organizations have the images of identification documents of their customers, they can execute some algorithms for the automation of the text field extraction tasks [3], document classification [4], signature extraction [5], in addition to other properties and patterns present in the identification documents images.…”

Section: Introduction (mentioning)

confidence: 99%

BID Dataset: a challenge dataset for document processing tasks

Soares, Neves, Bezerra 2020

Anais Estendidos da Conference on Graphics, Patterns and Images (SIBGRAPI Estendido 2020)


The digital relationship between companies and customers happens through online systems where consumers must upload pictures of their identification documents to prove their identities. This large volume of document images encourages research into image processing systems that automate tasks usually performed by humans, such as Document Type Classification and Document Reading. The lack of public datasets of identification documents delays research in document image processing because researchers need to seek partnerships with private or governmental institutions to obtain the data or build their own dataset. In this context, the main contributions of this work are a system to support the automatic creation of public identification document datasets and the Brazilian Identity Document Dataset (BID Dataset): the first public dataset of Brazilian identification documents. To comply with current personal data privacy law, all information in the BID Dataset comes from fake data. This work aims to speed up research in identification document image processing, since researchers will be able to use the BID Dataset freely in their research.
