Deep Reader: Information Extraction from Document Images via Relation Extraction and Natural Language

Vishwanath, Dhanraj; Rahul, Rohit; Sehgal, Gunjan; Swati, -; Chowdhury, A. R.; Sharma, Monika; Vig, Lovekesh; Shroff, Gautam; Srinivasan, Ashwin

doi:10.1007/978-3-030-21074-8_15

Cited by 6 publications

(6 citation statements)

References 18 publications


“…Discriminative approaches for document image cleanup include a CNN-based approach for deblurring [20], a U-net [19] based approach replacing the skip connections between the encoder and decoder blocks with alternating convolutional and recurrent layers for efficient feature extraction [17], a two-stage CNN-based approach where the first stage is to classify the type of deblurring and the second stage to remove it [10], and conditional GANs (cGANs) [25,23], which is a supervised image-to-image translation approach [9]. DE-GAN [23], particularly, is recently proposed based on cGANs with a modified loss function with promising results on binarization, deblurring, and watermark removal in documents.…”

Section: Image Denoising In Documents (mentioning)

confidence: 99%

End-to-End Unsupervised Document Image Blind Denoising

Gangeh, Plata, Motahari et al. 2021

Preprint


Removing noise from scanned pages is a vital step before their submission to an optical character recognition (OCR) system. Most available image denoising methods are supervised, requiring pairs of noisy/clean pages. However, this assumption is rarely met in real settings. Besides, there is no single model that can remove various noise types from documents. Here, we propose, for the first time, a unified end-to-end unsupervised deep learning model that can effectively remove multiple types of noise, including salt & pepper noise, blurred and/or faded text, as well as watermarks from documents at various levels of intensity. We demonstrate that the proposed model significantly improves the quality of scanned images and the OCR of the pages on several test datasets.


“…Robust reading (https://rrc.cvc.uab.es/) is a common term under which several approaches are collected. A recent approach in this area is DeepReader (Vishwanath et al, 2018), which is a document understanding approach which seamlessly integrates low-level OCR with recognition of higher-level document structure and, to a certain extent, content. Document Visual Question Answering (Mathew et al, 2020), on the other hand, analyses scanned documents beyond mere OCR of text content, including manually applied highlighting, for answering questions about the documents' content.…”

Section: Related Work (mentioning)

confidence: 99%

Reconstructing Manual Information Extraction with DB-to-Document Backprojection: Experiments in the Life Science Domain

Müller, Ghosh, Rey et al. 2020

Proceedings of the First Workshop on Scholarly Document Processing


We introduce a novel scientific document processing task for making previously inaccessible information in printed paper documents available to automatic processing. We describe our data set of scanned documents and data records from the biological database SABIO-RK, provide a definition of the task, and report findings from preliminary experiments. Rigorous evaluation proved challenging due to lack of gold-standard data and a difficult notion of correctness. Qualitative inspection of results, however, showed the feasibility and usefulness of the task.


“…For instance, text in advertising banners may be interpreted as valuable information. For this reason, a simple OCR detection followed by natural language processing techniques is suboptimal for WIE (Vishwanath et al, 2018).…”

Section: Introduction (mentioning)

confidence: 99%

CoVA: Context-aware Visual Attention for Webpage Information Extraction

Kumar, Morabia, Wang et al. 2022

Proceedings of the Fifth Workshop on E-Commerce and NLP (ECNLP 5)


Webpage information extraction (WIE) is an important step to create knowledge bases. For this, classical WIE methods leverage the Document Object Model (DOM) tree of a website. However, use of the DOM tree poses significant challenges as context and appearance are encoded in an abstract manner. To address this challenge we propose to reformulate WIE as a context-aware Webpage Object Detection task. Specifically, we develop a Context-aware Visual Attention-based (CoVA) detection pipeline which combines appearance features with syntactical structure from the DOM tree. To study the approach we collect a new large-scale dataset of e-commerce websites for which we manually annotate every web element with four labels: product price, product title, product image and others. On this dataset we show that the proposed CoVA approach is a new challenging baseline which improves upon prior state-of-the-art methods.
