Multi-Layout Unstructured Invoice Documents Dataset: A Dataset for Template-Free Invoice Processing and Its Evaluation Using AI Approaches

Baviskar, Dipali; Ahirrao, Swati; Kotecha, Ketan

doi:10.1109/access.2021.3096739

Cited by 17 publications

(6 citation statements)

References 35 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Different from the standard Natural Language Processing (NLP) tasks for text recognition from regular documents or paragraphs, it is not feasible to rely on Computer Vision (CV) and NLP alone to recognize text from unstructured documents. In the text recognition from unstructured documents, using image segmentation technology to extract text ignores semantic information, whereas the text in tables with complex structures cannot be effectively recognized with the standard NLP methods (46). In the present study, we successfully developed a deep learning system by integrating a series of methods (including image processing, table detection, text detection text recognition, and text structurization) to structurize text from UPBMR images.…”

Section: Discussionmentioning

confidence: 99%

DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports

et al. 2022

View full text Add to dashboard Cite

Background Complete electronic health records (EHRs) are not often available, because information barriers are caused by differences in the level of informatization and the type of the EHR system. Therefore, we aimed to develop a deep learning system [deep learning system for structured recognition of text images from unstructured paper-based medical reports (DeepSSR)] for structured recognition of text images from unstructured paper-based medical reports (UPBMRs) to help physicians solve the data-sharing problem. Methods UPBMR images were firstly preprocessed through binarization, image correction, and image segmentation. Next, the table area was detected with a lightweight network (i.e., the proposed YOLOv3-MobileNet model). In addition, the text of the table area was detected and recognized with the model based on differentiable binarization (DB) and convolutional recurrent neural network (CRNN). Finally, the recognized text was structured according to its row and column coordinates. DeepSSR was trained and validated on our dataset with 4,221 UPBMR images which were randomly split into training, validation, and testing sets in a ratio of 8:1:1. Results DeepSSR achieved a high accuracy of 91.10% and a speed of 0.668 s per image. In the system, the proposed YOLOv3-MobileNet model for table detection achieved a precision of 97.8% and a speed of 0.006 s per image. Conclusions DeepSSR has high accuracy and fast speed in structured recognition of text based on UPBMR images. This system may help solve the data-sharing problem due to information barriers between hospitals with different EHR systems.

show abstract

Section: Discussionmentioning

confidence: 99%

DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports

et al. 2022

View full text Add to dashboard Cite

show abstract

“…It makes use of semi-supervised neural networks for scene text recognition which is further optimized. The model was tested against benchmark detection datasets [9] and gave promising accuracy scores, [10] details the difficulty and challenges in developing a custom dataset and finding high quality varied datasets to annotate and use for training and testing, hence it was made the most out of publicly available datasets. The paper also details their efforts into automating invoice processing by using feature extraction but does not consider the tables present in invoices and the items present in the tables which also need to be processed and extracted if there is a need.…”

Section: Related Workmentioning

confidence: 99%

“…However, the use of the YOLO based detection module gave unsatisfactory results when compared to OpenCV based method, with the YOLO algorithm failing to detect many text regions. Considering the information extraction task as an image segmentation problem loses the text semantics, which can further complicate the processing of unstructured documents [10]. The existence of a dataset in order to train the network for segmentation is another problem in the case of training networks for extracting relevant information from structured documents.…”

Section: Related Workmentioning

confidence: 99%

Automated invoice data extraction using image processing

Manjunath¹,

Nayak²,

Nishith³

et al. 2023

IJ-AI

View full text Add to dashboard Cite

Manually processing invoices which are in the form of scanned photocopies is a time-consuming process. There is a need to automate the task of extraction of data from the invoices with a similar format. In this paper we investigate and analyse various techniques of image processing and text extraction to improve the results of the optical character recognition (OCR) engine, which is applied to extract the text from the invoice. This paper also proposes the design and implementation of a web enabled invoice processing system (IPS). The IPS consists of an annotation tool and an extraction tool. The annotation tool is used to mark the fields of interest in the invoice which are to be extracted. The extraction tool makes use of opensource computer vision library (OpenCV) algorithms to detect text. The proposed system was tested on more than 25 types of invoices with the average accuracy score lying between 85% and 95%. Finally, to provide ease of use, a web application is developed which also presents the results in a structured format. The entire system is designed so as to provide flexibility and automate the process of extracting details of interest from the invoices.

show abstract

“…Text regions which include key information are identified using machine learning and feature extraction methods such as Word2Vec, Glove and FastText. Besides, bidirectional long-short term memory neural model also utilized to finding key fields of invoice [30]. In another study, Graph based convolutional models that effective and robust in handling complex documents layout are used to extract and process key field information of any layout without ambiguity.…”

Section: Literature Reviewmentioning

confidence: 99%

End to End Invoice Processing Application Based on Key Fields Extraction

Arslan

2022

IEEE Access

View full text Add to dashboard Cite

In this paper, an automatic invoice processing system, which is in great demand among private and public companies, was proposed. The proposed system supports all invoice file types that can be submitted by companies. Companies can easily submit invoices to the system via the web interface or email, and all invoices submitted to the system are queued and processed sequentially. If the invoice is a text file, the invoice information is extracted from the text by using template matching. If the invoice is an image, the text and table areas are detected and extracted. For table detection, we used both image processing based and YOLOv5-based deep learning method. Cell extraction was then performed from the extracted table images. As a result of these processes, all text and table cells were obtained as images and these images were converted into machine-readable text using the open-source software Tesseract OCR. Tesseract already provides trained models for English and Turkish. However, these models do not provide successful results for invoices submitted by companies in Turkish. Therefore, the new fine-tuned model trained with invoices in Turkish was used for OCR. The experimental results showed that the trained Turkish model was more accurate than the Turkish and English models provided by Tesseract. In addition, the YOLOv5-based table detection model was more accurate than the image-processing-based table detection method.

show abstract

Multi-Layout Unstructured Invoice Documents Dataset: A Dataset for Template-Free Invoice Processing and Its Evaluation Using AI Approaches

Cited by 17 publications

References 35 publications

DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports

DeepSSR: a deep learning system for structured recognition of text images from unstructured paper-based medical reports

Automated invoice data extraction using image processing

End to End Invoice Processing Application Based on Key Fields Extraction

Contact Info

Product

Resources

About