OCR Based Document Archiving and Indexing Using PyTesseract: A Record Management System for DSWD Caraga, Philippines

Jayoma, Jaymer M.; Moyon, Elbert S.; Morales, Edsel Matt O.

doi:10.1109/hnicem51456.2020.9400000

Cited by 10 publications

(3 citation statements)

References 4 publications

“…This preprocessing performs noise reduction, contrast enhancement, and resizing tasks to ensure optimal recognition accuracy. Following preprocessing, it uses the deep learning model, which identifies characters, words, the spatial position of the characters, and even complex layouts within the image [35]. Once the text is extracted from the image, postprocessing techniques are applied to enhance the accuracy of the recognized text, such as spell-checking and formatting correction.…”

Section: Optical Character Recognition (mentioning)

confidence: 99%

VisFormers—Combining Vision and Transformers for Enhanced Complex Document Classification

Dutta, Adhikary, Dwivedi

2024

MAKE

Complex documents contain text, figures, tables, and other elements. Classifying scanned copies of different categories of complex documents, such as memos, newspapers, and letters, is essential for rapid digitization. However, the task is challenging because most scanned complex documents look alike: they share similar page and letter colors, similar paper textures, and very few contrasting features. Several state-of-the-art attempts have been made to classify complex documents, but only a few address documents with similar features, and their performance leaves room for improvement. To overcome this, this paper presents a method that uses an optical character reader to extract the text. It proposes a multi-headed model that combines vision-based transfer learning and natural-language-based Transformers within the same network, trained simultaneously with different inputs and optimizers in specific parts of the network. A subset of the Ryerson Vision Lab Complex Document Information Processing (RVL-CDIP) dataset containing 16 document classes was used to evaluate performance. The proposed multi-headed VisFormers network classified the documents with up to 94.2% accuracy, while a regular natural-language-processing-based Transformer network achieved 83% and vision-based VGG19 transfer learning reached at most 90%. Deploying the model can help sort scanned copies of various documents into categories.
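The citation statement quoted from this paper describes OCR as three stages: preprocessing, recognition, and postprocessing such as spell-checking. As a rough illustration of the last stage only, the sketch below snaps each OCR token to its closest entry in a small word list using Python's standard `difflib` module. The vocabulary, the function name `correct_tokens`, and the 0.8 similarity cutoff are assumptions made for this example; none of them come from the cited papers.

```python
from difflib import get_close_matches

# Hypothetical domain vocabulary; a real system would load a full word list.
VOCABULARY = ["document", "archive", "record", "management", "index"]


def correct_tokens(tokens, vocabulary=VOCABULARY, cutoff=0.8):
    """Replace each OCR token with its closest vocabulary word, if one exists.

    Tokens are lowercased for matching. Tokens with no sufficiently
    similar vocabulary entry are returned unchanged.
    """
    corrected = []
    for token in tokens:
        # get_close_matches returns up to n candidates whose similarity
        # ratio is at least `cutoff`, best match first.
        matches = get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return corrected
```

A typical OCR confusion is "m" read as "rn"; with the vocabulary above, `correct_tokens(["docurnent", "xyz"])` maps the first token to `"document"` and leaves the unknown token alone. A higher cutoff makes the correction more conservative.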

“…For this, we used Python-tesseract, an Optical Character Recognition (OCR) tool for Python. This tool recognizes and extracts text embedded in images and is a wrapper for Google's Tesseract-OCR Engine (Jayoma et al., 2020). Through its use, a second JSON is generated containing the extracted words and the frame number where the words were identified.…”

Section: Feature Extraction (mentioning)

confidence: 99%

Automatic Classification of Learning Material Styles

Aquino, Souza, Barrére

2023

RBIE

Although video lessons are often used in diverse areas, the lack of a common approach to defining and classifying their styles results in using many different models for these purposes. There is a need to build a framework through which these styles can be defined and classified. Much has been done to investigate the effects of these styles on student engagement and learning outcomes. These studies suggest that video lesson styles affect academic performance and that students learn better through a certain video lesson style. Based on this, we propose a unified model for classifying video lesson styles based on the nomenclatures and definitions used in the literature. Furthermore, we present an approach for automatically classifying four popular video lesson styles. The automatic classification is useful for recommendation systems to suggest materials more consistent with student preferences and their intended learning outcomes.
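The citation statement quoted from this paper mentions a JSON file that pairs extracted words with the frame number in which they were found. The sketch below shows one way such a file could be assembled. The record layout (`"frame"` and `"words"` keys) and the helper names are assumptions, since the quoted text does not give the schema. The default OCR step calls `pytesseract.image_to_string`, which needs both the pytesseract package and the Tesseract binary installed; any other callable that maps an image to a string can be passed in instead.

```python
import json


def ocr_with_pytesseract(image):
    """Run Tesseract on one frame and return the recognised text."""
    # Imported lazily so the rest of the sketch runs without pytesseract.
    import pytesseract
    return pytesseract.image_to_string(image)


def words_by_frame(frames, ocr=ocr_with_pytesseract):
    """Build {"frame": n, "words": [...]} records from (n, image) pairs."""
    records = []
    for frame_number, image in frames:
        words = ocr(image).split()
        if words:  # skip frames where no text was recognised
            records.append({"frame": frame_number, "words": words})
    return records


def to_json(records):
    """Serialise the records; ensure_ascii=False keeps accented text readable."""
    return json.dumps(records, ensure_ascii=False, indent=2)
```

With real video frames, `frames` would be an iterable of `(frame_number, PIL.Image)` pairs, and `to_json(words_by_frame(frames))` would produce the text to write to disk.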

“…Python is the engine that used the PyTesseract library and it is one of the important libraries that are used for Arabic OCR, python is open source and it's easy to implement all the python libraries. Tesseract-OCR Engine is also used to detect the text in images such as line, word and character detection [3]. The optical character recognition to converts the images to the text editable with the .txt extension, then edit.py file is created to be compared between the predicate text and the truth text to check the accuracy of Tesseract OCR for recognizing the characters, that run the edit.py file by cmd command to check the accuracy and how many characters are recognized wrong and it will account the error of recognizing and issuing the final accuracy the performance of the accuracy is 99.58% accuracy.…”

Section: Introduction (mentioning)

confidence: 99%

Tesseract OCR Recognition Based on Arabic Machine-Printed Document

Ramteke, Al Maamari

2023

Advances in Intelligent Systems Research

This paper describes the technical aspects and context of detecting and recognizing Arabic characters using the Tesseract OCR engine. The engine is freely available, gives good results, and supports many languages, including Arabic. The procedure begins by scanning the Arabic documents into machine-readable form and then recognizing and extracting the text using the PyTesseract library. The OCR system can tolerate a considerable number of split errors, particularly when working with cursive languages such as Arabic, where letters frequently overlap. The reported performance of Tesseract OCR in converting Arabic document images to editable text is 99.5% accuracy.
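The citation statement quoted from this paper describes comparing the predicted text against the ground-truth text and counting wrongly recognized characters to obtain an accuracy figure. The paper's `edit.py` script is not shown, so the sketch below is only one common way to do this: it assumes the error count is the Levenshtein edit distance between the two strings and that accuracy is one minus errors divided by the ground-truth length. The function names are hypothetical.

```python
def edit_distance(predicted, truth):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # previous[j] holds the distance between the processed prefix of
    # `predicted` and the first j characters of `truth`.
    previous = list(range(len(truth) + 1))
    for i, p_char in enumerate(predicted, start=1):
        current = [i]
        for j, t_char in enumerate(truth, start=1):
            cost = 0 if p_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # drop a character from predicted
                current[j - 1] + 1,      # add a missing truth character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_accuracy(predicted, truth):
    """Return 1 - errors / len(truth), clamped to the range [0, 1]."""
    if not truth:
        return 1.0 if not predicted else 0.0
    errors = edit_distance(predicted, truth)
    return max(0.0, 1.0 - errors / len(truth))
```

Because Python strings are sequences of Unicode code points, the same functions work on Arabic text without changes. Normalizing whitespace and diacritics before comparison is a separate choice that can shift the reported figure.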
