BINYAS: a complex document layout analysis system

Bhowmik, Showmik; Kundu, Soumyadeep; Sarkar, Ram

doi:10.1007/s11042-020-09832-3

Cited by 19 publications

(9 citation statements)

References 19 publications


“…By contrast, bottom-up methods are able to handle more kinds of documents including namely non-Manhattan layout pages and so on, but they need higher computational cost as an exchange. Hybrid methods integrate these two methods, and one of the most representative methods is CC analysis [7], [8], [9], [10], [11], [12]: CCs are detected from the entire images first, and then researchers analyze these CCs to acquire areas of interest. Hybrid methods combine the benefits of bottom-up and top-down methods, they can handle a variety of documents with relatively fast speed.…”

Section: Discussion (mentioning)

confidence: 99%


A Connected Components Based Layout Analysis Approach for Educational Documents

Liu

Yang

et al. 2021

2021 16th International Conference on Computer Science & Education (ICCSE)


Layout analysis, which aims to detect and categorize areas of interest on document images, is an increasingly important part of document image processing. Existing research has addressed layout analysis for various documents, but no method has been proposed for documents produced in teaching, i.e. exam papers and workbooks, which are worth studying. In this paper, we propose a novel layout analysis system that performs two tasks, one for workbook pages and one for exam papers. On the one hand, we segment text and non-text areas of workbook pages. On the other hand, we extract regions of interest on exam papers. Our system is based on connected component (CC) analysis; specifically, it extracts geometric features and spatial information of CCs to recognize page elements. We carried out experiments on images collected from real-world scenarios, and promising results confirmed the applicability and effectiveness of our system.
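The CC-based idea described in the statement and abstract above (detect connected components over the whole image, then analyze their geometric features) can be illustrated with a minimal sketch. It is not the method of the cited paper: the 8-connectivity choice, the feature names, the `classify` rule, and its `max_text_area` threshold are all hypothetical and chosen only for illustration.

```python
from collections import deque


def connected_components(image):
    """Label 8-connected foreground (1) pixels; return one feature dict per component."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search collects every pixel of this component.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right, area = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            height, width = bottom - top + 1, right - left + 1
            components.append({
                "bbox": (top, left, bottom, right),
                "area": area,
                "aspect_ratio": width / height,
                "density": area / (width * height),
            })
    return components


def classify(component, max_text_area=6):
    """Hypothetical rule: small components are treated as text, large ones as non-text."""
    return "text" if component["area"] <= max_text_area else "non-text"


# Toy binary page: a small blob (top left) and a larger block (right).
page = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
]

for comp in connected_components(page):
    print(comp["bbox"], comp["area"], classify(comp))
```

A real system would likely combine several features (size, density, position relative to neighbours) rather than a single area threshold.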



“…To better distinguish small texts between large graphics, existing studies usually fill contours by different means: filling the whole regions according to different rules [11], [12], [9] or performing hole-filled morphological closing [10]. However, there are two problems in our scenario that filling is inapplicable to.…”

Section: A Coarse Segmentation (mentioning)

confidence: 99%

A Connected Components Based Layout Analysis Approach for Educational Documents

Liu

Yang

et al. 2021

2021 16th International Conference on Computer Science & Education (ICCSE)
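The statement above refers to filling contours so that components are treated as solid regions. As a rough illustration only, the sketch below fills enclosed holes by flood-filling the background from the image border; any background pixel that is not reached is considered a hole. This is one common way to fill holes and is an assumption here, not necessarily the rule used in the works cited as [9]-[12]. The function name `fill_holes` is hypothetical.

```python
from collections import deque


def fill_holes(image):
    """Return a copy of a binary image with enclosed background regions set to 1."""
    rows, cols = len(image), len(image[0])
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every background pixel on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and image[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))
    # 4-connected flood fill marks background that is reachable from the border.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < rows and 0 <= nx < cols
                    and image[ny][nx] == 0 and not outside[ny][nx]):
                outside[ny][nx] = True
                queue.append((ny, nx))
    # Everything not marked as outside is either foreground or an enclosed hole.
    return [[0 if outside[r][c] else 1 for c in range(cols)] for r in range(rows)]


# A closed rectangular contour with a single-pixel hole in the middle.
frame = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

for row in fill_holes(frame):
    print(row)
```

Filling of this kind assumes closed contours; an open contour leaks to the border and is left unchanged.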


“…Image processing techniques and convolutional neural networks are commonly used in the literature for this purpose [7]-[9]. • Optical Character Recognition (OCR): The obtained images are converted into digitized text by OCR.…”

Section: Text and Table Detection: Extraction Of Information (mentioning)

confidence: 99%

“…Precision, Recall, F-Score, and Accuracy, which are commonly used in the literature, are used as performance metrics to evaluate the success of the system. The Precision, Recall, F-Score, and Accuracy values were calculated using equations (5), (6), (7), and (8) respectively.…”

Section: $NCER = 100 \cdot \frac{S + D + I}{H + S + D + I}$ (mentioning)

confidence: 99%

End to End Invoice Processing Application Based on Key Fields Extraction

Arslan

2022

IEEE Access


In this paper, an automatic invoice processing system, which is in great demand among private and public companies, was proposed. The proposed system supports all invoice file types that can be submitted by companies. Companies can easily submit invoices to the system via the web interface or email, and all invoices submitted to the system are queued and processed sequentially. If the invoice is a text file, the invoice information is extracted from the text by using template matching. If the invoice is an image, the text and table areas are detected and extracted. For table detection, we used both an image-processing-based method and a YOLOv5-based deep learning method. Cell extraction was then performed from the extracted table images. As a result of these processes, all text and table cells were obtained as images and these images were converted into machine-readable text using the open-source software Tesseract OCR. Tesseract already provides trained models for English and Turkish. However, these models do not provide successful results for invoices submitted by companies in Turkish. Therefore, the new fine-tuned model trained with invoices in Turkish was used for OCR. The experimental results showed that the trained Turkish model was more accurate than the Turkish and English models provided by Tesseract. In addition, the YOLOv5-based table detection model was more accurate than the image-processing-based table detection method.
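The evaluation metrics named in the statement above can be sketched as follows. Equations (5) to (8) of the citing paper are not reproduced on this page, so the code uses the standard confusion-matrix definitions, which is an assumption. The `ncer` function follows the formula visible in the section label, NCER = 100 (S + D + I) / (H + S + D + I); reading S, D, I, and H as substitutions, deletions, insertions, and hits (correct characters) is likewise an assumption, and all function and parameter names are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard precision, recall, F-score, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "accuracy": accuracy,
    }


def ncer(hits, substitutions, deletions, insertions):
    """Error rate in percent: 100 * (S + D + I) / (H + S + D + I)."""
    errors = substitutions + deletions + insertions
    total = hits + errors
    return 100.0 * errors / total if total else 0.0


# Illustrative counts only; these are not results from the cited paper.
print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
print(ncer(hits=90, substitutions=5, deletions=3, insertions=2))
```

Returning 0.0 for empty denominators is a design choice made here to keep the sketch total; other conventions (raising an error, returning NaN) are equally defensible.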


“…Researchers have also used Haar Discrete Wavelet Transform (DWT) for segmenting text from document images by detecting the edges and then using the line feature, vector graph based on the edge map and the stroke, and finally, the text is segmented by line feature [12]. In [13] classification of text and non-text components is performed using connected components and pixel based approach. Statistical approaches have also been used for text and non-text classification on handwritten documents [14] and works on the extraction of text and graphics from different scripts of newspapers [15].…”

Section: Introduction (mentioning)

confidence: 99%

Segmentation-Less Extraction of Text and Non-Text Regions From JPEG 2000 Compressed Document Images Through Partial and Intelligent Decompression

et al. 2023


JPEG 2000 is a popular image compression technique that uses Discrete Wavelet Transform (DWT) for compression and subsequently provides many rich features for efficient storage and decompression. Though compressed images are preferred for archival and communication purposes, their processing becomes difficult due to the overhead of decompression and re-compression operations, which are needed every time the data is operated on. Therefore, in this research paper, the novel idea of direct operation over JPEG 2000 compressed documents is proposed for extracting text and non-text regions without using any segmentation algorithm. The technique avoids full decompression of the compressed document, in contrast to conventional methods, which fully decompress and then process. Moreover, JPEG 2000 features are explored in this research work to partially and intelligently decompress only the selected regions of interest at different resolutions and bit depths to accomplish segmentation-less extraction of text and non-text regions. Finally, the Maximally Stable Extremal Regions (MSER) algorithm is used to extract the layout of segmented text and non-text regions for further analysis. Experiments have been carried out on the standard PRImA Layout Analysis Dataset, leading to promising results and saving computational resources.
