2022
DOI: 10.48550/arxiv.2201.10252
DocEnTr: An End-to-End Document Image Enhancement Transformer

Cited by 5 publications (7 citation statements)
References 0 publications
“…To obtain a CF, we first convert the ruler into a binary image where unit markers are white and everything else is black. We binarize each ruler in three different ways: threshold sweep, segmentation (DocEnTr; Souibgui et al., 2022), and skeletonization. Finally, we use another machine learning network, an image classifier, to determine whether the binarization was successful.…”
Section: Plant Component Detector
confidence: 99%
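The threshold-sweep binarization mentioned in the excerpt above could be sketched as follows. This is a minimal illustration under assumed details, not the cited authors' implementation: the image is a grayscale NumPy array, and the quality score used to pick a threshold (distance of the foreground fraction from an assumed target ratio) is hypothetical — the cited pipeline judges success with a separate image classifier instead.

```python
import numpy as np

def binarize_threshold_sweep(gray, candidates=range(0, 256, 8)):
    """Try several global thresholds and keep the one whose foreground
    fraction is closest to an assumed target ratio (a stand-in score)."""
    target = 0.2  # assumed fraction of white unit-marker pixels
    best_t, best_gap = None, float("inf")
    for t in candidates:
        binary = gray > t           # unit markers white, rest black
        gap = abs(binary.mean() - target)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return (gray > best_t).astype(np.uint8) * 255

# Toy grayscale "ruler": dark background with bright marker columns.
img = np.zeros((8, 8), dtype=np.uint8)
img[:, ::4] = 200
out = binarize_threshold_sweep(img)
```

In practice the sweep would be one of three parallel binarizations (alongside segmentation and skeletonization), with the downstream classifier selecting among their outputs.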
“…Transformers have also been utilized for tasks like image restoration [24] and image de-warping [25]. [26] proposed a fully transformer-based approach for document image enhancement, without the need for any CNN. However, since their approach is entirely based on the conventional ViT without any design change, it fails to capture the local information from the patches.…”
Section: Transformers For Document Image Binarization
confidence: 99%
“…The disadvantage is that training is slow due to the need to generate images of different channels. In the same year, Souibgui et al. proposed an encoder-decoder architecture based on the Vision Transformer [41], as shown in Figure 7. The degraded image is first divided into several patches, which are then fed into the encoder, where each patch is mapped to a latent representation, with a one-to-one correspondence between patches and tokens.…”
Section: Handwriting Fading Problem
confidence: 99%
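The patch-to-token encoding described in the excerpt above can be sketched as follows. This is a minimal NumPy illustration of splitting an image into non-overlapping patches and linearly projecting each flattened patch into one token — not the actual DocEnTr implementation, and the random projection matrix stands in for learned embedding weights.

```python
import numpy as np

def image_to_tokens(img, patch=4, dim=16, rng=np.random.default_rng(0)):
    """Split an HxW grayscale image into non-overlapping patch x patch
    tiles and linearly project each flattened tile to a dim-d token."""
    h, w = img.shape
    proj = rng.standard_normal((patch * patch, dim))  # stand-in for learned weights
    tokens = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            tile = img[i:i + patch, j:j + patch].reshape(-1)
            tokens.append(tile @ proj)  # one token per patch
    return np.stack(tokens)            # shape: (num_patches, dim)

img = np.arange(64, dtype=float).reshape(8, 8)
tok = image_to_tokens(img)  # 8x8 image, 4x4 patches -> 4 tokens
```

The one-to-one correspondence noted in the excerpt is visible here: each spatial patch yields exactly one token, so the token sequence length equals the number of patches.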