Vincent Rabeux scite author profile

Vincent Rabeux

4 Publications

10 Citation Statements Received

31 Citation Statements Given

Affiliations

University of Bordeaux

Publications

Quality evaluation of degraded document images for binarization result prediction

Rabeux, Journet, Vialard et al. 2013. IJDAR

This article proposes an approach to predict the result of binarization algorithms on a given document image according to its state of degradation. Indeed, historical documents suffer from different types of degradation which result in binarization errors. We intend to characterize the degradation of a document image by using different features based on the intensity, quantity and location of the degradation. These features allow us to build prediction models of binarization algorithms that are very accurate according to R² values and p-values. The prediction models are used to select the best binarization algorithm for a given document image. Obviously, this image-by-image strategy improves the binarization of the entire dataset.
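The selection strategy described in this abstract can be illustrated with a minimal sketch. It assumes one ordinary least-squares model per binarization algorithm, mapping degradation features to a quality score; the abstract does not specify the model family, the features, or the score, so the function names, the algorithm names, and the synthetic data below are all hypothetical.

```python
import numpy as np


def fit_linear_model(features, scores):
    """Least-squares fit of scores ~ intercept + features @ coefficients."""
    features = np.asarray(features, dtype=float)
    design = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(design, np.asarray(scores, dtype=float), rcond=None)
    return weights


def predict(weights, features):
    """Predicted score for each row of degradation features."""
    features = np.asarray(features, dtype=float)
    design = np.column_stack([np.ones(len(features)), features])
    return design @ weights


def r_squared(scores, predicted):
    """Coefficient of determination of the predictions."""
    scores = np.asarray(scores, dtype=float)
    residual = np.sum((scores - predicted) ** 2)
    total = np.sum((scores - np.mean(scores)) ** 2)
    return 1.0 - residual / total


def select_best_algorithm(models, image_features):
    """Return the algorithm whose model predicts the highest score."""
    row = np.atleast_2d(np.asarray(image_features, dtype=float))
    predictions = {name: float(predict(w, row)[0]) for name, w in models.items()}
    return max(predictions, key=predictions.get), predictions


# Synthetic training data (assumption): three degradation features per image
# and a made-up quality score for two hypothetical algorithms.
rng = np.random.default_rng(0)
train_features = rng.uniform(0.0, 1.0, size=(50, 3))
models = {
    "algorithm_a": fit_linear_model(train_features, 0.9 - 0.5 * train_features[:, 0]),
    "algorithm_b": fit_linear_model(train_features, 0.7 - 0.1 * train_features[:, 0]),
}
best, predicted_scores = select_best_algorithm(models, [0.9, 0.5, 0.5])
print(best, predicted_scores)
```

In this toy setup the first feature stands in for degradation intensity: the first algorithm scores higher on lightly degraded images and the second on heavily degraded ones, so the choice changes from image to image.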

Ancient documents bleed-through evaluation and its application for predicting OCR error rates

Rabeux, Journet, Domenger 2011

This article presents a way to evaluate the bleed-through defect on very old document images. We design measures to quantify and evaluate the verso ink bleeding through the paper onto the recto side. Measuring the bleed-through defect allows us to perform statistical analyses that can predict the feasibility of different post-scan tasks. In this article we illustrate our measures by creating two models that predict OCR error rates from the bleed-through evaluation: one for Abbyy FineReader, a powerful commercial OCR engine, and one for OCRopus, which is sponsored by Google. Both prediction models appear to be very accurate according to various statistical indicators.

Quality Evaluation of Ancient Digitized Documents for Binarization Prediction

Rabeux, Journet, Vialard et al. 2013

This article proposes an approach to predict the result of binarization algorithms on a given document image according to its state of degradation. Indeed, historical documents suffer from different types of degradation which result in binarization errors. We intend to characterize the degradation of a document image by using different features based on the intensity, quantity and location of the degradation. These features allow us to build prediction models of binarization algorithms that are very accurate according to R² values and p-values. The prediction models are used to select the best binarization algorithm for a given document image. Obviously, this image-by-image strategy improves the binarization of the entire dataset.

Document Recto-verso Registration Using a Dynamic Time Warping Algorithm

Rabeux, Journet, Philippe 2011

Recto-verso registration is an important step that allows the detection of missing digitized pages or the location of the bleed-through defect on a page. An efficient way to restore or evaluate the bleed-through of a digitized document consists in analyzing the recto side and the verso side at the same time. This method requires the two images to be aligned (registered). Without particular knowledge about the document, recto-verso registration is complex. Indeed, the only information that we can use to register the two sides is the bleed-through, and the recto's bleed-through is a highly degraded version of the verso's ink pixels. Therefore, in this particular context, usual image comparison methods [1] are not very relevant. Nevertheless, document recto-verso registration algorithms have been proposed [2], [3], [4], but these methods have high computation times, are noise sensitive, and even fail in some cases where the bleed-through is too light. The previous techniques are based on a pixel-to-pixel approach where the bleed-through is considered to be just a set of grey pixels. In this article, we consider the structure of the ink pixels on the verso page. The recto-verso registration method presented here is based on the fact that the bleed-through has the same structure as the ink on the verso side. The method registers the recto's bleed-through layout and the verso's ink layout in two main steps: first, a de-skewing algorithm is applied to both pages; then, horizontal and vertical profiles are extracted and aligned with dynamic time warping. The time complexity of our method is linear in the image size. Moreover, experiments detailed at the end show the accuracy of our method.
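The second step of this abstract (profile extraction followed by dynamic time warping) can be sketched as follows. This is not the authors' implementation: it assumes both pages are already de-skewed and binarized into ink masks, it mirrors the verso horizontally on the assumption that the verso ink appears flipped when seen from the recto side, and it uses the textbook quadratic DTW recurrence rather than whatever optimization gives the linear complexity reported above. All function names are hypothetical.

```python
import numpy as np


def projection_profiles(ink_mask):
    """Sum ink pixels per row and per column of a binary mask."""
    mask = np.asarray(ink_mask, dtype=float)
    return mask.sum(axis=1), mask.sum(axis=0)


def dtw_path(a, b):
    """Textbook DTW: return (total cost, list of aligned index pairs)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i, j] = local + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]
            )
    # Backtrack from the end of both sequences to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = (cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
        move = int(np.argmin(steps))
        if move == 0:
            i, j = i - 1, j - 1
        elif move == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return float(cost[n, m]), path


def align_profiles(recto_bleed_mask, verso_ink_mask):
    """Align row and column profiles of the recto bleed-through and verso ink."""
    # Assumption: the verso scan must be mirrored to match the recto view.
    mirrored_verso = np.fliplr(np.asarray(verso_ink_mask))
    recto_rows, recto_cols = projection_profiles(recto_bleed_mask)
    verso_rows, verso_cols = projection_profiles(mirrored_verso)
    return dtw_path(recto_rows, verso_rows), dtw_path(recto_cols, verso_cols)
```

Each returned path pairs a row (or column) index of the recto with the matching index of the mirrored verso, which is the information a registration step would need to map one page onto the other.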
