The objective of our work is to enable the reading of fragile scrolled historical parchments without physically unravelling them, thus providing valuable information to a wide range of scholarly disciplines. This problem has not yet been properly investigated by the computer vision community because it requires specialised parchment scanning technology: standard X-ray machinery is not sufficient, since the ink must be extracted in addition to the parchment's underlying structure. Data recovery is further complicated by the deterioration of the parchment, which makes the content of historical scrolled documents inaccessible. We create a 3D volumetric model of a scrolled parchment's underlying geometry and digitally unwrap the parchment, producing a readable image of the text as output. The proposed recovery framework consists of structure-preserving anisotropic filtering in combination with robust segmentation, surface modelling and ink projection. We demonstrate with real examples how our algorithm recovers the underlying text and addresses the major challenges of scrolled parchment analysis, namely the segmentation of connected layers and the processing of the data without user interaction.
In this paper we introduce a framework for the segmentation of scanned scrolled parchments, based on a novel graph-cut approach with an additional shape prior, in combination with anisotropic diffusion and geometry-constrained postprocessing. This problem has not yet been properly investigated by the computer vision community, owing to the novelty of parchment scanning technology, and it is extremely important for effective data recovery from historical scrolled documents whose content is inaccessible due to the deterioration of the parchment. To date, parchment segmentation has required user interaction, which is very time-consuming for such data. We demonstrate with real examples how our algorithm solves the major problem of scrolled parchment analysis, namely the segmentation of connected layers, and processes the data without user interaction.
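To make the anisotropic diffusion step more concrete, the following is a minimal sketch of classic Perona-Malik diffusion on a 2D slice. It is an illustrative stand-in, not the filtering used in the paper: the function name `perona_malik`, the exponential conductance function, the 4-neighbour scheme and the parameters `kappa` and `step` are all assumptions made for this example.

```python
import numpy as np


def perona_malik(image, n_iter=20, kappa=0.1, step=0.2):
    """Edge-preserving smoothing of a 2D array (hypothetical helper).

    kappa sets the gradient magnitude above which diffusion is suppressed;
    step should stay at or below 0.25 for the 4-neighbour scheme to be stable.
    """
    u = np.asarray(image, dtype=float).copy()
    for _ in range(n_iter):
        # Replicate the border so that no flux crosses the image boundary.
        p = np.pad(u, 1, mode="edge")
        # Differences to the north, south, west and east neighbours.
        diffs = (
            p[:-2, 1:-1] - u,
            p[2:, 1:-1] - u,
            p[1:-1, :-2] - u,
            p[1:-1, 2:] - u,
        )
        # Conductance is close to 1 in flat regions and close to 0 across
        # strong edges, so noise is smoothed while layer boundaries remain.
        u += step * sum(np.exp(-(d / kappa) ** 2) * d for d in diffs)
    return u
```

On a noisy step edge, a call such as `perona_malik(noisy, n_iter=20, kappa=0.1)` would be expected to reduce the noise inside each flat region while leaving the jump between the regions largely intact. A real scan would require a 3D variant and parameters tuned to the data.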
Abstract. In this paper, we address the problem of automatically constructing a partial representation of real-world data whose structure is not known a priori. Such a representation could be very useful for the subsequent construction of an automatic hierarchical data model. We propose a three-stage process. The first stage performs data normalisation and estimates the intrinsic dimensionality of the data. The second stage uses a modified sparse non-negative matrix factorization (sparse NMF) algorithm to perform the initial segmentation. In the final stage, a region growing algorithm is applied to construct a mask of the original data. Our algorithm has a broad range of potential applications; we illustrate this versatility by applying it to several dissimilar data sets.
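The second stage can be illustrated with a minimal sketch of sparse NMF used for an initial segmentation. This is not the modified algorithm from the paper: it assumes standard multiplicative updates with an L1 penalty on the coefficient matrix, and the names `sparse_nmf`, `initial_segmentation` and the `sparsity` parameter are hypothetical. The normalisation, intrinsic dimensionality estimation and region growing stages are not shown; `rank` stands in for the estimated dimensionality.

```python
import numpy as np


def sparse_nmf(V, rank, sparsity=0.01, n_iter=500, seed=0):
    """Factorise a non-negative matrix V (features x samples) as W @ H.

    Uses multiplicative updates for the squared error with an L1 penalty
    (weight `sparsity`) on H. Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-9
    n_features, n_samples = V.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # The penalty enters the denominator and shrinks H towards zero.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Keep the columns of W at unit norm so the penalty on H cannot be
        # evaded by rescaling; the product W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H


def initial_segmentation(H):
    """Label each sample with the index of its dominant component."""
    return np.argmax(H, axis=0)
```

In this sketch each column of `V` is one sample (for instance, one pixel's feature vector), so `initial_segmentation(H)` yields one label per sample that a later region growing step could refine into a mask.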