This work proposes a technique for reliably identifying very similar filled-in forms, with a reject option. The method automatically detects the most discriminant regions at the image level and supplies them to a distance-based classifier. Experiments included multi-page form images, and the results suggest that very high accuracy can be achieved when identifying previously known types of forms, even when unpredictable fill-in data and significant scanning noise appear in the test images. Likewise, the probability that an unknown type of document is not rejected is shown to be very low. The technique could also be applied to other kinds of documents.
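As a rough illustration of the idea, the sketch below pairs a discriminant-region mask with a distance-based classifier that can reject. The abstract does not state how regions are selected or which distance is used, so the choices here are assumptions: per-pixel variance across blank templates as the discriminance measure, mean absolute difference as the distance, and a fixed rejection threshold. The function names `discriminant_mask` and `classify_with_reject` are hypothetical.

```python
import numpy as np


def discriminant_mask(templates, n_pixels):
    """Select the pixels that vary most across the blank form templates.

    templates: float array of shape (n_classes, H, W), values in [0, 1].
    n_pixels:  number of pixels to keep (assumed tuning parameter).
    Assumption: cross-template variance stands in for "discriminance".
    """
    variance = templates.var(axis=0)
    k = max(1, min(n_pixels, variance.size))
    # Value of the k-th largest variance; ties at this value are all kept.
    threshold = np.partition(variance.ravel(), -k)[-k]
    return variance >= threshold


def classify_with_reject(image, templates, mask, reject_threshold):
    """Return (label, distance); label is None when the image is rejected.

    Assumption: mean absolute difference over the masked pixels is the
    distance, and the nearest template wins unless it is too far away.
    """
    distances = np.array(
        [np.mean(np.abs(image[mask] - t[mask])) for t in templates]
    )
    best = int(np.argmin(distances))
    best_distance = float(distances[best])
    if best_distance > reject_threshold:
        return None, best_distance
    return best, best_distance
```

Because only the masked pixels enter the distance, fill-in data written outside the selected regions does not affect the decision, and a document that is far from every known template on those regions is rejected instead of being forced into a class.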