In this paper we propose a novel algorithm for optical character recognition in the presence of impulse noise that combines a wavelet transform, principal component analysis, and neural networks. In the proposed algorithm, the Haar wavelet transform is used to isolate the low-frequency components, suppress noise, and extract features. Principal component analysis is then used to reduce the dimension of the extracted features. A set of multi-layer neural networks serves as the classifiers, with the reduced feature set as input. A key feature of the proposed approach is that a separate neural network is created for each type of character. The experimental results show that the proposed algorithm can effectively recognize characters in images corrupted by impulse noise; the results are comparable with those of ABBYY FineReader and Tesseract OCR.
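The feature pipeline described in this abstract (Haar low-frequency band, then PCA) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `haar_approximation`, `pca_fit`, and `pca_transform`, the two decomposition levels, the five retained components, and the random stand-in glyphs are all assumptions made for illustration. The per-character neural network classifiers are omitted.

```python
import numpy as np


def haar_approximation(image, levels=1):
    """Return the low-frequency (LL) band of a 2-D orthonormal Haar transform.

    Hypothetical helper: keeping only the LL band acts as a smoothing,
    downsampling step that attenuates (but does not fully remove) isolated
    impulse-noise pixels.
    """
    out = np.asarray(image, dtype=float)
    for _ in range(levels):
        h, w = out.shape
        out = out[: h - h % 2, : w - w % 2]  # crop to even dimensions
        # Sum of each 2x2 block divided by 2 gives the orthonormal LL coefficient.
        out = (out[0::2, 0::2] + out[0::2, 1::2]
               + out[1::2, 0::2] + out[1::2, 1::2]) / 2.0
    return out


def pca_fit(features, n_components):
    """Estimate the mean and the top principal directions of row-vector features."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(features, mean, components):
    """Project features onto the retained principal directions."""
    return (features - mean) @ components.T


# Stand-in data (assumption): 20 random 16x16 "glyph" images.
rng = np.random.default_rng(0)
glyphs = rng.random((20, 16, 16))

# Two Haar levels reduce each 16x16 glyph to a 4x4 band, i.e. 16 features.
features = np.stack([haar_approximation(g, levels=2).ravel() for g in glyphs])

# Assumed choice: keep 5 principal components as classifier input.
mean, components = pca_fit(features, n_components=5)
reduced = pca_transform(features, mean, components)
```

In a setup like the one the abstract outlines, `reduced` would be the input to the classifiers, with one network trained per character class; how those networks are structured and trained is not specified here.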
This article reviews the history and state of the art of optical character recognition systems, such as ABBYY FineReader, Tesseract, and CuneiForm, with particular attention given to their internal algorithms, including page layout analysis, page segmentation, and document skew angle estimation. The overview describes and compares methods proposed over the last 30 years in terms of speed and versatility. A critical analysis and discussion of the status of the field and its open problems are also provided.
This work presents a biologically inspired approach to object recognition based on the analysis of hierarchical and temporal data dependencies. The article describes the hierarchical temporal memory (HTM) model and its optimization for the object recognition task. The optimization includes image preprocessing with Gabor and Canny filters, which makes the model suitable for recognizing handwritten symbols and gestures; additional clustering at the spatial pooling stage together with a newly proposed temporal grouping algorithm, which increase the overall recognition accuracy of the model; and a new genetic algorithm designed to search for the optimal parameters of the model.