An efficient method for increasing the generalization capacity of neural character recognition is presented. The network uses a biologically inspired architecture for feature extraction and character classification. The numerical methods used are, however, optimized for use on massively parallel array processors. The method for training set construction, when applied to handwritten digit recognition, yielded a writer-independent recognition rate of 92%. The activation strength produced by network recognition is an effective statistical confidence measure of the accuracy of recognition. A method of using the activation strength for reclassification is described which, when applied to handwritten digit recognition, reduced substitutional errors to 2.2%.
Introduction
This paper uses a three-part method for writer-independent digit recognition. First, character images are used to calculate least squares optimized Gabor components. For the digit recognition problem, 32 Gabor basis functions are used. Second, these coefficients are used as input feature vectors to a classification network trained using back-propagation learning. Finally, the activation strengths of the network are used as first-order Bayesian statistics for the separation of substitutional errors.
The effectiveness of this method is strongly affected by the nature of the training set used. A new method of training set construction is presented which is based on measuring writer variance. This method is shown to increase the generalization ability of a neural character recognition model.
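The third step, using the output activation strength as a confidence measure, can be illustrated with a minimal sketch. It assumes a simple fixed acceptance threshold, which is an illustrative stand-in and not the statistical procedure of this paper. The function names, the threshold value of 0.5, and the toy activations are all hypothetical.

```python
from typing import List, Sequence, Tuple


def classify_with_rejection(
    activations: Sequence[float], threshold: float = 0.5
) -> Tuple[int, float, bool]:
    """Pick the class with the strongest output activation.

    Returns (class index, activation strength, accepted flag).
    The threshold is an assumed, illustrative value.
    """
    best = max(range(len(activations)), key=lambda k: activations[k])
    strength = activations[best]
    return best, strength, strength >= threshold


def tally(
    samples: List[Tuple[Sequence[float], int]], threshold: float
) -> Tuple[int, int, int]:
    """Count accepted, substituted (accepted but wrong), and rejected samples."""
    accepted = wrong = 0
    for activations, label in samples:
        digit, _, ok = classify_with_rejection(activations, threshold)
        if ok:
            accepted += 1
            wrong += int(digit != label)
    rejected = len(samples) - accepted
    return accepted, wrong, rejected
```

Samples whose strongest activation falls below the threshold are set aside instead of being counted as recognized, so raising the threshold trades a higher rejection count for fewer substitutional errors among the accepted samples.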
Network Architecture
The usual method for designing character recognition systems has been top down. Both special purpose hardware [1] and software [2] approaches have been used on the character recognition problem with promising results. A set of features and a method of feature extraction are selected and the resulting classification problem is solved by a neural network. In this work, we have taken a different approach. The general form of input receptor fields which are used in tasks such as binocular vision by vertebrates has been modeled using parallel Gabor functions [3]. The output of these receptor fields is coupled to small networks for detection of position [4]. In this work, an approximation to this model was constructed as shown in Figure 1 and used for handwritten digit recognition.
The model was not specifically designed for character recognition and could be taught any set of images which could be represented by the Gabor functions. Gabor functions are well suited to this