In recent years, deep neural network technology has been developing rapidly, especially in the field of image recognition. However, since deep neural networks learn images based on pixel values, they can only learn the features of the image and not the meta-information that the image has. In this paper, we focused on the differences between image features and meta-information. For example, "0" and "9" are relatively similar in terms of image characteristics, but there is significant difference in terms of the numbers they actually mean. In contrast, "3" and "4" are relatively dissimilar in terms of image features, but the difference is small in terms of the values they actually mean. In order to solve problems like this example, this paper proposes a method for learning based not only on the features of the image, but also on the numerical information that the image has. Experiments were conducted on the MNIST and Kannada-MNIST datasets using three different models: DNN, CNN, and RNN. As a result, the numerical error is smaller in the proposed model than in the baseline.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.