Convolutional neural network (CNN)-based methods have been successful in handwritten numeral recognition (HNR). However, CNNs tend to misclassify similarly shaped numerals (i.e., numerals whose silhouettes look alike). This paper presents an enhanced HNR system that improves the classification accuracy of similarly shaped handwritten numerals by incorporating terminal points into the CNN’s recognition; the system can be used in emerging applications related to language translation. In handwritten numerals, the terminal points (i.e., the start and end positions of the stroke) serve as additional properties for discriminating between similarly shaped numerals. The Start–End Writing Measure (SEWM) and its integration with a CNN are the main contributions of this research. Traditionally, the classification outcome of a CNN-based system is the numeral category assigned the highest probability. In the proposed system, that probability value (i.e., the CNN’s confidence level) is also used as a regulating element. In parallel with the CNN’s classification, SEWM measures the start–end points of the numeral image and suggests the numeral category whose reference start–end points are closest to the measured ones. Finally, the system’s output label for the given numeral image is determined by comparing the confidence level with a predefined threshold value. Compared with existing methods, SEWM-CNN is shown to be a suitable HNR method for Bengali and Devanagari numerals.
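The decision rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance measure, the threshold value, or which way the comparison goes, so the Euclidean distance, the default threshold of 0.9, the "keep the CNN label when confident, otherwise defer to SEWM" rule, and the function names `sewm_suggest` and `sewm_cnn_label` are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def sewm_suggest(start, end, references):
    """Suggest the class whose reference start/end points are closest.

    `references` maps a class label to a (ref_start, ref_end) pair of 2-D
    points. Summed Euclidean distance is an assumed closeness measure.
    """
    def cost(label):
        ref_start, ref_end = references[label]
        return dist(start, ref_start) + dist(end, ref_end)

    return min(references, key=cost)


def sewm_cnn_label(cnn_probs, start, end, references, threshold=0.9):
    """Combine CNN output with the start-end suggestion.

    Assumed rule: keep the CNN's top class when its confidence reaches the
    threshold; otherwise fall back to the start-end suggestion.
    """
    cnn_label = max(cnn_probs, key=cnn_probs.get)
    if cnn_probs[cnn_label] >= threshold:
        return cnn_label
    return sewm_suggest(start, end, references)


# Hypothetical reference points in normalized image coordinates.
references = {
    "1": ((0.5, 0.0), (0.5, 1.0)),
    "7": ((0.0, 0.0), (0.6, 1.0)),
}
confident = sewm_cnn_label({"1": 0.95, "7": 0.05}, (0.0, 0.0), (0.6, 1.0), references)
uncertain = sewm_cnn_label({"1": 0.55, "7": 0.45}, (0.0, 0.0), (0.6, 1.0), references)
```

In this toy setup the confident prediction keeps the CNN's label, while the low-confidence one is overridden by the class whose reference terminal points match the measured ones.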
Handwritten numerals of different languages have distinct characteristics. Similarities and dissimilarities between languages can be measured by analyzing features extracted from their numerals. Handwritten numeral datasets are available for many widely used languages from different regions. In this paper, several handwritten numeral datasets from different languages are collected and used to measure the similarity among those written languages by determining and comparing the similitude of their handwritten numerals. This helps to identify which languages share the same or a closely related parent language. First, a similarity measure for two numeral images is constructed with a Siamese network. Second, the similarity of the numeral datasets is determined with the help of the Siamese network and a new similarity-averaging technique based on random sampling with replacement. Finally, agglomerative clustering is performed based on the pairwise similarities of the datasets. This clustering reveals some interesting properties of the datasets. The property this paper focuses on is the regional resemblance of the datasets: by analyzing the clusters, it becomes easy to identify which languages originated in similar regions.
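The three steps in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: `pair_similarity` is a hypothetical stand-in for the trained Siamese network, the number of samples is arbitrary, and average linkage is an assumed choice, since the abstract does not name the linkage criterion.

```python
import random
from itertools import combinations


def dataset_similarity(a, b, pair_similarity, n_samples=1000, seed=0):
    """Average pair similarity over random pairs drawn with replacement.

    `pair_similarity` stands in for a trained Siamese network (assumption).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += pair_similarity(rng.choice(a), rng.choice(b))
    return total / n_samples


def agglomerate(names, sim, n_clusters):
    """Merge the most similar clusters until n_clusters remain.

    `sim` maps frozenset({x, y}) to a similarity score; average linkage
    is an assumed choice.
    """
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        scores = [sim[frozenset((x, y))] for x in c1 for y in c2]
        return sum(scores) / len(scores)

    while len(clusters) > n_clusters:
        # combinations yields index pairs with i < j, so deleting j is safe.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy data: each "image" is a single number, and similarity decays with distance.
def toy_similarity(x, y):
    return 1.0 / (1.0 + abs(x - y))


toy_datasets = {"A": [0.0, 0.1], "B": [0.2, 0.3], "C": [5.0, 5.1]}
toy_sim = {
    frozenset((p, q)): dataset_similarity(toy_datasets[p], toy_datasets[q], toy_similarity)
    for p, q in combinations(toy_datasets, 2)
}
toy_clusters = agglomerate(list(toy_datasets), toy_sim, n_clusters=2)
```

With real data, each dataset would be a list of numeral images and the similarity function would be the Siamese network's output; the resulting clusters could then be compared against the regions the languages come from.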