“…There are various datasets for Indian language, depending on the script that has been used. For example, CMATERDB is a dataset for Indian script called Bangla [115], [116] and Kaggle's Tamil handwritten character dataset is another such dataset for Tamil script [117]. FIGURE 13: Sample image from CHARS74K Dataset [112] C. MNIST FIGURE 14: Sample handwritten digits from MNIST Dataset [42] The MNIST dataset is considered as one of the most used/cited dataset for handwritten digits [30], [42], [118]- [121].…”
Section: B Chars74k (mentioning)
confidence: 99%
“…This is the reason why a number of research articles on character recognition of Indian scripts are growing each year. researchers have used techniques like Tesseract OCR and google multilingual OCR [113], Convolutional Neural Network (CNN) [70], [114], Deep Belief Network with the distributed average of gradients feature [188], Modified Neural Network with the aid of elephant herding optimization [189], VGG (Visual Geometry Group) [117] and SVM classifier with the polynomial and linear kernel [80] VIII. RESEARCH TRENDS Characters written by different individuals create large intraclass variability, which makes it difficult for classifiers to perform robustly.…”
Given the ubiquity of handwritten documents in human transactions, Optical Character Recognition (OCR) of documents has invaluable practical worth. OCR is a science that enables the translation of various types of documents or images into analyzable, editable and searchable data. During the last decade, researchers have used artificial intelligence and machine learning tools to automatically analyze handwritten and printed documents in order to convert them into electronic format. The objective of this review paper is to summarize research that has been conducted on character recognition of handwritten documents and to provide research directions. In this Systematic Literature Review (SLR) we collected, synthesized and analyzed research articles on the topic of handwritten OCR (and closely related topics) that were published between 2000 and 2019. We searched widely used electronic databases following a pre-defined review protocol. Articles were retrieved using keywords, forward reference searching and backward reference searching in order to find all articles related to the topic. After carefully following the study selection process, 176 articles were selected for this SLR. This review article presents state-of-the-art results and techniques on OCR and also provides research directions by highlighting research gaps.
“…VGG 16 CNN was used in this study which consisted of 13 convolution layers with pooling layers in between them, then Loss 3 classifier layer and output layer. The experiments were performed on a dataset containing 15,600 images and achieved 94.52% accuracy [11]. Kowsalya and Periasamy proposed a neural network model for handwritten Tamil character recognition with a modification using elephant herding optimization algorithm.…”
Section: Handwriting Recognition Studies In Tamil Language (mentioning)
Offline handwritten character recognition is a popular and challenging area of research within pattern recognition and image processing. In this article, offline handwriting recognition methods for south Indian languages, including Telugu, Tamil, Kannada and Malayalam, are presented. A description of south Indian languages and an overview of general handwriting recognition systems are also presented briefly. Convolutional Neural Networks (CNNs) and classifier combination methods have provided better performance than the other approaches proposed by researchers.
“…On the other hand, the feature extraction scales down the original document and distinguishes the characters. The resultant features are provided to the classification phase to recognize the Tamil characters from overlapping Tamil characters (Pragathi et al 2019). In addition, this paper utilizes DCELM-NM to extract and classify the features for optimal recognition of Tamil characters.…”
At present, recognizing Tamil characters is considered one of the most challenging tasks, since handwritten characters exhibit discontinuities, slanting, large variations and free-style writing. In such cases the error rate increases, and most errors arise from confusion between characters with similar shapes. In addition, the processing time increases. To overcome these shortcomings, a Tamil character recognition approach is proposed comprising four principal stages: pre-processing, segmentation, feature extraction and classification. In the pre-processing phase, the input images are processed using thresholding binarization, an adaptive filter for noise elimination, and cropping. Secondly, segmentation is employed to verify an object and its various boundaries such as lines, curves and bends. For optimal segmentation, this paper utilizes the Tsallis entropy-based atom search (TEAS) optimization algorithm. The segmented regions are then passed to feature extraction, and finally, in the classification phase, the Tamil characters are recognized. This paper utilizes a deep convolution extreme learning-based Newton Metaheuristic (DCELM-NM) approach for both feature extraction and classification. The performance of the proposed approach is evaluated using various simulation measures. In addition, comparative analyses are carried out, and the results reveal that the proposed approach provides superior performance compared with existing approaches.
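The pre-processing stage described in this abstract (thresholding binarization followed by cropping) can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed threshold of 128, the list-of-lists image representation, and the function names `binarize` and `crop_to_ink` are all assumptions made for illustration, and the adaptive noise filter mentioned in the abstract is omitted because its details are not given.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, pixel values 0-255 (assumed representation)


def binarize(image: Image, threshold: int = 128) -> Image:
    """Map dark pixels (ink) to 1 and light pixels (paper) to 0.

    The fixed threshold is an assumption; the abstract does not state
    how the threshold is chosen.
    """
    return [[1 if px < threshold else 0 for px in row] for row in image]


def crop_to_ink(binary: Image) -> Image:
    """Crop a binary image to the bounding box of its ink pixels."""
    if not binary:
        return []
    # Indices of rows and columns that contain at least one ink pixel.
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    if not rows or not cols:
        return []  # blank image: nothing to crop to
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in binary[top:bottom + 1]]


# Hypothetical 4x4 grayscale patch with a small dark stroke in the middle.
page = [
    [255, 255, 255, 255],
    [255, 10, 200, 255],
    [255, 20, 30, 255],
    [255, 255, 255, 255],
]
character = crop_to_ink(binarize(page))
```

In a real pipeline the threshold would more plausibly be chosen per image, and a noise filter would run before cropping so that isolated specks do not enlarge the bounding box.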