Document Image Analysis and Recognition (DIAR) techniques are used to recognize text components and translate them into an editable format. A script is a set of graphical representations used to express a particular writing system, as well as subsets belonging to a particular writing system. One language may adopt the writing styles of more than one script family: for example, old Malay (Jawi) uses the Arabic script, while modern Malay uses the Roman script. The seven major scripts used in this research, all in handwritten style, are Arabic, Devanagari, Hebrew, Thai, Greek, Cyrillic and Korean. Automatic Multilingual Script Recognition (AMSR) is one of the main challenges in the DIAR domain. Currently, only a few attempts have been made at automated script identification for off-line handwritten document images. Most available AMSR applications deal only with printed documents and script types, and they neglect handwritten and multilingual documents. The objective of this study is to propose a multilingual AMSR framework. The research methodology is built around this proposed framework, which is tested on the Multilingual-HW datasets, containing more than seven international unconstrained handwritten scripts, using the Grey-Level Co-occurrence Matrix and Local Binary Pattern methods. The average accuracies of the two methods are about 97.01% and 85.29%, respectively. The proposed multilingual AMSR framework is expected to benefit communities that require automatic sorting of multilingual documents. This research can also be extended to document forensics, or used by international relations agencies to identify unknown native documents.
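The abstract names two texture descriptors but does not give their configuration. As a rough illustration only, the following sketch computes a normalized grey-level co-occurrence matrix with one derived statistic (contrast) and a basic 8-neighbour Local Binary Pattern histogram. The function names, the single pixel offset, the choice of contrast, and the `>=` comparison rule are assumptions made for this example, not the settings used in the study.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric co-occurrence matrix for one offset (dx, dy >= 0 assumed)."""
    img = np.asarray(image).astype(np.intp)
    h, w = img.shape
    # Pair every pixel with its neighbour located dy rows down and dx columns right.
    src = img[0:h - dy, 0:w - dx]
    dst = img[dy:h, dx:w]
    m = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(m, (src.ravel(), dst.ravel()), 1)
    m = m + m.T  # count each pair in both directions
    return m / m.sum()

def glcm_contrast(p):
    """Contrast statistic: sum of p(i, j) * (i - j)^2."""
    i, j = np.indices(p.shape)
    return float(np.sum(p * (i - j) ** 2))

def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 3x3 LBP codes over interior pixels."""
    img = np.asarray(image)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Eight neighbours, visited clockwise starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (oy, ox) in enumerate(offsets):
        neighbour = img[1 + oy:h - 1 + oy, 1 + ox:w - 1 + ox]
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```

In a script-identification pipeline of the kind described, feature vectors like these would be computed per document image and passed to a classifier. The abstract does not name the classifier.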
This paper proposes a novel rotation- and scale-invariant approach to texture classification based on Gabor filters. These filters are designed to capture the visual content of images through their impulse responses, which are sensitive to rotation and scaling in the images. The filter responses are rearranged so that the filter with the largest-amplitude response comes first, and patterns are then calculated by binarizing the responses against a threshold. This threshold is the average energy of the Gabor filter responses at a given pixel. The binary patterns are converted to decimal numbers, and the histograms of these numbers are used as texture features. The proposed features are used to classify images from well-known texture datasets: the Brodatz, CUReT and UMD texture albums. Experiments show that the proposed feature extraction method performs well compared with the several other state-of-the-art methods considered in the paper and is more robust to noise.
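The steps described above (filter, reorder by strongest response, binarize against the per-pixel average energy, convert to decimal, build a histogram) can be sketched as follows. This is a minimal single-scale illustration under several assumptions that the abstract does not confirm: the kernel parameters, the use of squared magnitude as "energy", the circular (FFT-based) convolution, and all function names are hypothetical, and the scale-invariance part of the method is not reproduced.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: isotropic Gaussian envelope times a complex sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * xr / wavelength)

def gabor_pattern_histogram(image, n_orient=8, size=9, wavelength=4.0, sigma=2.0):
    """Histogram of per-pixel binary patterns built from oriented Gabor energies.

    Assumes the image is at least `size` pixels along each axis.
    """
    img = np.asarray(image, dtype=np.float64)
    spectrum = np.fft.fft2(img)
    energies = []
    for k in range(n_orient):
        kern = gabor_kernel(size, wavelength, np.pi * k / n_orient, sigma)
        # Circular convolution via the FFT (kernel zero-padded to the image size).
        resp = np.fft.ifft2(spectrum * np.fft.fft2(kern, s=img.shape))
        energies.append(np.abs(resp) ** 2)
    e = np.stack(energies, axis=-1)  # shape (H, W, n_orient)

    # Circularly shift each pixel's responses so the strongest orientation is first.
    start = np.argmax(e, axis=-1)
    idx = (start[..., None] + np.arange(n_orient)) % n_orient
    e = np.take_along_axis(e, idx, axis=-1)

    # Binarize against the average energy at the pixel, then read the bits as a number.
    bits = e >= e.mean(axis=-1, keepdims=True)
    weights = 1 << np.arange(n_orient)
    codes = (bits * weights).sum(axis=-1)

    hist = np.bincount(codes.ravel(), minlength=2 ** n_orient).astype(np.float64)
    return hist / hist.sum()
```

Because the strongest response always meets or exceeds the per-pixel mean, the lowest bit is always set in this sketch, so only odd-valued codes occur. A real implementation would likely exploit this to shorten the feature vector.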