Segmenting the script in a document image is the most decisive step in transliterating that script image into another script, for example automatically transliterating an image of a printed Javanese manuscript into Latin script. This paper applies a modified projection profile method to segment the Javanese script in all 87 page images of the HamongTani book. Based on the output of the developed system, the average percentage of correctness is 84.255%, with an average standard deviation of 14.093%. This average indicates that the model developed for segmenting the Javanese script document images of the HamongTani book performs relatively well.
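To make the projection profile idea concrete, here is a minimal sketch of the plain, unmodified technique; the paper's specific modification is not described in the abstract and is not reproduced here. The function names, the `min_ink` parameter, and the toy 6x6 page are assumptions introduced for illustration. The image is assumed to be already binarized, with 1 for ink and 0 for background.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def horizontal_profile(image: Image) -> List[int]:
    """Count ink pixels in every row."""
    return [sum(row) for row in image]


def vertical_profile(image: Image) -> List[int]:
    """Count ink pixels in every column."""
    return [sum(col) for col in zip(*image)]


def segment_runs(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of consecutive profile
    entries that contain at least `min_ink` ink pixels."""
    runs = []
    start = None
    for i, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = i                      # a run of ink begins
        elif count < min_ink and start is not None:
            runs.append((start, i))        # a gap closes the run
            start = None
    if start is not None:                  # run reaches the image edge
        runs.append((start, len(profile)))
    return runs


# Toy page (assumption): two text lines separated by a blank row.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]

# Rows with ink form text lines.
lines = segment_runs(horizontal_profile(page))

# Within the first line, columns with ink form character candidates.
top, bottom = lines[0]
characters = segment_runs(vertical_profile(page[top:bottom]))

print(lines)
print(characters)
```

A plain profile like this only separates glyphs that have an empty row or column between them, so touching or overlapping glyphs need extra handling beyond this sketch.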
Manuscript preprocessing is the earliest stage in the transliteration of manuscripts written in Javanese script. It aims to produce images of the individual letters that make up a manuscript, which are then processed further by the manuscript transliteration system. Preprocessing has four main steps: manuscript binarization, noise reduction, line segmentation, and character segmentation of every line image produced by line segmentation. A test on parts of the PB.A57 manuscript containing 291 character images concluded, at a 95% level of confidence, that the success rate of preprocessing in producing Javanese character images ranged from 85.9% to 94.82%.
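The first two preprocessing steps can be sketched as follows. The abstract does not say which binarization or noise reduction methods were used, so the fixed global threshold and the isolated-pixel filter below are assumptions chosen for simplicity, as are the function names and the toy grayscale image.

```python
from typing import List

Gray = List[List[int]]    # grayscale image, values 0..255
Binary = List[List[int]]  # binary image: 1 = ink, 0 = background


def binarize(gray: Gray, threshold: int = 128) -> Binary:
    """Global thresholding: pixels darker than `threshold` become ink."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_isolated_pixels(binary: Binary) -> Binary:
    """Clear every ink pixel that has no ink among its 8 neighbours."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    cleaned = [row[:] for row in binary]   # copy, so tests read the input
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1:
                continue
            has_ink_neighbour = any(
                binary[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_ink_neighbour:
                cleaned[y][x] = 0
    return cleaned


# Toy grayscale image (assumption): a small stroke plus one stray speck.
gray = [
    [255, 255, 255, 255, 255],
    [255,  20,  30, 255, 255],
    [255,  25, 255, 255,  10],
    [255, 255, 255, 255, 255],
]

binary = binarize(gray)
cleaned = remove_isolated_pixels(binary)
for row in cleaned:
    print(row)
```

The cleaned binary image is the kind of input that line segmentation and character segmentation would then operate on.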
Many Javanese manuscripts in Indonesia are stored in museums and libraries. Most of these manuscripts were written in local scripts that are rarely used in everyday life, so a software application that helps people read them is valuable. An essential step in automatic manuscript image transliteration is post-processing, which involves editing syllables and concatenating them into words. The main problem in post-processing is that the script has no symbol for the space between words in a sentence, which is known as the scriptio continua problem. This paper proposes methods based on the backtracking algorithm to solve the scriptio continua problem in the post-processing step of Javanese manuscript image transliteration. The proposed methods use a depth-first search for relevant candidate words to decide whether or not to merge a new syllable. The methods were used to concatenate 17,687 syllables from the Hamong Tani book using a dictionary of 49,801 words, and the results were found to be satisfactory in terms of both computation and accuracy. The implemented greedy and brute-force methods both achieve an accuracy of 81.64%. However, the greedy method is more efficient than the brute-force method.
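The following is a minimal sketch of dictionary-based word segmentation by depth-first search with backtracking. It is not the paper's implementation: the longest-candidate-first ordering, the `max_len` limit, and the function name are assumptions, and the syllables and dictionary entries are arbitrary placeholder strings rather than vocabulary from the Hamong Tani book.

```python
from typing import List, Optional, Set


def segment(syllables: List[str], dictionary: Set[str],
            max_len: int = 4) -> Optional[List[str]]:
    """Split a syllable sequence into dictionary words.

    Tries the longest candidate word first (the greedy choice) and
    backtracks to a shorter candidate when the remainder of the
    sequence cannot be segmented. Returns None if no split exists.
    """

    def dfs(pos: int) -> Optional[List[str]]:
        if pos == len(syllables):
            return []                       # everything consumed
        longest = min(max_len, len(syllables) - pos)
        for n in range(longest, 0, -1):     # longest candidate first
            word = "".join(syllables[pos:pos + n])
            if word in dictionary:
                rest = dfs(pos + n)
                if rest is not None:
                    return [word] + rest
                # otherwise fall through: backtrack, try a shorter word
        return None

    return dfs(0)


# Placeholder data (assumption): "katama" matches first but leaves "na"
# unmatched, so the search backtracks to "kata" + "mana".
syllables = ["ka", "ta", "ma", "na"]
dictionary = {"kata", "katama", "mana"}

print(segment(syllables, dictionary))
print(segment(["ka", "xx"], dictionary))
```

Without memoization, this search can revisit the same position many times on long inputs. Caching the result of `dfs` per position is a common way to bound that cost.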