Optical character recognition system for Baybayin scripts using support vector machine

Pino, Rodney; Mendoza, Renier; Sambayan, Rachelle

doi:10.7717/peerj-cs.360

Cited by 10 publications

(17 citation statements)

References 36 publications


“…To the best of our knowledge, the proposed system is the first of its kind for recognizing Baybayin scripts at word level. The system relies heavily on previous work on Baybayin character recognition Pino, Mendoza & Sambayan (2021). The method is tested on a novel dataset found in Pino (2021a), where it contains 1000 Baybayin word images and yielded a competitive recognition accuracy of 97.9%.…”

Section: Discussion (mentioning)

confidence: 99%

“…The last assumption is to guarantee that the characters in the Baybayin word will be correctly extracted. The classification process in the proposed algorithm relies on the two SVM classifiers generated in Pino, Mendoza & Sambayan (2021), namely, Baybayin characters classifier and the Baybayin diacritic classifier. SVM is one of the well-known classification algorithms in supervised machine learning.…”

Section: Proposed System (mentioning)

confidence: 99%

“…In Pino, Mendoza & Sambayan (2021), a Baybayin character recognition system has been proposed using SVM, which is a classification algorithm with extensive applications in data categorization (Bishop, 2006). SVM has attracted researchers because of its robustness and high recognition accuracy (Thomé, 2012).…”

Section: Introduction (mentioning)

confidence: 99%

“…Applications of SVM can be found in various fields of science and engineering (Thomé, 2012; Sapankevych & Sankar, 2009; Nayak, Naik & Behera, 2015; Yang, 2004; Rivero, Lemence & Kato, 2017; Rivero & Kato, 2018; Do & Le, 2019; Le et al., 2019; Le, 2019; Byun & Lee, 2003). The OCR system proposed by Pino, Mendoza & Sambayan (2021) consists of four SVM classification models, all of which have recognition rates above 96% accuracy.…”

Section: Introduction (mentioning)

confidence: 99%

“…The Baybayin word recognition algorithm proposed in this study relies heavily on the OCR system proposed in Pino, Mendoza & Sambayan (2021). For brevity, we will refer to this method as the SVM-OCR system.…”

Section: Introduction (mentioning)

confidence: 99%


A Baybayin word recognition system

Pino; Mendoza; Sambayan

2021, PeerJ Computer Science

Self Cite


Baybayin is a pre-Hispanic Philippine writing system used on the island of Luzon. As part of the effort to reintroduce the script, in 2018 the Committee on Basic Education and Culture of the Philippine Congress approved House Bill 1022, or the “National Writing System Act,” which declares the Baybayin script as the Philippines’ national writing system. Since then, Baybayin OCR has become a field of research interest. Numerous works have proposed different techniques for recognizing Baybayin scripts. However, all those studies focused on classification and recognition at the character level. In this work, we propose an algorithm that provides the Latin transliteration of a Baybayin word in an image. The proposed system relies on a Baybayin character classifier generated using the Support Vector Machine (SVM). The method involves isolating each Baybayin character, classifying each character according to its equivalent syllable in Latin script, and finally concatenating the results to form the transliterated word. The system was tested using a novel dataset of Baybayin word images and achieved a competitive 97.9% recognition accuracy. Based on our review of the literature, this is the first work that recognizes Baybayin scripts at the word level. The proposed system can be used in automated transliterations of Baybayin texts transcribed in old books, tattoos, signage, graphic designs, and documents, among others.
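The three steps named in the abstract (isolate each character, classify it, concatenate the results) can be illustrated with a minimal sketch. This is not the authors' implementation: the column-gap segmentation rule, the function names `split_columns` and `transliterate`, and the toy classifier are all assumptions made for illustration. The toy classifier stands in for the trained SVM classifiers, and the separate handling of diacritics mentioned in the citation statements is omitted.

```python
from typing import Callable, List

Image = List[List[int]]  # binary image as rows of pixels: 1 = ink, 0 = background


def split_columns(image: Image) -> List[Image]:
    """Isolate glyphs by cutting the word image at columns that contain no ink.

    Assumed segmentation rule for this sketch only.
    """
    if not image:
        return []
    width = len(image[0])
    ink = [any(row[x] for row in image) for x in range(width)]
    glyphs: List[Image] = []
    start = None
    # The trailing False is a sentinel that closes a glyph touching the right edge.
    for x, has_ink in enumerate(ink + [False]):
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            glyphs.append([row[start:x] for row in image])
            start = None
    return glyphs


def transliterate(image: Image, classify: Callable[[Image], str]) -> str:
    """Classify each isolated glyph and concatenate the predicted syllables."""
    return "".join(classify(glyph) for glyph in split_columns(image))


def toy_classify(glyph: Image) -> str:
    """Hypothetical stand-in for a trained classifier: labels a glyph by its width."""
    return "ba" if len(glyph[0]) == 2 else "ya"


word_image = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
]
print(transliterate(word_image, toy_classify))  # prints: baya
```

Passing the classifier as a callable keeps the segmentation and concatenation steps independent of the model, so a real trained classifier could be substituted for `toy_classify` without changing the rest of the sketch.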

