Yooi: An Indonesian Short Message Dictation

Suyanto, Suyanto; Adityatama, Jeffry

doi:10.4156/ijiip.vol3.issue4.7

IJIIP

2012


Cited by 2 publications

References 1 publication


Syllable-Based Indonesian Automatic Speech Recognition

Galatang¹,

Suyanto²

2020

ijeei


Syllable-based automatic speech recognition (ASR) systems commonly perform better than phoneme-based ones. This paper focuses on developing an Indonesian monosyllable-based ASR (MSASR) system using an ASR engine called SPRAAK and comparing it to a phoneme-based one. The Mozilla DeepSpeech-based end-to-end ASR (MDS-E2EASR), a state-of-the-art character-based model (similar to a phoneme-based model), is also investigated to confirm the result. In addition, a novel Kaituoxu SpeechTransformer (KST) E2EASR is examined. Testing on the Indonesian speech corpus of 5,439 words shows that the proposed MSASR produces much higher word accuracy (76.57%) than the monophone-based one (63.36%). Its performance is comparable to the character-based MDS-E2EASR, which produces 76.90%, and the character-based KST-E2EASR (78.00%). In the future, this monosyllable-based ASR could be extended to a bisyllable-based one to give higher word accuracy. However, the large number of bisyllable acoustic models would have to be handled using an advanced method.
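The abstract reports word accuracy without stating how it is computed. A minimal sketch follows, assuming the common definition: one minus the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The function names and the sample Indonesian sentences are illustrative only and are not taken from the paper.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, 1):
        curr = [i]  # distance from reference prefix to empty hypothesis
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Assumed definition: 1 - (S + D + I) / N, with N reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical transcript pair: one word of four is deleted.
    print(word_accuracy("saya pergi ke pasar", "saya pergi pasar"))
```

Under this definition, many insertions can push the value below zero, which is one reason some reports quote word error rate instead.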


Automatic Segmentation of Indonesian Speech into Syllables using Fuzzy Smoothed Energy Contour with Local Normalization, Splitting, and Assimilation

Suyanto¹,

Putra²

2014

J. ICT.Res.Appl.


Abstract. This paper discusses the usage of the short-term energy contour of speech smoothed by a fuzzy-based method to automatically segment it into syllabic units. Two new additional procedures, local normalization and postprocessing, are proposed to adapt to the Indonesian language. Testing on 220 Indonesian utterances showed that the local normalization significantly improved the performance of the fuzzy-based smoothing. In the postprocessing procedure, splitting and assimilation work in different ways. The splitting of missed short syllables sharply reduced deletion, but slightly increased insertion. On the other hand, the assimilation of a single consonant segment into an expected previous or next segment slightly reduced insertion, but increased deletion. The use of splitting gave a higher accuracy than the assimilation and combined splitting-assimilation procedures, since in many cases the assimilation keeps the unexpected insertions and overmerges the expected segments.

Keywords: assimilation; fuzzy-based smoothing; Indonesian language; local normalization; short-term energy contour; splitting; syllable segmentation.

Introduction

Information on syllabic units can be used to improve the performance of flat start-based automatic speech recognition (ASR) [1]-[11]. In 2010, Janakiraman et al. [11] reported that incorporating information on syllable boundaries into English ASR reduced both computational complexity and word error rate (WER) significantly compared to flat start ASR. The WER can be reduced from 13% to 4.4% and from 36% to 21.2% for the TIMIT and NTIMIT databases, respectively.

Every language has unique characteristics. For example, English and Indonesian have different syllable patterns. A study of telephone conversations and the Switchboard corpus has shown that English has 80% monosyllabic words and 85% of them are simple structures (V, VC, CV, CVC)
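The general idea of segmenting on a smoothed short-term energy contour can be sketched as follows. This is only an illustration under stated assumptions: a plain moving average stands in for the paper's fuzzy-based smoothing, and the local normalization, splitting, and assimilation steps are not implemented. All function names, the frame length, and the hop size are hypothetical.

```python
def short_term_energy(samples, frame_len, hop):
    """Sum of squared samples in each analysis frame."""
    return [sum(x * x for x in samples[s:s + frame_len])
            for s in range(0, len(samples) - frame_len + 1, hop)]


def moving_average(values, width):
    """Simple smoother (NOT the fuzzy method); window shrinks at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out


def local_minima(values):
    """Frame indices where the contour dips: candidate syllable boundaries."""
    return [i for i in range(1, len(values) - 1)
            if values[i] < values[i - 1] and values[i] <= values[i + 1]]


if __name__ == "__main__":
    # Synthetic signal: two loud bursts separated by silence.
    signal = [1.0] * 100 + [0.0] * 100 + [1.0] * 100
    energy = short_term_energy(signal, frame_len=20, hop=20)
    contour = moving_average(energy, width=3)
    print(local_minima(contour))  # frame indices of candidate boundaries
```

On real speech the contour is far noisier than in this toy signal, which is the kind of problem the smoothing and postprocessing steps described in the abstract are meant to address.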
