Speech to text for Indonesian homophone phrase with Mel Frequency Cepstral Coefficient

Bustamin, Anugrayani; Indrabayu; Areni, Intan Sari; Mokobombang, Novy Nra

doi:10.1109/cyberneticscom.2016.7892562

Cited by 4 publications

(3 citation statements)

References 2 publications


“…• Contextual difference: homophone words such as "Son" and "Sun", "write" and "Right" are almost the same in English (Indian) pronunciation but are different from their meaning [17,58]. • Style variability: fluency of speaking style affects the information available in both time domain and frequency domain of speech signal [57,67].…”

Section: Challenges For the Development Of Speech Recognition Model (mentioning)

confidence: 99%

An automatic machine translation system for multi-lingual speech to Indian sign language

Dhanjal

Singh

2021

Multimed Tools Appl

Sign language (SL) is the best-suited communication medium for hearing impaired people. Even with the advancement of technology, there is a communication gap between hearing impaired and hearing people. The aim of this research work is to bridge this gap by developing an automatic system that translates speech to Indian Sign Language using an avatar (SISLA). The whole system works in three phases: (i) the first phase performs speech recognition (SR) of isolated words for English, Hindi and Punjabi in a speaker-independent environment; (ii) the second phase translates the source language into Indian Sign Language (ISL); (iii) a HamNoSys-based 3D avatar represents the ISL gestures. The four major implementation modules for SISLA are requirement analysis, data collection, technical development and evaluation. The multi-lingual feature makes the system more efficient. Training and testing speech sample files for English (12,660, 4218), Hindi (12,610, 4211) and Punjabi (12,600, 4193) have been used to train and test the SR models. Empirical results of automatic machine translation show that the proposed trained models have achieved a minimum accuracy of 91%, 89% and 89% for English, Punjabi and Hindi respectively. Sign language experts have also evaluated the sign error rate through feedback. Future directions to enhance the proposed system using non-manual SL features along with sentence-level translation have been suggested. Usability testing based on survey results confirms that the proposed SISLA system is suitable for education as well as communication purposes for hearing impaired people.

“…Homonymous and homophone-ambiguous sentences are not a big deal when processed and used to communicate with other human beings, but sometimes it takes a little longer to understand homonymous and Homophone ambiguous sentences (Dalrymple-Alford, 1984;Eviatar, et al, 2023;Bustamin, et al, 2016). When a computer processes ambiguous sentences, there will inevitably be errors in meaning because the computer cannot understand the words that form an ambiguous sentence, Homonym and Homophone.…”

Section: Introduction (mentioning)

confidence: 99%

Utilization of Fuzzy Ontology for the Meaning of Homonymous and Homophones Ambiguous Sentences

Darma Putra,

Aswi Ramadhani

2023

QAJ

Ambiguous sentences containing homonyms and homophones are a major problem when processed by computers. The novelty of this work is a system that is able to recognize such ambiguous sentences. The system first tests the proximity of the entered ambiguous sentence to the data set; from this step, the meaning of the sentence can already be recognized, and the result is a percentage similarity level. The results are then processed with the fuzzy ontology method, which yields a low, moderate, or high similarity level. The research was analyzed with a confusion matrix: the precision obtained was 92%, recall was 100%, and accuracy was 96%. In the future, this research can be used to refine translation results in a translation system.

“…Based on that research, there is an increasing of elderly voice recognition process for up to 12% (Kwon et al, 2015). Speech recognition research for Indonesian language has been performed by Bustamin et al (2016;Areni et al, 2017b) using the Mel Frequency Cepstral Coefficient (MFCC) method for feature extraction on the word homophone. Cavus (2016) has done a research regarding to intelligent mobile application for learning English (pronunciation) by using voice recognition.…”

Section: Introduction (mentioning)

confidence: 99%

Speech to Text in Indonesian Personal Assistant

Areni

Mufidah

Indrabayu et al.

2019

Journal of Computer Science

Self Cite

Short Message Service (SMS) is one of the most frequently used features on smartphones. Sending SMS messages while driving can break the driver's concentration and may lead to an accident. Hence, a speech recognition system for SMS messaging is required. In this research, the speech recognition system converts speech, given as an Indonesian query, into messages, enters contacts without searching the phone list, and is equipped with a push button for sending a message using the Google Speech Recognition Application Programming Interface (API). The system was created in the Java programming language with the Android Studio editor. The input data consist of training and testing data. The training data are 20 voice samples for STT messages, consisting of 10 different male voices and 10 different female voices, for 7 similar words. The testing data are 10 voice samples, consisting of 5 different voice samples each for male and female speakers. System performance is measured by the Result Training Data (RTD), the Result Random Data (RRD) and the Grade Success System (GSS). The results for sending a message show that RTD, RRD and GSS reach 100%, 96.74% and 98.37%, respectively. For adding a contact, the system obtained 100% for all parameters.
