Parasitic sorority of speech processing algorithms with an assortment of statistical toolkits

Sudhakaran, Prathibha; Yadav, Ashwani Kumar; Karamchandani, Sunil

doi:10.1088/1742-6596/1998/1/012024

Cited by 1 publication

(2 citation statements)

References 15 publications


“…Although various tools including HTK, Sphinx, and Kaldi have been selected and designed for building HMM speech-based audio data processing for particular ASR isolated word-level recognition [33]. HTK, the most popular toolkit for building Hidden Markov models, was created especially for the implementation of speech-based isolated word recognition [31]. Therefore, HTK toolkits were selected for the investigation of Afan Oromo isolated speech-based recognition computer commands.…”

Section: The HTK Software Toolkit

mentioning

confidence: 99%

“…A significant change in overall accuracy in speech models is observed because of advancements in open source toolkits HTK, CMU-Sphinx, and Kaldi and their fast-processing speed-based ASR speech recognition. The performance of a speech system is difficult because it is dependent on variations in speakers, their pronunciations, the rate at which they speak, and the dialects of the regions they belong to [31]. ASR speech-based computer command recognizer accuracy varies with ambiguity and vocabulary size; hence, hybrid HMM works best for large vocabulary and HMM works best for small vocabulary [32].…”

mentioning

confidence: 99%


Afan Oromo Speech-Based Computer Command and Control: An Evaluation with Selected Commands

Teshite, Mamo, Calpotura

2023

Advances in Human-Computer Interaction


Speech-based computer command and control utilizes natural speech to enable computers to understand human language and execute tasks through commands. However, there has been no study or development of a speech-based command and control system for Microsoft Word in Afan Oromo. The primary aim of this research is to investigate and develop a speech-based command and control system for Afan Oromo using a selected set of command-and-control words from MS Word. To accomplish this objective, a speech recognizer was developed using the HTK toolkit, employing a small vocabulary, isolated words, speaker independence, and HMM-based techniques. The selected MS command words were translated from English to Afan Oromo in order to develop this automatic speech-based computer command system. Audio recordings were obtained from 38 speakers (16 females and 22 males) aged between 18 and 40 years, based on their availability. Word-level speech recognition was performed using MFCC and data processing, which are widely used and effective approaches in speech recognition. Out of a total of 64 MS command words, 54 words (84.37%) were used for training and 10 words (15.63%) were used for testing. Live and nonlive evaluation techniques were employed to assess the performance of the recognizer. The live recognizer, which considers variations in the environment, outperformed the nonlive recognizer due to the influence of neighboring phones. The performance results for the monophone tied state, triphone, and triphone-based recognizers were 78.12%, 86.87%, and 88.99%, respectively. Thus, the triphone-based recognizer exhibited the best performance among the nonlive recognizers. Because of limited resources, this study was restricted to investigating speech-based commands for computers using only selected MS commands, which play a crucial role in text processing. In order to evaluate a speech-based interface in a real environment, there were no components available for object-as-a-service. The experimental findings of this study demonstrated that if an adequate amount of language resources was available, a computer-based Afan Oromo speech-based interface for command-and-control purposes could be developed.
