TEDxSK and JumpSK: A New Slovak Speech Recognition Dedicated Corpus

Staš, Ján; Hládek, Daniel; Viszlay, Peter; Koctur, Tomas

doi:10.1515/jazcas-2017-0044

Cited by 3 publications

(1 citation statement)

References 20 publications


“…Today, the internet has various resource types, for example, social media, blogs, twitter, and new portals, which offer a lot of speech data and which can be freely downloaded. Moreover, it has been proved that the corpora created on internet resources yielded promising results [8] [9]. Therefore, speech data was collected first from the web news.…”

Section: Collecting the Data From The Web News (mentioning)

confidence: 99%

UCSY-SC1: A Myanmar speech corpus for automatic speech recognition

Mon, Pa, Thu

2019

IJECE


This paper introduces a speech corpus developed for Myanmar Automatic Speech Recognition (ASR) research. Researchers around the world conduct ASR research to improve their language technologies. Speech corpora are essential for developing ASR systems, and creating them is especially necessary for low-resourced languages. Myanmar can be regarded as a low-resourced language because it lacks pre-built resources for speech processing research. In this work, a speech corpus named UCSY-SC1 (University of Computer Studies Yangon - Speech Corpus1) is created for Myanmar ASR research. The corpus covers two domains: news and daily conversations. Its total size is over 42 hrs: 25 hrs of web news and 17 hrs of recorded conversational data. The corpus was collected from 177 females and 84 males for the news data and from 42 females and 4 males for the conversational domain. It was used as training data for developing Myanmar ASR. Three types of acoustic models were built and compared: Gaussian Mixture Model (GMM) - Hidden Markov Model (HMM), Deep Neural Network (DNN), and Convolutional Neural Network (CNN). Experiments were conducted on different data sizes, and evaluation used two test sets: TestSet1 (web news) and TestSet2 (recorded conversational data). The Myanmar ASR systems trained on this corpus gave satisfactory results on both test sets, with word error rates of 15.61% on TestSet1 and 24.43% on TestSet2.

