In this work, we have proposed new feature vectors for spoken language identification (LID) system. The Mel frequency cepstral coefficients (MFCC) and formant frequencies derived using short-time window speech signal. Formant frequencies are extracted from linear prediction (LP) analysis of speech signal. Using these two kind of features of speech signal, new feature vectors are derived using cluster based computation. A GMM based classifier has been designed using these new feature vectors. The language specific apriori knowledge is applied on the recognition output. The experiments are carried out on OGI database and LID recognition performance is improved.
Dialect Identification is the process of identifies the dialects of particular standard language. The Telugu Language is one of the historical and important languages. Like any other language Telugu also contains mainly three dialects Telangana, Costa Andhra and Rayalaseema. The research work in dialect identification is very less compare to Language identification because of dearth of database. In any dialects identification system, the database and feature engineering play vital roles because of most the words are similar in pronunciation and also most of the researchers apply statistical approaches like Hidden Markov Model (HMM), Gaussian Mixture Model (GMM), etc. to work on speech processing applications. But in today's world, neural networks play a vital role in all application domains and produce good results. One of the types of the neural networks is Deep Neural Networks (DNN) and it is used to achieve the state of the art performance in several fields such as speech recognition, speaker identification. In this, the Deep Neural Network (DNN) based model Multilayer Perceptron is used to identify the regional dialects of the Telugu Language using enhanced Mel Frequency Cepstral Coefficients (MFCC) features. To do this, created a database of the Telugu dialects with the duration of 5h and 45m collected from different speakers in different environments. The results produced by DNN model compared with HMM and GMM model and it is observed that the DNN model provides good performance.
Telugu language is one of the historical languages and belongs to the Dravidian family. It contains three dialects named Telangana, Costa Andhra, and Rayalaseema. This paper identifies the dialects of the Telugu language. MFCC, Delta MFCC, and Delta-Delta MFCC are applied with 39 feature vectors for the dialect identification. In addition, ZCR is also applied to identify the dialects. At last combined all the MFCC and ZCR features. A standard database is created to identify the dialects of the Telugu language. Different statistical methods like HMM and GMM are applied for the classification purpose. To improve the accuracy of the model, dimensionality reduction technique PCA is applied to reduce the number of features extracted from the speech signal and applied to models. In this work, with the application of dimensionality reduction, there is an increase in the accuracy of models observed.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.