Farah Adeeba scite author profile

2019

IEEE Access

Native language identification (NLI) is the task of identifying the first language of a user based on their speech or written text in a second language. In this paper, we propose the use of spectrogram- and cochleagram-based features extracted from very short speech utterances (0.8 s on average) to infer the native language of an Urdu speaker. Bidirectional long short-term memory (BLSTM) neural networks are adopted for the classification of utterances among the native languages. A set of experiments is carried out for the network architecture search, and the system's accuracy is evaluated on the validation data set. Overall accuracies of 74.81% and 71.61% are achieved using the Mel-frequency cepstral coefficients (MFCC) and Gammatone frequency cepstral coefficients (GFCC), respectively. Moreover, the optimized MFCC feature-based BLSTM network and GFCC feature-based BLSTM network are merged to take advantage of both feature sets. The experiments show that the merged network outperforms the individual BLSTM networks, achieving an accuracy of 75.69% on the evaluation data. The effect of test data duration is also analyzed (from 0.27 s to 1.5 s); it is observed that even with a duration as short as 0.4 s, an accuracy of over 50% can be achieved.
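
The abstract does not say how the MFCC-based and GFCC-based networks are merged. As a rough illustration of why combining two feature streams can help, the sketch below shows a simpler, swapped-in technique: late (score-level) fusion, where the class posteriors of two separately trained classifiers are averaged before the final decision. This is not necessarily the authors' method. The function names, the `weight` parameter, the labels `lang_a` to `lang_c`, and the posterior values are all hypothetical and chosen only for illustration.

```python
from typing import Dict


def fuse_posteriors(p_mfcc: Dict[str, float],
                    p_gfcc: Dict[str, float],
                    weight: float = 0.5) -> Dict[str, float]:
    """Weighted average of two posterior distributions over the same labels.

    `weight` is the share given to the first (MFCC) model; the remainder
    goes to the second (GFCC) model. Both names are illustrative only.
    """
    if p_mfcc.keys() != p_gfcc.keys():
        raise ValueError("both models must score the same label set")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    fused = {
        label: weight * p_mfcc[label] + (1.0 - weight) * p_gfcc[label]
        for label in p_mfcc
    }
    # Renormalize so the result is a proper distribution even if the
    # inputs were only approximately normalized.
    total = sum(fused.values())
    return {label: score / total for label, score in fused.items()}


def predict(posteriors: Dict[str, float]) -> str:
    """Return the label with the highest posterior."""
    return max(posteriors, key=posteriors.get)


# Hypothetical posteriors for one utterance (made-up numbers, not results
# from the paper). The two models disagree on the top label.
p_mfcc = {"lang_a": 0.5, "lang_b": 0.4, "lang_c": 0.1}
p_gfcc = {"lang_a": 0.2, "lang_b": 0.6, "lang_c": 0.2}

fused = fuse_posteriors(p_mfcc, p_gfcc)
decision = predict(fused)
```

In this toy case the MFCC model alone would pick `lang_a`, while the fused scores favor `lang_b`, because the GFCC model is more confident about it. A merged network as described in the abstract would typically combine the two branches inside the model and train them jointly, which this sketch does not attempt.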

A Multi-Genre Urdu Broadcast Speech Recognition System

Khan

Rauf

Adeeba

et al. 2021

Comparison of Urdu text to speech synthesis using unit selection and HMM based techniques

Habib

et al. 2016

Acoustic Feature Analysis and Discriminative Modeling for Language Identification of Closely Related South-Asian Languages

Circuits Syst Signal Process

2017

Enhancing Large Vocabulary Continuous Speech Recognition System for Urdu-English Conversational Code-Switched Speech

Farooq

et al. 2020