Spoken language is one of the distinctive characteristics of the human race. Spoken language processing is a branch of computer science that plays an important role in human-computer interaction (HCI), which has advanced remarkably over the last two decades. This paper reviews and summarizes the acoustic, phonetic, and prosodic features that have been used for spoken language identification, specifically for Indian languages. In addition, we review the speech databases that are already available for Indian languages and can be used for spoken language identification.
This paper presents a novel duration-normalized feature selection technique and a two-step modified hierarchical classifier to improve the accuracy of spoken language identification (SLID) for Indian languages under duration-mismatched conditions. The feature selection averages random-forest-based importance vectors of openSMILE features computed from utterances of different durations. Although it improves the SLID system's accuracy for mismatched training and testing durations, performance is still significantly reduced for short-duration utterances. The classifier is a cascade of inter-family and intra-family classifiers with an additional class intended to reduce the effect of false language-family estimation. An All India Radio data set with nine Indian languages and different utterance durations was used as speech material. Experimental results showed that 150 optimal features with the proposed modified hierarchical classifier gave the highest accuracy, 96.9% and 84.4% for 30 s and 0.2 s utterances, respectively, when training and test durations were the same. When trained on 30 s utterances, however, we achieved accuracies of 98.3% and 61.9% for 15 s and 0.2 s test durations, respectively. Comparative analysis showed a significant improvement in accuracy over several SLID systems in the literature.
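The feature selection step described in this abstract, averaging importance vectors obtained at different utterance durations and keeping the top-ranked features, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy importance values, and the choice of a plain arithmetic mean are assumptions made here for illustration. In practice, each importance vector would come from a random forest trained on openSMILE features at one duration, and the abstract reports 150 selected features rather than the 2 used below.

```python
from typing import Dict, List, Sequence


def average_importances(
    importances_by_duration: Dict[float, Sequence[float]]
) -> List[float]:
    """Average per-duration importance vectors element-wise.

    Keys are utterance durations in seconds; values are importance
    vectors (one score per feature), assumed to share the same length.
    """
    vectors = list(importances_by_duration.values())
    if not vectors:
        raise ValueError("need at least one importance vector")
    n_features = len(vectors[0])
    if any(len(v) != n_features for v in vectors):
        raise ValueError("all importance vectors must have the same length")
    return [
        sum(v[i] for v in vectors) / len(vectors) for i in range(n_features)
    ]


def select_top_k(mean_importance: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring features, best first."""
    order = sorted(
        range(len(mean_importance)),
        key=lambda i: mean_importance[i],
        reverse=True,
    )
    return order[:k]


# Hypothetical importance vectors for four features at three durations.
# These numbers are invented for illustration, not taken from the paper.
toy_importances = {
    30.0: [0.50, 0.30, 0.15, 0.05],
    3.0: [0.20, 0.40, 0.30, 0.10],
    0.2: [0.10, 0.20, 0.60, 0.10],
}

mean_importance = average_importances(toy_importances)
selected = select_top_k(mean_importance, k=2)
print(selected)
```

Averaging across durations favors features that stay informative at every utterance length, rather than features that only rank highly for long utterances, which is the motivation the abstract gives for normalizing over duration.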