Bottleneck Feature-Based Hybrid Deep Autoencoder Approach for Indian Language Identification

Das, Himanish Shekhar; Roy, Pinki

doi:10.1007/s13369-020-04430-9

Cited by 15 publications

(13 citation statements)

References 40 publications


“…Over the years, several language-independent acoustic features like Shifted Delta Cepstral coefficients (SDC) [2,3], Mel Frequency Cepstral Coefficients (MFCC) [4,5], Linear Predictive Coefficients (LPC) [3], Perceptual Linear Prediction (PLP) [6] are reported to perform better for same train-test duration utterances. Although probabilistic linear discriminant analysis (PLDA) based i-vector with modified prior estimation technique [7] and exemplar-based technique [8] was reported to improve the performance of SLID system in duration mismatched conditions, it was not significant, especially for short duration utterances [9,10].…”

Section: Review Of Related Work (mentioning)

confidence: 99%

“…GWO selects 156 features from fusion of all 203 features with overall accuracy of 95.96%. Das et al [3] reported a nature-inspired FS algorithm by combining Binary Bat Algorithm (BBA) and Late Acceptance Hill-Climbing (LAHC) algorithm for improving SLID by selecting relevant features from MFCC, LPC, i-vector, x-vector, fusion of MFCC + DWT, and MFCC + GFCC. An optimum feature set of 972 and 1141 selected for IITM and IIT-H data sets reported accuracies 92.35% and 100% with computation time of 158 and 182 min, respectively.…”

Section: Review Of Related Work (mentioning)

confidence: 99%

“…An optimum feature set of 972 and 1141 selected for IITM and IIT-H data sets reported accuracies 92.35% and 100% with computation time of 158 and 182 min, respectively. Guha et al [6] [3,16].…”

Section: Review Of Related Work (mentioning)

confidence: 99%

“…The state-of-the-art SLID systems used Vector Quantization (VQ), Gaussian Mixture Model (GMM) [4,17], Support Vector Machine (SVM) [18][19][20], Hidden Markov Model (HMM) [21], Artificial Neural Network (ANN) [4,21,22], and Random Forest (RF) [3,6]. The modern end-to-end language recognition models based on deep learning (DL) algorithms improves the performance by increasing the data set requirement and do not perform well for small data set.…”

Section: Review Of Related Work (mentioning)

confidence: 99%


Improving Indian Spoken-Language Identification by Feature Selection in Duration Mismatch Framework

Bakshi

Kopparapu

2021

SN COMPUT. SCI.


The paper presents a novel duration-normalized feature selection technique and a two-step modified hierarchical classifier to improve the accuracy of spoken language identification (SLID) for Indian languages under duration-mismatched conditions. Feature selection averages random forest-based importance vectors of openSMILE features computed from utterances of different durations. Although it improves the SLID system's accuracy for mismatched training and testing durations, the performance is significantly reduced for short-duration utterances. The classifier is a cascade of inter-family and intra-family classifiers with an additional class intended to correct false language-family estimates. The All India Radio data set, with nine Indian languages and different utterance durations, was used as speech material. Experimental results showed that 150 optimal features with the proposed modified hierarchical classifier gave the highest accuracy, 96.9% and 84.4% for 30 s and 0.2 s utterances, respectively, for the same train-test duration. With training on 30 s utterances, accuracy was 98.3% and 61.9% for 15 s and 0.2 s test durations, respectively. Comparative analysis showed a significant improvement in accuracy over several SLID systems in the literature.
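The feature selection step described in this abstract (averaging importance vectors obtained at different utterance durations, then keeping the top-ranked features) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names `average_importances` and `select_top_k`, the duration keys, and the toy importance values are all hypothetical, and the importance vectors are assumed to have been computed elsewhere (for example by a random forest trained per duration).

```python
from typing import Dict, List


def average_importances(per_duration: Dict[str, List[float]]) -> List[float]:
    """Average per-feature importance vectors across utterance durations.

    per_duration maps a duration label (hypothetical, e.g. "30s") to one
    importance value per feature; all vectors must have the same length.
    """
    vectors = list(per_duration.values())
    if not vectors:
        raise ValueError("need at least one importance vector")
    n_features = len(vectors[0])
    if any(len(v) != n_features for v in vectors):
        raise ValueError("all importance vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(n_features)]


def select_top_k(importances: List[float], k: int) -> List[int]:
    """Return the indices of the k most important features, in index order."""
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    return sorted(ranked[:k])


# Toy example with made-up importances for four features at two durations.
toy = {"30s": [0.5, 0.25, 0.25, 0.0], "1s": [0.5, 0.0, 0.25, 0.25]}
averaged = average_importances(toy)
print(select_top_k(averaged, 2))  # prints [0, 2]
```

Averaging before ranking means a feature is kept only if it is useful across durations, which is one plausible reading of "duration normalized" here; the abstract reports 150 selected features, whereas the toy example keeps 2.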


“…The performance of the ANN is marginally better than OvA SVM. [26] 97.1; i-vector based DNN [15] 90.8; MFCC-SDC based GMM-UBM [19] 76.35; MFCC-SDC with i-vector [19] 50.45 … 0.2, 0.5, 1, 3, 5, 10, 15 sec segment length utterance duration testing condition. Here 4 folds (80% spoken utterances) of 30 sec segment length data-set were used to train the classifier and remaining 1 fold (20% spoken utterances) of the data-set was used for testing.…”

Section: A Match Condition (mentioning)

confidence: 99%

A GMM supervector approach for spoken Indian language identification for mismatch utterance length

Bakshi

Kopparapu

2021

Bulletin EEI


Gaussian mixture model-universal background model (GMM-UBM) supervectors are used to identify spoken Indian languages. The supervectors are calculated from short-time MFCCs and their first and second derivatives. The UBM builds a generalized Indian language model, and mean adaptation transforms it into a duration-normalized language-specific GMM. Multi-class support vector machine and artificial neural network classifiers are used to identify language labels from the supervectors. Experimental evaluations are performed using 30 sec speech utterances from nine Indian languages, comprising five Indo-Aryan and four Dravidian languages, extracted from the All India Radio broadcast news data-set. Eight smaller duration data-sets were manually derived to study the effect of training and test duration mismatch. In mismatch conditions, identification accuracy decreases with a decrease in test and train utterance duration. Investigations showed that the 32-mixture model with the ANN classifier performed best.
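The mean-adaptation step mentioned in this abstract can be illustrated with the standard relevance-MAP update for a diagonal-covariance GMM, where each UBM mean is pulled toward the data in proportion to how many frames it accounts for, and the adapted means are stacked into one supervector. The abstract does not give the exact formulation used, so this is a sketch under assumptions: the function name `map_adapted_supervector`, the relevance factor of 16, and the toy UBM parameters are illustrative choices, not values from the paper.

```python
import numpy as np


def map_adapted_supervector(frames, weights, means, variances, relevance=16.0):
    """Stack MAP-adapted GMM means into a supervector.

    frames:    (T, D) feature frames of one utterance (e.g. MFCC + deltas)
    weights:   (K,)   UBM mixture weights
    means:     (K, D) UBM component means
    variances: (K, D) UBM diagonal covariances
    relevance: assumed relevance factor controlling adaptation strength
    """
    diff = frames[:, None, :] - means[None, :, :]               # (T, K, D)
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss                       # (T, K)
    log_post -= log_post.max(axis=1, keepdims=True)              # stabilize exp
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                      # responsibilities

    n_k = post.sum(axis=0)                                       # soft counts (K,)
    e_k = (post.T @ frames) / np.maximum(n_k, 1e-10)[:, None]    # data means (K, D)
    alpha = (n_k / (n_k + relevance))[:, None]                   # adaptation weights
    adapted = alpha * e_k + (1.0 - alpha) * means                # adapted means
    return adapted.reshape(-1)                                   # (K * D,)


# Toy UBM with two 2-D components and four identical frames near the first one.
ubm_weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0, 0.0], [10.0, 10.0]])
ubm_vars = np.ones((2, 2))
utterance = np.ones((4, 2))
print(map_adapted_supervector(utterance, ubm_weights, ubm_means, ubm_vars).shape)
```

Because the adaptation weight depends on the soft frame count, components that see few frames stay close to the UBM, so every utterance yields a supervector of the same length regardless of its duration.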


Autoencoder-Based Speech Features for Manipuri Dialect Identification

Devi

Thaoroijam

2022

Lecture Notes in Electrical Engineering
