Enhancement of a text-independent speaker verification system by using feature combination and parallel structure classifiers

Abdalmalak, Kerlos Atia; Gallardo-Antolín, Ascensión

doi:10.1007/s00521-016-2470-x

Cited by 12 publications

(5 citation statements)

References 51 publications


“…Pitch frequency is a perceptual characteristic of the speech signal with physical properties denoted by F0 and is used to improve the performance of speaker identification [23,24]. Many studies use these features for speaker identification not only separately but also in combination [25][26][27][28]. In this paper, six different feature extraction approaches, namely Mel Frequency Cepstral Coefficients (MFCC)+Pitch, Gammatone Cepstral Coefficients (GTCC)+Pitch, MFCC+GTCC+Pitch+eight spectral features, spectrograms, i-vectors, and Alexnet feature vectors were used.…”

Section: Related Work (mentioning)

confidence: 99%

Speaker identification using hybrid subspace, deep learning and machine learning classifiers

KESER, GEZER. 2024. Preprint.

Speaker identification is crucial in many application areas, such as automation, security, and user experience. This study examines the use of traditional classification algorithms and hybrid algorithms, as well as newly developed subspace classifiers, in the field of speaker identification. In the study, six different feature structures were tested for the various classifier algorithms. Stacked Features-Common Vector Approach (SF-CVA) and Hybrid CVA-FLDA (HCF) subspace classifiers are used for the first time in the literature for speaker identification. In addition, CVA is evaluated for the first time for speaker recognition using hybrid deep learning algorithms. This paper also aims to increase accuracy rates with different hybrid algorithms. The study includes Recurrent Neural Network-Long Short-Term Memory (RNN-LSTM), i-vector + PLDA, Time Delayed Neural Network (TDNN), AutoEncoder + Softmax (AE + Softmax), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Common Vector Approach (CVA), SF-CVA, HCF, and Alexnet classifiers for speaker identification. The six different feature extraction approaches consist of Mel Frequency Cepstral Coefficients (MFCC) + Pitch, Gammatone Cepstral Coefficients (GTCC) + Pitch, MFCC + GTCC + Pitch + eight spectral features, spectrograms, i-vectors, and Alexnet feature vectors. For SF-CVA, 100% accuracy was achieved in most tests by combining the training and test feature vectors of the speakers separately. RNN-LSTM, i-vector + KNN, AE + Softmax, TDNN, and i-vector + HCF classifiers gave the highest accuracy rates in the tests performed without combining training and test feature vectors.


“…It enables us to use many datasets that are present in the database during the training phase. There are various kinds of databases available in this field, and a list of the databases that are used in speaker recognition specifically for speaker identification and speaker verification domain [114, 168, 169, 170] are given in Table 11 [58,62,89,90,103,.…”

Section: Databases Used in SI and SV (mentioning)

confidence: 99%

Speaker Recognition through Deep Learning Techniques

Shome, Sarkar, Ghosh, et al. 2023. Period. Polytech. Elec. Eng. Comp. Sci.

Deep learning has become an integral part of today's world, and the field has advanced rapidly. Owing to its extensive use and fast growth, it has captured the attention of researchers in speaker recognition. A detailed investigation of the process is therefore essential and helpful to researchers designing robust applications in speaker recognition, for both speaker verification and identification. This paper reviews the field of speaker recognition in light of the deep learning advances that have boosted this technology. It proceeds as a systematic review, first giving a basic idea of deep learning, its architecture, and its fields of application, and then turning to the main focus of the paper, speaker recognition, which is one of the important applications of deep learning. It covers the types of speaker recognition, different processing techniques, challenges encountered in this technology, performance evaluation criteria, deep learning implementation frameworks, and lastly various databases used in speaker identification (SI) and speaker verification (SV).

“…SVM is a technique according to the theory of statistical learning applied to determine the decisive boundary via separating different classes and increasing the margin [28][29][30]. SVM is fit for non-linear data set problems and less number of training data but with huge number of input.…”

Section: Support Vector Machine (SVM) (mentioning)

confidence: 99%

Classification of abnormal location in medium voltage switchgears using hybrid gravitational search algorithm-artificial intelligence

et al. 2021

In power system networks, automatic fault diagnosis techniques for switchgears that are highly accurate and less time-consuming are important. In this work, classification of abnormal location in switchgears is proposed using hybrid gravitational search algorithm (GSA)-artificial intelligence (AI) techniques. The measurement data were obtained from ultrasound, transient earth voltage, temperature and sound sensors. The AI classifiers used include artificial neural network (ANN) and support vector machine (SVM). The performance of both classifiers was optimized with GSA. The advantages of applying GSA to the AI classifiers in classifying the abnormal location in switchgears are easy implementation, fast convergence and low computational cost. For performance comparison, several well-known metaheuristic techniques were also applied to the AI classifiers. Without optimization by GSA, SVM yields 2% higher accuracy than ANN. However, ANN yields slightly higher accuracy than SVM after combining with GSA, in the range of 97%-99% compared to 95%-97% for SVM. On the other hand, GSA-SVM converges faster than GSA-ANN. Overall, it was found that combining both AI classifiers with GSA yields better results than several well-known metaheuristic techniques.
