Objectives: To improve the accuracy and reduce the time complexity of a speaker recognition system using Mel-Frequency Cepstral Coefficients (MFCCs) and Bacterial Foraging Optimization (BFO) with DNN-RBF. Method: The audio speech signal is pre-processed to derive the MFCCs of each speech sample. The features are optimized with the BFO algorithm and then classified towards the target speaker using DNN-RBF. Finally, a probability score is generated for each speaker to identify the speaker. The proposed MBFOB speaker recognition method is evaluated on the TIMIT read-speech corpus, which contains a total of 6300 phrases, 10 phrases per speaker. Findings: Speaker recognition is used to validate a user's identity in authentication and surveillance applications, with features extracted from the audio speech signal. This paper proposes MBFOB, a solution based on MFCCs and DNN-RBF with BFO, for the identification of speakers. Speech utterances from the TIMIT corpus are pre-processed to obtain MFCC feature vectors; DNN-RBF is used to classify the speaker, and the feature vectors in the output layers are optimized with BFO. Finally, the scores for each speaker are calculated to identify the speaker. Output metrics such as EER, DCF, Cavg and accuracy are used to test the proposed speaker recognition technique. The execution time of the proposed method is found to be lower than that of the other existing methods. The experimental findings are compared with other current methods and show the efficiency of the approach. Novelty: A novel MFCC-based Bacterial Foraging Optimization with Deep Neural Network-Radial Basis Function (DNN-RBF) method for identifying the exact speaker is proposed in this study.
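The final scoring step and the EER metric named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so everything here is an assumption for illustration: the function names `identify_speaker` and `equal_error_rate` are hypothetical, identification is taken to mean picking the highest-scoring speaker, and the EER is approximated by a simple threshold sweep over the observed scores.

```python
def identify_speaker(scores):
    """Return the speaker label with the highest score.

    `scores` maps a speaker label to a score (higher = more likely).
    Assumption: identification is a plain arg-max over per-speaker scores.
    """
    return max(scores, key=scores.get)


def equal_error_rate(genuine, impostor):
    """Approximate the equal error rate (EER) by a threshold sweep.

    genuine  -- scores of trials where the claimed speaker is correct
    impostor -- scores of trials where the claimed speaker is wrong
    A trial is accepted when its score is >= the threshold.
    Returns (eer, threshold) at the point where the false acceptance rate
    and the false rejection rate are closest to each other.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Made-up scores, purely to show the call pattern.
    print(identify_speaker({"spk_a": 0.12, "spk_b": 0.71, "spk_c": 0.17}))
    print(equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1]))
```

The sweep only visits thresholds that occur in the data, which keeps the sketch short; evaluation toolkits usually interpolate between operating points instead.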
Automatic Speech Recognition (ASR) has been an intensive research area in recent years, with the aim of enabling natural human–machine communication on the internet. However, existing Deep Neural Network (DNN) techniques need more focus on the feature extraction process and on recognition accuracy. Thus, an enhanced DNN-based approach for speaker recognition with a novel Triumvirate Euphemism Strategy (TES) is proposed. It overcomes poor feature extraction from the Mel-Frequency Cepstral Coefficient (MFCC) map by extracting the features based on petite, hefty and artistry of the features. The features are then trained with the Silhouette Martyrs Method (SMM) without any inter-class and intra-class separability problems, and margins are affixed between classes with three new loss functions, namely A-Loss, AM-Loss and AAM-Loss. Additionally, parallelization is done by a mini-batch-based back-propagation (BP) algorithm in the DNN. A novel Frenzied Heap Atrophy (FHA) with a multi-GPU model is introduced alongside the DNN to enhance the parallelized computing that accelerates the training procedures. The proposed technique extracts feasible features efficiently and recognizes speakers with 97.5% accuracy. Moreover, various parameters are discussed to demonstrate the efficiency of the system, and the proposed method outperforms the existing methods in all aspects.
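The abstract does not define A-Loss, AM-Loss or AAM-Loss. The sketch below assumes they correspond to the margin-based softmax variants commonly used in speaker and face recognition (angular, additive-margin and additive-angular-margin softmax); that mapping is an assumption, not something the source confirms. The names `target_logit` and `margin_softmax_loss`, the scale `s` and the margin values are all hypothetical, and the code handles a single sample in plain Python for readability.

```python
import math


def target_logit(cos_theta, kind, m):
    """Apply a margin to the cosine between an embedding and its class weight.

    kind = "A"   : multiplicative angular margin, psi(theta) with integer m >= 1
    kind = "AM"  : additive cosine margin,  cos(theta) - m
    kind = "AAM" : additive angular margin, cos(theta + m)
    """
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding drift
    theta = math.acos(cos_theta)
    if kind == "A":
        if m < 1 or int(m) != m:
            raise ValueError("the angular margin m must be an integer >= 1")
        # Piecewise form that stays monotonically decreasing on [0, pi].
        k = min(int(m * theta / math.pi), int(m) - 1)
        return (-1) ** k * math.cos(m * theta) - 2 * k
    if kind == "AM":
        return cos_theta - m
    if kind == "AAM":
        # Simplification: no fallback for the case theta + m > pi.
        return math.cos(theta + m)
    raise ValueError("kind must be 'A', 'AM' or 'AAM'")


def margin_softmax_loss(cosines, target, kind="AAM", m=0.2, s=30.0):
    """Cross-entropy over scaled cosines, with the margin on the target class.

    cosines -- cosine similarity of one embedding to every class weight
    target  -- index of the true speaker
    s       -- scale factor applied to all logits (assumed value)
    """
    logits = [
        s * (target_logit(c, kind, m) if i == target else c)
        for i, c in enumerate(cosines)
    ]
    peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


if __name__ == "__main__":
    cosines = [0.8, 0.1, -0.2]  # made-up similarities; class 0 is the target
    for kind, m in (("A", 4), ("AM", 0.2), ("AAM", 0.2)):
        print(kind, margin_softmax_loss(cosines, target=0, kind=kind, m=m))
```

In all three variants the margin lowers the target-class logit, so the network has to pull embeddings closer to their own class weight than plain softmax would require; this is the sense in which margins are placed between classes.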