2006 IEEE Odyssey - The Speaker and Language Recognition Workshop 2006
DOI: 10.1109/odyssey.2006.248087
Feature Selection Based on Genetic Algorithms for Speaker Recognition

Cited by 31 publications
(22 citation statements)
References 7 publications
“…Candidate solutions are represented by individuals (or chromosomes) in a large population. Initial solutions may be randomly generated or obtained by other means [1]. Then GAs iteratively drive the population to an optimal point according to a complex metric (called fitness or evaluation function) that measures the performance of the individuals in a target task.…”
Section: Genetic Algorithms
confidence: 99%
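The GA loop described in the excerpt (random initial chromosomes, then fitness-driven iteration toward an optimum) can be sketched as a toy binary feature-selection GA. Everything below is an illustrative assumption, not the paper's configuration: the Fisher-style fitness function, truncation selection, one-point crossover, and all parameter values are made up for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    # Toy fitness: Fisher-style separability of the selected features for a
    # two-class problem, lightly penalised by the number of features kept.
    if not mask.any():
        return -np.inf
    Xs = X[:, mask]
    mu0, mu1 = Xs[y == 0].mean(0), Xs[y == 1].mean(0)
    var = Xs.var(0) + 1e-9
    return np.sum((mu0 - mu1) ** 2 / var) - 0.01 * mask.sum()

def ga_select(X, y, pop_size=30, n_gen=40, p_mut=0.05):
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5          # random initial chromosomes
    for _ in range(n_gen):
        scores = np.array([fitness(m, X, y) for m in pop])
        elite = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # truncation selection
        children = []
        for _ in range(pop_size - len(elite)):
            a, b = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, n)               # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n) < p_mut           # bit-flip mutation
            children.append(np.where(flip, ~child, child))
        pop = np.vstack([elite, children])
    scores = np.array([fitness(m, X, y) for m in pop])
    return pop[np.argmax(scores)]                  # best chromosome = feature mask
```

On synthetic data where only a few features carry class information, the returned mask tends to keep those features, which is the behaviour the fitness function rewards.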
“…[2] Many methods have been proposed for feature set reduction, because speaker identification yields a large number of features that must be reduced to only those capable of representing a speaker. PCA (Principal Component Analysis) is one such technique: the eigenvectors are computed, sorted in descending order of eigenvalue, and a projection matrix is built from the largest K eigenvectors, yielding the Karhunen-Loeve Transform (KLT). The KLT decorrelates the features and gives the smallest possible reconstruction error among all linear transforms, i.e. the smallest mean-square error between the data vectors in the original D-dimensional feature space and the data vectors in the projected K-dimensional space [1]. Linear Discriminant Analysis (LDA) attempts to find the transform A that maximizes a criterion of class separability. This is done by computing the within-class and between-class variance matrices, W and B, finding the eigenvectors, sorting them by eigenvalue in descending order, and finally building the projection matrix A from the largest K eigenvectors (which define the K most discriminative hyperplanes). LDA assumes that all classes share a common within-class covariance and a single Gaussian distribution per class.…”
Section: Feature Selection
confidence: 99%
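The PCA/KLT and LDA recipes in the excerpt (compute scatter, take eigenvectors, sort by eigenvalue, keep the top K as a projection matrix) can be sketched directly in numpy. This is a minimal illustration of the standard transforms, not the cited papers' code; function names and test data are assumptions.

```python
import numpy as np

def klt_project(X, k):
    # PCA/KLT: project D-dim data onto the top-k eigenvectors of its
    # covariance, sorted by eigenvalue in descending order.
    Xc = X - X.mean(axis=0)                      # centre the data
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(evals)[::-1]              # descending eigenvalues
    A = evecs[:, order[:k]]                      # D x k projection matrix
    return Xc @ A, A

def lda_project(X, y, k):
    # LDA: within-class (W) and between-class (B) scatter, then the top-k
    # eigenvectors of inv(W) @ B as the discriminative projection.
    mu = X.mean(axis=0)
    D = X.shape[1]
    W, B = np.zeros((D, D)), np.zeros((D, D))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        W += (Xc - mc).T @ (Xc - mc)
        d = (mc - mu)[:, None]
        B += len(Xc) * (d @ d.T)
    evals, evecs = np.linalg.eig(np.linalg.solve(W, B))
    order = np.argsort(evals.real)[::-1]
    return evecs.real[:, order[:k]]
```

With k equal to the full dimension D, the KLT projection is invertible and reconstruction is exact, which is consistent with the minimal-reconstruction-error property quoted above.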
“…Therefore it is always wiser to choose low-level features, because not only can they be extracted easily, but they also do not require a lot of data. Most speech and speaker recognition systems use MFCCs because, apart from capturing the frequency distribution, these features also reflect the glottal source and the vocal tract shape and length, which are characteristics specific to a speaker [1]. Speaker recognition is used as a biometric in various security applications, where people are recognized on the basis of their voice. The most tedious task in speaker recognition is the enrollment phase, in which users are asked to input their voice via a microphone or other input device; these recordings then serve as the database used in the recognition phase to identify the correct speaker. Other techniques can also be used besides enrollment, because it is not always possible to obtain inputs from every speaker.…”
Section: Introduction
confidence: 99%
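The MFCC pipeline the excerpt refers to is conventionally: power spectrum of a windowed frame, triangular mel filterbank, log, then a DCT. A from-scratch numpy sketch of one frame follows; the sample rate, FFT size, and filter counts are illustrative assumptions, and the DCT is left unnormalized for brevity.

```python
import numpy as np

def hz_to_mel(f):
    return 2595 * np.log10(1 + f / 700)

def mel_to_hz(m):
    return 700 * (10 ** (m / 2595) - 1)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale.
    mels = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)   # rising slope
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)   # falling slope
    return fb

def dct2(x, n_out):
    # Unnormalized DCT-II, keeping the first n_out coefficients.
    N = len(x)
    k = np.arange(n_out)[:, None]
    n = np.arange(N)[None, :]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * N)) @ x

def mfcc_frame(frame, sr=16000, n_fft=512, n_mels=26, n_mfcc=13):
    spec = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), n_fft)) ** 2
    fb = mel_filterbank(sr, n_fft, n_mels)
    return dct2(np.log(fb @ spec + 1e-10), n_mfcc)
```

In practice a library implementation (with proper windowing, pre-emphasis, and DCT normalization) would be used; the sketch only shows the shape of the computation.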
“…Because the 128-element feature vector is still too high-dimensional to train a TNFN, a dimensionality reduction method is required to lower the dimension of the feature vector. According to [21], a genetic algorithm outperformed principal component analysis and linear discriminant analysis in their speaker recognition case. Thus, in our image alignment case, we adopted the genetic algorithm method described in [22] to reduce the 128-element vector to a 33-element feature vector in the experimental section.…”
Section: Wgoh Descriptor
confidence: 99%
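The 128-to-33 reduction described above amounts to applying a GA-learned binary chromosome as a selection mask over the descriptor. A minimal sketch, assuming a fabricated mask in place of a real GA run (the mask and descriptor below are stand-ins for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in: a GA would output a binary chromosome over the 128
# dimensions; here we fabricate one that keeps 33 positions.
mask = np.zeros(128, dtype=bool)
mask[rng.choice(128, size=33, replace=False)] = True

descriptor = rng.normal(size=128)   # stand-in 128-element feature vector
reduced = descriptor[mask]          # 33-element vector passed to the classifier
```

The same fixed mask is applied to every descriptor at both training and test time, so the reduction is learned once and then costs only an index operation.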