Robust speaker recognition: a feature-based approach

Mammone, Richard J.; Zhang, Xiaoyu; Ramachandran, Ravi P.

doi:10.1109/79.536825

Cited by 263 publications

(96 citation statements)

References 28 publications

“…Next, the amplitude values are converted to mel-filter bank outputs, and the output from each filter is log-compressed and transformed via the discrete cosine transform to cepstral coefficients. The details of these feature extraction techniques may be found in [23][24][25].…”

Section: Feature Extraction (mentioning)

confidence: 99%
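
The pipeline in the statement above (amplitude spectrum, mel filter bank, log compression, discrete cosine transform) can be sketched as follows. This is a minimal illustration, not the citing paper's implementation: the mel formula, the triangular filter shape, the filter count, the number of cepstral coefficients, and all function names are assumptions chosen for the example.

```python
import math

def hz_to_mel(f):
    # A common mel-scale formula (assumed; the source does not specify one).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters with centers evenly spaced on the mel scale.

    n_bins is the number of one-sided spectrum bins (0 Hz up to Nyquist).
    """
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    # n_filters + 2 edge frequencies, evenly spaced in mel.
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * nyquist / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_from_magnitude(magnitude, sample_rate, n_filters=20, n_ceps=12):
    """Map one frame's amplitude spectrum to cepstral coefficients."""
    bank = mel_filterbank(n_filters, len(magnitude), sample_rate)
    eps = 1e-10  # guards against log(0) for empty or silent bands
    log_out = [math.log(sum(w * a for w, a in zip(row, magnitude)) + eps)
               for row in bank]
    # Unnormalized DCT-II of the log filter-bank outputs.
    return [sum(e * math.cos(math.pi * n * (j + 0.5) / n_filters)
                for j, e in enumerate(log_out))
            for n in range(n_ceps)]
```

Because the log turns a gain change into an additive constant, scaling the whole spectrum only shifts coefficient 0 in this sketch; the higher coefficients are unchanged.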

Source microphone identification from speech recordings based on a Gaussian mixture model

Eskidere

2014

Turk J Elec Eng & Comp Sci

Abstract:Microphone identification is a specific type of media forensics that investigates whether it is possible to identify the source microphone from speech recordings. The main aim of this study is to find out which of the several feature extraction techniques are best suited to the source microphone identification systems. We perform microphone identification experiments with 16 different microphones using 3 datasets. In order to improve the results on the datasets, we also investigate the important parameters that may affect the microphone identification performance. Our experimental results show that the proposed method is comparable to the existing studies in a closed-set identification rate.

“…Only the envelope of the spectrum is of interest, hence to get a smoothing spectrum the LPC method is employed. A detail description of calculation of the method is given in our earlier work. [21] Here the LPC is used to estimate the main parameters of the signal. According to [22] the speech production model can be often called as linear production model or autoregressive model.…”

Section: Figure 1 Calculation of LPC-MFCCs (mentioning)

confidence: 99%
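
The autoregressive model mentioned in the statement above predicts each sample from a weighted sum of the previous ones. One standard way to estimate those weights is the autocorrelation method with the Levinson-Durbin recursion, sketched below. The citing paper does not say which estimation method it uses, so this choice, the sign convention, and the function names are assumptions made for illustration.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of one (windowed) frame."""
    n = len(x)
    return [sum(x[t] * x[t + k] for t in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients.

    Convention (assumed): x[t] is predicted as sum_k a[k] * x[t - k].
    Returns ([a_1, ..., a_p], final prediction error energy).
    r[0] must be non-zero, so silent frames need separate handling.
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        err *= (1.0 - k_i * k_i)             # error shrinks at each order
    return a[1:], err
```

For an autocorrelation sequence of the form r[k] = 0.5^k, which is what a first-order autoregressive process produces, the recursion returns a first coefficient of 0.5 and a second coefficient of 0, since one past sample already explains the signal.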

A hybrid model for neurological disordered voice classification using time and frequency domain features

K¹,

Holi

2015

AIR

Different neurological disorders may lead to speech related problems, due to paralysis in vocal fold or weakness of the related muscles. This may modify the acoustic characteristics of the subject's voice which may provide important information for detecting certain neurological diseases. The vowel phonation which is acoustically informative and uttered by the patient with not much difficulty is collected and various acoustic features are extracted by time domain and frequency domain techniques. The use of all these features for classification may lead to a large feature space, which may lead to complexity. Hence to avoid this, in the present work experimentation is done by fusing different classifiers which are fed with features extracted from different domains. The time domain and frequency domain features are given to Support Vector Machine (SVM) and Gaussian Mixture Model (GMM) respectively, and the intermediate decision of these classifiers is given to another SVM to identify the voice signal as normal or diseased. It is observed that this hybrid classifier model has shown some improvement with a classification accuracy of 91.43% compared to single GMM classifier with an accuracy of classification of 90% with frequency domain features as input.

“…To obtain the vectors of CMSCs the utterances were first converted to sequences of vectors of 12 LPC coefficients. These vectors were then converted to vectors of 12 LPC cepstral coefficients (LPCCs) and finally to the vectors of 12 CMSCs according to the formula [2] …”

Section: Speech Database and Feature Analysis (mentioning)

confidence: 99%
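
The formula the statement above refers to is elided in the excerpt, so the following is only a sketch of the two conversions it names, using textbook forms: the standard recursion from LPC coefficients to LPC cepstral coefficients, followed by cepstral mean subtraction over the frames of an utterance. The sign convention (each sample predicted as a weighted sum of past samples) and the function names are assumptions, and the result may differ in detail from the formula in [2].

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum of the all-pole model 1 / (1 - sum_k a_k z^-k).

    a[k - 1] holds a_k. Uses the recursion
    c_n = a_n + sum_{k=1}^{n-1} (k / n) * c_k * a_{n-k},
    with a_n taken as 0 for n beyond the model order.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean taken over all frames."""
    count = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / count for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]
```

Subtracting the mean removes any component that is constant across the utterance, which is why it is commonly used to reduce the effect of a fixed channel or microphone on cepstral features.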

“…The output energy of the filters were then transformed into vectors of 16 MFCCs using the discrete cosine transform [2].…”

Section: Speech Database and Feature Analysis (mentioning)

confidence: 99%

Speaker Identification Based on Vector Quantization

Radová

Svenda

1999

Text, Speech and Dialogue

Abstract. In this paper a method of text-independent speaker recognition using discrete vector quantization is presented. The identification experiments were performed in a closed set of 599 speakers and two various types of features were tested: cepstral mean subtraction coefficients and mel-frequency cepstral coefficients. The effect of the various codebook size on the speaker identification performance was investigated.
