2007
DOI: 10.1109/tasl.2007.894527
Speaker and Session Variability in GMM-Based Speaker Verification

Abstract: We present a corpus-based approach to speaker verification in which maximum likelihood II criteria are used to train a large scale generative model of speaker and session variability which we call joint factor analysis. Enrolling a target speaker consists in calculating the posterior distribution of the hidden variables in the factor analysis model and verification tests are conducted using a new type of likelihood II ratio statistic. Using the NIST 1999 and 2000 speaker recognition evaluation data se…

Citations: cited by 192 publications (83 citation statements)
References: 17 publications
“…it composed also by 1140 females, 648 males and 21 907 test files. In the NIST evaluation protocol, we can use all previous NIST evaluation data and also other corpora to train our systems. For this purpose, we used all the following datasets to estimate our system hyperparameters:…”
Section: A Databases
Type: mentioning
confidence: 99%
“…The preliminary experiments of [3,8] were reported on the NIST 2002 and 2006 SRE corpora using a lightweight Gaussian mixture model-universal background model (GMM-UBM) system [17] and generalized linear discriminant sequence support vector machine (GLDS-SVM) without any session variability compensation techniques. The recent results of [36], using multi-taper MFCC features only, were reported on NIST 2002 and 2008 SRE corpora using GMM-UBM, GMM-SVM and joint factor analysis (JFA) [38,39] classifiers.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…MFCC coefficients are used for extracting features and minimum processing time in GMM is 10 ms for speech utterance. The parameters for GMM model is mean vectors, densities (is a sum of M numbers component density), and covariance matrices [3]. b. SVM -SVM is a discriminative speaker model which based on targeted speaker as well as imposter speaker.…”
Section: Speaker Models
Type: mentioning
confidence: 99%
“…Open set includes any number of registered speakers, and there is a possibility that unknown speaker also present, known as imposter. Imposter means the voice of the person is not belonging from the specific speaker [3]. Speaker verification is a task to check whether or not a voice token belongs to a specific speaker.…”
Section: Introduction
Type: mentioning
confidence: 99%