2006
DOI: 10.1109/tsa.2005.853206

Real-time speaker identification and verification

Abstract: In speaker identification, most of the computation originates from the distance or likelihood computations between the feature vectors of the unknown speaker and the models in the database. The identification time depends on the number of feature vectors, their dimensionality, the complexity of the speaker models and the number of speakers. In this paper, we concentrate on optimizing vector quantization (VQ) based speaker identification. We reduce the number of test vectors by pre-quantizing the test …

Cited by 149 publications (79 citation statements)
References 30 publications
“…The baseline EER is 8.0% for NIST 2002 corpus (100 speakers' test signals). These EERs closely agree with published values [10], [2].…”
Section: Speaker Verification System
Type: supporting
confidence: 92%
“…Generative models characterize the distribution of the feature vectors within the classes (speakers), whereas discriminative modeling focuses on modeling the decision boundary between the classes. For generative modeling, vector quantization (VQ) [8,28,32,45,74,80] and Gaussian mixture model (GMM) [67,68] are commonly used. For discriminative training, artificial neural networks (ANNs) [17,83] and, more recently, support vector machines (SVMs) [10,11] are representative techniques.…”
Section: Universal Background Model (UBM)
Type: mentioning
confidence: 99%
“…Therefore research has been focusing on decreasing the computational load of identification while attempting to keep the recognition accuracy reasonably high. In a research concentrating on optimizing vector quantization (VQ) based speaker identification, the number of test vectors are reduced by pre-quantizing the test sequence prior to matching, and the number of speakers are reduced by pruning out unlikely speakers during the identification process (Kinnunen et al, 2006). The best variants are then generalized to Gaussian Mixture Model (GMM) based modeling.…”
Section: Speaker Recognition On Mobile Phone
Type: mentioning
confidence: 99%
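
The two speed-ups named in the abstract and in the last statement above (pre-quantizing the test sequence, then pruning unlikely speakers during matching) can be illustrated with a minimal sketch. It is not the paper's implementation. The function names, the frame-averaging form of pre-quantization, the block size and the keep-half pruning rule are all assumptions chosen for illustration; the source only states that test vectors are pre-quantized and that unlikely speakers are pruned.

```python
import math

def sq_dist(x, y):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def nearest_distortion(x, codebook):
    """Distance from x to its nearest code vector in a speaker's VQ codebook."""
    return min(sq_dist(x, c) for c in codebook)

def prequantize(vectors, group_size):
    """Shrink the test sequence by averaging consecutive groups of frames.

    Assumption: averaging stands in for whatever pre-quantizer is used.
    """
    reduced = []
    for start in range(0, len(vectors), group_size):
        group = vectors[start:start + group_size]
        dim = len(group[0])
        reduced.append([sum(v[i] for v in group) / len(group) for i in range(dim)])
    return reduced

def identify(test_vectors, codebooks, block=2, keep_fraction=0.5):
    """Return the speaker with the lowest accumulated distortion.

    After each block of test vectors, only the best-scoring fraction of the
    still-active speakers is kept (hypothetical pruning rule).
    """
    active = list(codebooks)
    total = {s: 0.0 for s in active}
    for start in range(0, len(test_vectors), block):
        chunk = test_vectors[start:start + block]
        for s in active:
            total[s] += sum(nearest_distortion(x, codebooks[s]) for x in chunk)
        if len(active) > 1:
            # Active speakers have all seen the same vectors, so totals compare fairly.
            active.sort(key=lambda s: total[s])
            n_keep = max(1, math.ceil(len(active) * keep_fraction))
            active = active[:n_keep]
    return min(active, key=lambda s: total[s])

# Toy usage: three hypothetical speakers with two code vectors each.
codebooks = {
    "a": [[0.0, 0.0], [1.0, 1.0]],
    "b": [[10.0, 10.0], [11.0, 11.0]],
    "c": [[20.0, 20.0], [21.0, 21.0]],
}
test = [[10.0, 10.0], [11.0, 11.0], [10.0, 11.0], [11.0, 10.0],
        [10.5, 10.5], [10.0, 10.0], [11.0, 11.0], [10.0, 10.0]]
reduced = prequantize(test, group_size=2)   # 8 frames become 4
best = identify(reduced, codebooks)
```

Both steps cut the number of distance computations: pre-quantization reduces how many test vectors are matched, and pruning reduces how many codebooks each remaining vector is matched against. The trade-off is that an aggressive `keep_fraction` can discard the correct speaker early.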