2009
DOI: 10.1016/j.csl.2009.01.002

Improving GMM–UBM speaker verification using discriminative feedback adaptation

Abstract: The Gaussian Mixture Model-Universal Background Model (GMM-UBM) system is one of the predominant approaches for text-independent speaker verification, because both the target speaker model and the impostor model (UBM) have generalization ability to handle "unseen" acoustic patterns. However, since GMM-UBM uses a common anti-model, namely UBM, for all target speakers, it tends to be weak in rejecting impostors' voices that are similar to the target speaker's voice. To overcome this limitation, we propose a dis…
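The verification decision the abstract refers to is conventionally a log-likelihood ratio between the target speaker GMM and the shared UBM. The following is a minimal sketch of that scoring step, assuming diagonal-covariance GMMs; the function names, the `(weights, means, variances)` tuple layout, and the toy parameters are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def gmm_log_likelihood(frames, weights, means, variances):
    """Per-frame log p(x) under a diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means, variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                     # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, M)
    weighted = log_comp + np.log(weights)
    # Numerically stable log-sum-exp over the mixture components.
    peak = weighted.max(axis=1, keepdims=True)
    return peak[:, 0] + np.log(np.sum(np.exp(weighted - peak), axis=1))

def llr_score(frames, target, ubm):
    """Average frame-level log-likelihood ratio: target model vs. UBM."""
    return float(np.mean(gmm_log_likelihood(frames, *target)
                         - gmm_log_likelihood(frames, *ubm)))

# Toy models (hypothetical values): the target differs from the UBM
# only in the mean of its first component.
weights = np.array([0.5, 0.5])
variances = np.ones((2, 2))
ubm = (weights, np.array([[0.0, 0.0], [4.0, 4.0]]), variances)
target = (weights, np.array([[1.0, 1.0], [4.0, 4.0]]), variances)

claimed = np.array([[1.0, 1.0], [1.2, 0.8]])    # frames near the target mean
impostor = np.array([[0.0, 0.0], [-0.2, 0.1]])  # frames near the UBM mean
```

A claim is accepted when `llr_score` exceeds a threshold. Because every target speaker is scored against the same UBM, an impostor whose voice resembles the target can still obtain a high ratio, which is the weakness the abstract describes.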

Cited by 11 publications (8 citation statements)
References 15 publications

“…2). GMM-UBM can describe the differences between speakers from the point of statistics and it can generate more sophisticated model description under a less speech training set, which has better robustness [15][16].…”
Section: B Speaker Clustering Based On Model Distance
Citation type: mentioning
confidence: 99%
“…It employs every speech segment for clustering according to MAP (Maximum A Posteriori), obtains GMM (Gaussian Mixture Model) of every speech segment for clustering adaptively, then constructs affinity matrix of spectral clustering by calculating probability distance based on a finite observation sequence model. GMM-UBM (Gaussian Mixture Model -Universal Background Model) is used in this paper can describe the speakers' differences from the point of statistics adequately, GMM-UBM can produce more precise model expression under the condition of a small voice training set, which has better robustness [15][16]. Probability distance based on a finite observation sequence model in this paper considers not only the models differences but also samples, and it can overcome the shortcoming of model description that is not accurate enough under the condition of the relative short length of speech segment.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Unlike sigmoid function, the squared loss function has greater gradient for large d, which gives more penalty for the severe false verification segments. To give an intuitive illustration, we borrow a figure from (Chao et al, 2009) and show it in Fig. 3.…”
Section: Squared Loss Function
Citation type: mentioning
confidence: 99%
“…3. (Chao et al, 2009) Using this squared loss function, the gradient of the objective function w.r.t the mean vector will be…”
Section: Squared Loss Function
Citation type: mentioning
confidence: 99%
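
The contrast drawn in the Squared Loss Function excerpts can be illustrated numerically. The sketch below assumes a sigmoid loss of the form 1 / (1 + exp(-a·d)) and a one-sided squared loss d² for d > 0; both forms, the slope parameter `a`, and the function names are assumptions for illustration, since the excerpts do not give the exact definitions used in the citing paper.

```python
import numpy as np

def sigmoid_loss_grad(d, a=1.0):
    """Gradient w.r.t. d of the assumed sigmoid loss 1 / (1 + exp(-a * d))."""
    d = np.asarray(d, dtype=float)
    s = 1.0 / (1.0 + np.exp(-a * d))
    return a * s * (1.0 - s)

def squared_loss_grad(d):
    """Gradient w.r.t. d of the assumed one-sided squared loss max(d, 0) ** 2."""
    d = np.asarray(d, dtype=float)
    return np.where(d > 0.0, 2.0 * d, 0.0)

# d is the misverification measure: larger d means a more severe error.
d_values = np.array([0.0, 1.0, 5.0, 10.0])
for d, gs, gq in zip(d_values,
                     sigmoid_loss_grad(d_values),
                     squared_loss_grad(d_values)):
    print(f"d={d:5.1f}  sigmoid grad={gs:.6f}  squared grad={gq:.1f}")
```

Under these assumed forms the sigmoid gradient peaks at d = 0 and decays toward zero as d grows, while the squared-loss gradient keeps increasing, so severely misverified segments contribute more to the update.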