Interspeech 2016
DOI: 10.21437/interspeech.2016-683

Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification

Abstract: Short duration speaker verification is a challenging problem partly due to utterance duration mismatch. This paper proposes a novel method that modifies the standard Gaussian probabilistic linear discriminant analysis (G-PLDA) to use two separate generative models for i-vectors from long and short utterances which are jointly trained. The proposed twin model G-PLDA employs distinct models for i-vectors corresponding to different durations from the same speaker but shares the same latent variables. Unlike the s…

Cited by 5 publications (4 citation statements)
References 13 publications
“…In previous work, the authors proposed the twin model G‐PLDA (denoted as TM G‐PLDA) which explicitly modelled the differences between i‐vector distributions from long and short duration within each speaker [4] as

$$x = \begin{cases} \mu_{\mathrm{L}} + \Phi_{\mathrm{L}} h + \varepsilon_{\mathrm{L}}, & \text{for long utterances} \\ \mu_{\mathrm{S}} + \Phi_{\mathrm{S}} h + \varepsilon_{\mathrm{S}}, & \text{for short utterances} \end{cases}$$

where $x$ denotes the i‐vector; $\mu_{\mathrm{L}}$ and $\mu_{\mathrm{S}}$ are mean vectors for i‐vector correspond to long and short utterances, respectively; $\Phi_{\mathrm{L}}$ and $\Phi_{\mathrm{S}}$ are the corresponding factor loading matrices; latent variable $h$ follows $\mathcal{N}(0, I)$ and is shared by all the utterances from the same speaker. The residuals $\varepsilon_{\mathrm{L}}$ and $\varepsilon_{\mathrm{S}}$ are assumed to follow $\mathcal{N}(0, \Sigma_{\mathrm{L}})$ and $\mathcal{N}(0, \Sigma_{\mathrm{S}})$, respectively.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
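
As a rough illustration of the generative model quoted above, the following Python sketch draws synthetic i-vectors for a few speakers: each speaker gets one latent vector h, which is shared by that speaker's long and short utterances, while the means, loading matrices, and residual covariances differ by duration. The function name `sample_twin_gplda`, its argument names, and the toy dimensions are assumptions made for illustration only; they do not come from the cited papers, and no training procedure is shown.

```python
import numpy as np

def sample_twin_gplda(n_speakers, n_long, n_short,
                      mu_L, mu_S, Phi_L, Phi_S, Sigma_L, Sigma_S, rng):
    """Draw i-vectors from the twin-model generative story (illustrative sketch).

    Returns two arrays shaped (n_speakers, n_long, d) and (n_speakers, n_short, d).
    """
    d = mu_L.shape[0]          # i-vector dimension
    q = Phi_L.shape[1]         # latent (speaker factor) dimension
    long_ivecs, short_ivecs = [], []
    for _ in range(n_speakers):
        # One latent vector h ~ N(0, I) per speaker, shared across durations.
        h = rng.standard_normal(q)
        # Duration-specific residuals eps_L ~ N(0, Sigma_L), eps_S ~ N(0, Sigma_S).
        eps_L = rng.multivariate_normal(np.zeros(d), Sigma_L, size=n_long)
        eps_S = rng.multivariate_normal(np.zeros(d), Sigma_S, size=n_short)
        # x = mu + Phi h + eps, with duration-specific mu, Phi and eps.
        long_ivecs.append(mu_L + Phi_L @ h + eps_L)
        short_ivecs.append(mu_S + Phi_S @ h + eps_S)
    return np.stack(long_ivecs), np.stack(short_ivecs)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, q = 4, 2  # toy sizes, far smaller than real i-vectors
    long_iv, short_iv = sample_twin_gplda(
        n_speakers=3, n_long=2, n_short=5,
        mu_L=np.zeros(d), mu_S=np.ones(d),
        Phi_L=rng.standard_normal((d, q)), Phi_S=rng.standard_normal((d, q)),
        Sigma_L=0.1 * np.eye(d), Sigma_S=0.5 * np.eye(d), rng=rng,
    )
    print(long_iv.shape, short_iv.shape)
```

In this toy setup the short-utterance residual covariance is set larger than the long-utterance one purely as an example of how the two models can differ; the actual parameter values would be learned from data.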
“…In TM G‐PLDA model, the latent variable is integrated out, which means that two sets of parameters bear the full burden of modelling the distribution mismatch (refer eqn. (12) in [4]). Consequently, the mismatch in i‐vector distributions is not explicitly normalised before scoring.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
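
To make the remark about integrating out the latent variable concrete: under the twin model, once h is marginalised, a long and a short i-vector from the same speaker are jointly Gaussian with cross-covariance Φ_L Φ_Sᵀ, so a verification score depends only on the two parameter sets. The Python sketch below computes a same-speaker versus different-speaker log-likelihood ratio on that basis. It follows from the model definition alone and is not claimed to reproduce eqn. (12) of [4]; the names `gauss_logpdf` and `twin_llr` are hypothetical.

```python
import numpy as np

def gauss_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian N(mean, cov) evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def twin_llr(x_long, x_short, mu_L, mu_S, Phi_L, Phi_S, Sigma_L, Sigma_S):
    """Log-likelihood ratio for one long/short trial with h integrated out."""
    # Marginal covariances of each i-vector after integrating out h.
    C_LL = Phi_L @ Phi_L.T + Sigma_L
    C_SS = Phi_S @ Phi_S.T + Sigma_S
    # Cross-covariance induced by the shared latent variable (same speaker).
    C_LS = Phi_L @ Phi_S.T
    joint_cov = np.block([[C_LL, C_LS], [C_LS.T, C_SS]])
    # Same-speaker hypothesis: the pair is jointly Gaussian.
    same = gauss_logpdf(np.concatenate([x_long, x_short]),
                        np.concatenate([mu_L, mu_S]), joint_cov)
    # Different-speaker hypothesis: the two i-vectors are independent.
    diff = gauss_logpdf(x_long, mu_L, C_LL) + gauss_logpdf(x_short, mu_S, C_SS)
    return same - diff

if __name__ == "__main__":
    one = np.array([[1.0]])
    Phi = np.array([[3.0]])
    # One-dimensional toy trial in which both i-vectors agree.
    print(twin_llr(np.array([3.0]), np.array([3.0]),
                   np.zeros(1), np.zeros(1), Phi, Phi, one, one))
```

Note that the duration mismatch enters the score only through the differing means, loading matrices, and residual covariances, which is the sense in which the two parameter sets carry the whole burden of modelling the mismatch.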
“…In recent years there has been increasing interest in short duration text independent speaker verification systems, almost all of which focuses on the aforementioned i-vector PLDA approach. A twin model G-PLDA [3] was proposed to compensate for the duration mismatch between i-vectors of long enrolment and short test utterances. The covariance of the i-vector posterior probability was propagated to the PLDA model in [4][5][6].…”
Section: Introduction (citation type: mentioning)
confidence: 99%