Interspeech 2016
DOI: 10.21437/interspeech.2016-1302

Probabilistic Approach Using Joint Long and Short Session i-Vectors Modeling to Deal with Short Utterances for Speaker Recognition

Abstract: Speaker recognition with short utterances is highly challenging. The use of i-vectors in speaker recognition (SR) systems has become standard in recent years, and many algorithms have been developed to deal with the short-utterance problem. We present in this paper a new technique based on jointly modeling the i-vectors corresponding to short utterances and those of long utterances. The joint distribution is estimated using a large number of i-vector pairs (coming from short and long utterances) corresponding to the same session. The ob…

Cited by 9 publications (10 citation statements)
References: 14 publications
“…Previous work (Kheder et al., 2016; Guo et al., 2017b) highlights the importance of duration matching in PLDA model training. For instance, when the PLDA is trained using long utterances and evaluated on short utterances, speaker verification performance degrades compared to a PLDA trained using matched-length short utterances.…”
Section: I-vector Mapping Results
Citation type: mentioning
confidence: 98%
“…A few recent papers have focused on i-vector mapping, which maps the short utterance i-vector to its long version. In Kheder et al. (2016, 2018), the authors proposed a probabilistic approach, in which a GMM-based joint model between long and short utterance i-vectors was trained, and a minimum mean square error (MMSE) estimator was applied to transform a short i-vector to its long version. Since the GMM-based mapping function is actually a weighted sum of linear functions, our previous research (Guo et al., 2017b) demonstrates that a proposed non-linear mapping using convolutional neural networks (CNNs) outperforms the GMM-based linear mapping methods across different conditions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
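
The estimator described in the statement above can be written out as follows. This is the standard conditional-mean (MMSE) estimator for a joint Gaussian mixture, not a formula copied from the cited papers, and the notation is assumed for illustration: x is the short-utterance i-vector, y the long-utterance i-vector, and each of the K components of the joint GMM over [x; y] has weight \pi_k, a mean partitioned into (\mu_k^x, \mu_k^y), and a covariance with blocks \Sigma_k^{xx} and \Sigma_k^{yx}.

```latex
\hat{\mathbf{y}}(\mathbf{x})
  = \mathbb{E}[\mathbf{y} \mid \mathbf{x}]
  = \sum_{k=1}^{K} P(k \mid \mathbf{x})
    \left[ \boldsymbol{\mu}_k^{y}
      + \boldsymbol{\Sigma}_k^{yx} \left( \boldsymbol{\Sigma}_k^{xx} \right)^{-1}
        \left( \mathbf{x} - \boldsymbol{\mu}_k^{x} \right) \right],
\qquad
P(k \mid \mathbf{x})
  = \frac{\pi_k \, \mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_k^{x},\, \boldsymbol{\Sigma}_k^{xx}\right)}
         {\sum_{j=1}^{K} \pi_j \, \mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_j^{x},\, \boldsymbol{\Sigma}_j^{xx}\right)}
```

Each bracketed term is affine in x, which is why the mapping is described as a weighted sum of linear functions; the only nonlinearity comes from the posterior weights P(k | x).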
“…In the following two subsections we describe the GMM-based mapping algorithm as in [10] and our proposed CNN-based mapping algorithm in detail.…”
Section: I-vector Mapping Between Short and Long Sessions
Citation type: mentioning
confidence: 99%
“…(4), we can observe that the GMM-based mapping is actually a weighted sum of linear functions, and the weights are the conditional probabilities of each Gaussian component given test utterance x0. Even though the GMM-based joint modeling method gives significant improvement for the mismatched condition between short and long session i-vectors [10], this method still has some shortcomings. Learning a mapping from a short session i-vector to its long version is a very complex and nonlinear transform.…”
Section: GMM-based Joint Probability Model
Citation type: mentioning
confidence: 99%
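
The "weighted sum of linear functions" in the statement above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the cited authors' implementation: the joint GMM parameters are taken as already trained, the joint vector is assumed to be stacked as [short; long], and the names `mmse_map` and `gaussian_logpdf` are hypothetical.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate normal N(mean, cov) evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def mmse_map(x_short, weights, means, covs):
    """Estimate the long-utterance vector from a short-utterance vector.

    Assumed layout (hypothetical, for illustration only):
      weights: (K,)        mixture weights of the joint GMM
      means:   (K, 2d)     joint means, stacked as [short; long]
      covs:    (K, 2d, 2d) joint full covariances, same stacking
    Returns the MMSE estimate (d,) and the component posteriors (K,).
    """
    d = x_short.shape[0]
    K = weights.shape[0]
    log_post = np.empty(K)
    cond_means = np.empty((K, d))
    for k in range(K):
        mu_s, mu_l = means[k, :d], means[k, d:]
        S_ss = covs[k, :d, :d]   # short-short covariance block
        S_ls = covs[k, d:, :d]   # long-short cross-covariance block
        # Unnormalized log posterior of component k given the short vector.
        log_post[k] = np.log(weights[k]) + gaussian_logpdf(x_short, mu_s, S_ss)
        # Per-component conditional mean: an affine function of x_short.
        cond_means[k] = mu_l + S_ls @ np.linalg.solve(S_ss, x_short - mu_s)
    # Normalize the posteriors in a numerically stable way.
    log_post -= np.max(log_post)
    post = np.exp(log_post)
    post /= post.sum()
    # Posterior-weighted sum of the per-component affine predictions.
    return post @ cond_means, post
```

Fitting the joint GMM itself (for example with EM on paired short and long i-vectors from the same session) is outside this sketch; only the mapping step is shown.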