The Gaussian Probabilistic Linear Discriminant Analysis (PLDA) model assumes Gaussian priors for the latent variables that represent the speaker and channel factors. Under the assumption that each training i-vector belongs to a different speaker, as is usually done in i-vector extraction, the i-vectors generated by a PLDA model can be considered independent and identically distributed with a Gaussian distribution. We have therefore recently proposed transforming the development i-vectors so that their distribution becomes more Gaussian-like. This is achieved by means of a sequence of affine and non-linear transformations whose parameters are trained by Maximum Likelihood (ML) estimation on the development set. The evaluation i-vectors are then subjected to the same transformation. Although this i-vector "gaussianization" has proven effective, the original independence assumption is not satisfied, because i-vectors extracted from segments of the same speaker are not independent. In this work we show that the model can be improved by properly exploiting the speaker label information, which was ignored in the previous model. In particular, a more effective PLDA model can be obtained by jointly estimating the PLDA parameters and the parameters of the non-linear i-vector transformation. In other words, whereas the goal of the previous approach was to "gaussianize" the distribution of the training i-vectors, the objective of this work is to embed the estimation of the non-linear i-vector transformation in the PLDA model estimation. We thus refer to this model as the non-linear PLDA (NL-PLDA) model. We show that this new approach provides significant gains with respect to PLDA, and a small yet consistent improvement with respect to our former i-vector "gaussianization" approach, at no additional cost.
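To make the "gaussianization" idea concrete, the following is a minimal, hypothetical sketch (not the authors' actual transformation) of the underlying principle: an invertible affine + non-linear map is fitted to one-dimensional data by ML, i.e. by maximizing the standard-normal log-likelihood of the transformed samples plus the log-Jacobian term from the change of variables. Here an `arcsinh`-based map with illustrative parameters (a, b, d) stands in for the paper's transformation sequence:

```python
import numpy as np
from scipy.optimize import minimize

# Deliberately non-Gaussian "development" data (stand-in for i-vector entries).
rng = np.random.default_rng(0)
x = rng.exponential(1.0, size=500)

def transform(params, x):
    """Invertible map y = a * arcsinh(b*x + d) and its log-Jacobian."""
    a, b, d = params
    u = b * x + d
    y = a * np.arcsinh(u)
    # log|dy/dx| = log|a| + log|b| - 0.5 * log(1 + u^2)
    logdet = np.log(np.abs(a)) + np.log(np.abs(b)) - 0.5 * np.log1p(u ** 2)
    return y, logdet

def neg_log_lik(params):
    """Negative log-likelihood of x under N(0,1) pushed through the map."""
    y, logdet = transform(params, x)
    return -np.sum(-0.5 * y ** 2 - 0.5 * np.log(2.0 * np.pi) + logdet)

# ML estimation of the transformation parameters on the development data.
x0 = np.array([1.0, 1.0, 0.0])
res = minimize(neg_log_lik, x0, method="Nelder-Mead")

# Evaluation data would then be subjected to the same fitted transform.
y, _ = transform(res.x, x)
```

The NL-PLDA model described above goes one step further: instead of fitting such a transform against a fixed standard-normal target, its parameters would be estimated jointly with the PLDA parameters, so that the speaker labels inform the transformation.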