Mixture of PLDA for Noise Robust I-Vector Speaker Verification

Mak, Man‐Wai; Pang, Xiaomin; Chien, Jen‐Tzung

doi:10.1109/taslp.2015.2499038

Cited by 66 publications

(40 citation statements)

References 33 publications


“…In [8], a mixture of channel-dependent PLDA models are trained to account for the channel conditions of each test utterance presented at the detection phase. Another mixture of PLDA Models is presented in [9], where each one is trained with different levels of noise and used according to the signal-to-noise ratio of the test utterance. The work in [10] assessed one channel features-domain noise compensation combined with multi-condition training.…”

Section: Channel Synthesis and Related Work (mentioning)

confidence: 99%

Channel Variability Synthesis in i-vector Speaker Recognition

Ahmed, Chiverton, Ndzi et al. 2017

IET 3rd International Conference on Intelligent Signal Processing (ISP 2017)


In this paper, we are tackling a practical problem which can be faced when establishing an i-vector speaker recognition system with limited resources. This addresses the problem of lack of development data of multiple recordings for each speaker. When we only have one recording for each speaker in the development set, phonetic variability can be simply synthesised by dividing the recordings if they are of sufficient length. For channel variability, we pass the recordings through a Gaussian channel to produce another set of recordings, referred to here as Gaussian version recordings. The proposed method for channel variability synthesis produces total relative improvements in EER of 5%.
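The channel-synthesis step described in this abstract can be illustrated with a minimal sketch, assuming that "Gaussian channel" means adding white Gaussian noise to the waveform. The function name `add_gaussian_noise` and the `snr_db` parameter are hypothetical; the abstract does not state which noise level the authors used.

```python
import numpy as np

def add_gaussian_noise(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise at a target SNR in dB.

    Assumption: the "Gaussian channel" is modelled as additive white
    Gaussian noise; the SNR value is an illustrative parameter.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    # Average power of the clean recording.
    signal_power = np.mean(signal ** 2)
    # Noise power that yields the requested signal-to-noise ratio.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Usage: create a "Gaussian version" of a synthetic 440 Hz tone.
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440.0 * t)
gaussian_version = add_gaussian_noise(clean, snr_db=15.0)
```

Each development recording would then have a clean and a noisy counterpart, giving two channel conditions per speaker.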


“…In addition to comparing these histograms, a second measure of differences between the distribution of i-vectors from long and short utterances based on the Partition Coefficient [15,16] is employed in this paper. The partition coefficient is an index that indicates the clustering tendency in a dataset and lies in the range [1/k, 1], where k is the number of clusters.…”

Section: Duration Mismatch In I-vectors (mentioning)

confidence: 99%

Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification

Sethu, Ambikairajah et al. 2016

Interspeech 2016


Short duration speaker verification is a challenging problem partly due to utterance duration mismatch. This paper proposes a novel method that modifies the standard Gaussian probabilistic linear discriminant analysis (G-PLDA) to use two separate generative models for i-vectors from long and short utterances which are jointly trained. The proposed twin model G-PLDA employs distinct models for i-vectors corresponding to different durations from the same speaker but shares the same latent variables. Unlike the standard G-PLDA, this twin model G-PLDA takes the differences between utterances of varying durations into account. Hyper-parameter estimation and scoring formulae for the twin model G-PLDA are presented. Experimental results obtained using NIST 2010 data show that the proposed technique leads to relative improvements of 8.5% and 15.6% when tested on utterances of 5 second and 3 second durations respectively.
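The partition coefficient mentioned in the citation statement for this paper can be sketched as follows, assuming the usual fuzzy-clustering definition (the mean over samples of the sum of squared cluster memberships). The quoted text gives only the range [1/k, 1], so the formula and the function name `partition_coefficient` are assumptions for illustration.

```python
import numpy as np

def partition_coefficient(memberships):
    """Partition coefficient of a fuzzy membership matrix.

    `memberships` has shape (n_samples, k); each row holds one sample's
    degrees of membership in the k clusters and must sum to 1.
    Assumed definition: mean over samples of the sum of squared memberships.
    """
    u = np.asarray(memberships, dtype=float)
    if u.ndim != 2:
        raise ValueError("memberships must be a 2-D array")
    if not np.allclose(u.sum(axis=1), 1.0):
        raise ValueError("each row of memberships must sum to 1")
    return float(np.mean(np.sum(u ** 2, axis=1)))

# Crisp assignments reach the upper bound of the range.
crisp = np.eye(3)
# Uniform memberships over k = 4 clusters reach the lower bound 1/k.
uniform = np.full((5, 4), 0.25)
print(partition_coefficient(crisp), partition_coefficient(uniform))
```

Under this definition, values near 1 indicate well-separated clusters and values near 1/k indicate little clustering tendency, which matches the range given in the quoted statement.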


“…The approach was shown to outperform standard PLDA with pooled training data when each class in the training data is seen under both considered conditions, frontal and profile, in a face recognition task. A similar approach is proposed by [12]; but in this case, the mixture component is not given during training but rather is dependent on a continuous metadata value. The approach is tested by adding noise to the training data at different signal-to-noise (SNR) levels, resulting in gains compared to pooling all the data to train a single PLDA model.…”

Section: Introduction (mentioning)

confidence: 99%

A Generalization of PLDA for Joint Modeling of Speaker Identity and Multiple Nuisance Conditions

Ferrer, McLaren 2018

Interspeech 2018


Probabilistic linear discriminant analysis (PLDA) is the leading method for computing scores in speaker recognition systems. The method models the vectors representing each audio sample as a sum of three terms: one that depends on the speaker identity, one that models the within-speaker variability, and one that models any remaining variability. The last two terms are assumed to be independent across samples. We recently proposed an extension of the PLDA method, which we termed Joint PLDA (JPLDA), where the second term is considered dependent on the type of nuisance condition present in the data (e.g., the language or channel). The proposed method led to significant gains for multilanguage speaker recognition when taking language as the nuisance condition. In this paper, we present a generalization of this approach that allows for multiple nuisance terms. We show results using language and several nuisance conditions describing the acoustic characteristics of the sample and demonstrate that jointly including all these factors in the model leads to better results than including only language or acoustic condition factors. Overall, we obtain relative improvements in detection cost function between 5% and 47% for various systems and test conditions with respect to standard PLDA approaches.
