On the Use of PLDA i-vector Scoring for Clustering Short Segments

Salmun, Itay; Opher, Irit; Lapidot, Itshak

doi:10.21437/odyssey.2016-59

Cited by 16 publications (17 citation statements)

References 15 publications (22 reference statements)

“…The mean-shift algorithm for clustering is well known [2,3], where the standard algorithm is based on Euclidean distance. As Euclidean distance is not well suited to i-vectors, it was first replaced by cosine distance [4,5] and later by the PLDA score [6]. Another change to the standard algorithm is replacing the threshold h, which determines the neighboring i-vectors, with k-nearest neighbors (kNN).…”

Section: Mean-shift Algorithm (mentioning)

confidence: 99%

“…Another change to the standard algorithm is replacing the threshold h, which determines the neighboring i-vectors, with k-nearest neighbors (kNN). It was found in [6] that the algorithm is much less sensitive to the k value of kNN than to the h parameter. Let $X = \{x_j\}_{j=1}^{J}$ be a set of i-vectors from several speakers, and let $S_h(x)$ be the set of the k nearest i-vectors, then the mean shift is given in eq.…”

Section: Mean-shift Algorithm (mentioning)

confidence: 99%
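
The kNN neighborhood described in the statement above can be illustrated with a minimal sketch. The mean-shift equation itself is elided in the quoted text, so the update used here (move a point to the mean of its k best-scoring neighbors) is an assumption, and cosine similarity stands in for the PLDA score, which would require a trained model. The function names `cosine`, `knn_mean_shift_step` and `knn_mean_shift` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def knn_mean_shift_step(x, points, k, score=cosine):
    """Assumed update: replace x by the mean of its k highest-scoring neighbors."""
    neighbours = sorted(points, key=lambda p: score(x, p), reverse=True)[:k]
    return [sum(p[d] for p in neighbours) / len(neighbours) for d in range(len(x))]

def knn_mean_shift(points, k, n_iter=20, tol=1e-6, score=cosine):
    """Shift every point until it moves less than tol (or n_iter is reached)."""
    modes = []
    for x in points:
        y = list(x)
        for _ in range(n_iter):
            new_y = knn_mean_shift_step(y, points, k, score)
            shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_y, y)))
            y = new_y
            if shift < tol:
                break
        modes.append(y)
    return modes

# Toy data: two well-separated groups of 2-D vectors (not real i-vectors).
points = [[1.0, 0.1], [1.0, 0.0], [1.0, -0.1],
          [0.1, 1.0], [0.0, 1.0], [-0.1, 1.0]]
modes = knn_mean_shift(points, k=3)
```

Points whose shifted positions coincide would then be grouped into one cluster. Passing a different `score` function is where a PLDA scorer could be plugged in, under the same assumptions.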

“…(2). As the main goal of this paper is to qualify the clustering and not to find the best clustering algorithm, we use the system described in [6]. It was found that PCA works better than LDA for dimensionality reduction before the PLDA scoring.…”

Section: Algorithm 1 Mean-shift Clustering Algorithm (mentioning)

confidence: 99%

“…In this paper we rely on our previous work using mean-shift with a probabilistic linear discriminant analysis (PLDA) score [6]. In the current work we did not focus on improving the clustering algorithm; rather, we focused on constructing a system that estimates the clustering quality.…”

Section: Introduction (mentioning)

confidence: 99%


Estimating Speaker Clustering Quality Using Logistic Regression

Cohen, Lapidot

2017

Interspeech 2017

Self Cite


This paper focuses on estimating clustering validity by using logistic regression. For many applications it might be important to estimate the quality of the clustering, e.g. when clustering speech segments, in order to decide whether to use the clustered data for speaker verification. When clustering short speaker segments, the common criteria for cluster validity are average cluster purity (ACP), average speaker purity (ASP) and K, the geometric mean of the two measures. In practice, true labels are not available for evaluation, hence these criteria have to be estimated from the clustering itself. In this paper, mean-shift clustering with PLDA score is applied in order to cluster short speaker segments represented as i-vectors. Different statistical parameters are then estimated on the clustered data and are used to train logistic regression to estimate ACP, ASP and K. It was found that logistic regression can be a good predictor of the actual ACP, ASP and K, and yields reasonable information regarding the clustering quality.

Index Terms: cluster validity, logistic regression, i-vectors, mean-shift, PLDA.

The proposed mean-shift requires a PLDA score, so before performing clustering we train the universal background model (UBM) and the total variability (TV) matrix for i-vector extraction. We also train the PCA matrix T, and the…
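
To make the three validity criteria named in the abstract concrete, here is a minimal sketch that computes ACP, ASP and K when true speaker labels are known. It assumes the commonly used purity definitions (per-cluster purity is the sum over speakers of n_ij² / n_i², weighted by cluster size; ASP is the symmetric quantity per speaker); the abstract does not spell out the formulas, so treat them as an assumption. The function name `purity_scores` is hypothetical.

```python
import math
from collections import Counter

def purity_scores(cluster_ids, speaker_ids):
    """Return (ACP, ASP, K) for parallel lists of cluster and speaker labels."""
    n = len(cluster_ids)
    joint = Counter(zip(cluster_ids, speaker_ids))   # n_ij: segments of speaker j in cluster i
    cluster_sizes = Counter(cluster_ids)             # n_i.: segments per cluster
    speaker_sizes = Counter(speaker_ids)             # n_.j: segments per speaker
    # ACP = (1/N) * sum_i sum_j n_ij^2 / n_i.
    acp = sum(c * c / cluster_sizes[ci] for (ci, _), c in joint.items()) / n
    # ASP = (1/N) * sum_j sum_i n_ij^2 / n_.j
    asp = sum(c * c / speaker_sizes[sj] for (_, sj), c in joint.items()) / n
    # K is the geometric mean of the two purities.
    return acp, asp, math.sqrt(acp * asp)

# Toy example: one segment of speaker "b" ended up in cluster 0.
acp, asp, k = purity_scores([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "b"])
```

Because this computation needs the true speaker labels, it is only usable on labeled evaluation data; the paper's point is that in deployment these values have to be predicted from statistics of the clustering itself.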

“…Speaker clustering is the unsupervised task of identifying which segments from a set of speech segments belong to the same speaker. It can be an inherent part of a speaker diarization task [1], or it can also be a stand-alone problem [2]. In this work the segments are well defined by a push-to-talk (PTT) button.…”

Section: Introduction (mentioning)

confidence: 99%

Incremental On-Line Clustering of Speakers' Short Segments

Aloni-Lavi, Opher, Lapidot

2018

The Speaker and Language Recognition Workshop (Odyssey 2018)

Self Cite


This paper deals with clustering of speakers' short segments, in a scenario where additional segments continue to arrive and should be constantly clustered together with previous segments that were already clustered. In realistic applications, it is not possible to cluster all segments every time a new segment arrives. Hence, incremental clustering is applied in an on-line mode. New segments can either belong to existing speakers, and therefore have to be assigned to one of the existing clusters, or belong to new speakers, in which case new clusters should be formed. In this work we show that if there are enough segments per speaker in the off-line initial clustering process, it constitutes a good starting point for the incremental on-line clustering. In this case, incremental on-line clustering can be successfully applied based on the previously proposed mean-shift clustering algorithm with PLDA score as a similarity measure and with k-nearest neighbors (kNN) neighborhood selection.
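
The assign-or-create decision described in this abstract can be sketched as follows. This is not the paper's method, which is based on mean-shift with PLDA scoring and kNN neighborhoods; the sketch swaps in a much simpler rule (cosine similarity to each cluster centroid, with a fixed threshold) purely to illustrate the on-line decision. The function `assign_online` and the `threshold` parameter are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def assign_online(segment, clusters, threshold):
    """Add segment to the best-matching cluster, or open a new cluster.

    clusters is a list of clusters, each a non-empty list of vectors.
    It is modified in place; the index of the chosen cluster is returned.
    """
    best_idx, best_score = None, -math.inf
    for idx, members in enumerate(clusters):
        # Centroid of the cluster: per-dimension mean of its members.
        centroid = [sum(col) / len(members) for col in zip(*members)]
        s = cosine(segment, centroid)
        if s > best_score:
            best_idx, best_score = idx, s
    if best_idx is not None and best_score >= threshold:
        clusters[best_idx].append(segment)      # existing speaker
        return best_idx
    clusters.append([segment])                  # new speaker
    return len(clusters) - 1

# Toy off-line result used as the starting point (not real i-vectors).
clusters = [[[1.0, 0.0], [1.0, 0.1]], [[0.0, 1.0]]]
first = assign_online([1.0, -0.1], clusters, threshold=0.8)
second = assign_online([-1.0, -1.0], clusters, threshold=0.8)
```

In this toy run the first segment joins an existing cluster and the second opens a new one. The quality of the initial off-line clusters matters here for the same reason the abstract gives: with too few segments per speaker, the existing clusters are poor references for new arrivals.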
