Mahalanobis Metric Scoring Learned from Weighted Pairwise Constraints in I-Vector Speaker Recognition System

Lei, Zhenchun; Wan, Yanhong; Luo, Jian; Yang, Yukun

doi:10.21437/interspeech.2016-1071

Cited by 4 publications

(4 citation statements)

References 10 publications


“…Another study reported [32] that Cosine or Euclidean scoring methods provide a significant improvement than PLDA. The effectiveness of the Mahalanobis scoring method has been explored by [37,38] and presented an excellent performance for the i-vector system in the speaker recognition system. In this paper, we assess the effectiveness of the speaker verification system in different scoring methods, such as Cosine similarity scoring (CSS), Euclidean distance scoring (EDS), and Mahalanobis distance scoring (MDS).…”

Section: Scoring Methods (mentioning)

confidence: 99%
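
The cosine similarity and Euclidean distance scoring named in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the function names are hypothetical, the i-vectors are assumed to be plain NumPy arrays, and negating the Euclidean distance (so that a higher score means a closer match) is a convention chosen here.

```python
import numpy as np


def cosine_score(w_target, w_test):
    """Cosine similarity scoring (CSS): cosine of the angle between two i-vectors."""
    w_target = np.asarray(w_target, dtype=float)
    w_test = np.asarray(w_test, dtype=float)
    denom = np.linalg.norm(w_target) * np.linalg.norm(w_test)
    return float(w_target @ w_test / denom)


def euclidean_score(w_target, w_test):
    """Euclidean distance scoring (EDS), negated so larger means more similar."""
    diff = np.asarray(w_target, dtype=float) - np.asarray(w_test, dtype=float)
    return float(-np.linalg.norm(diff))
```

In both cases a verification decision would compare the score against a threshold tuned on development data.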


Emotional Variability Analysis Based I-Vector for Speaker Verification in Under-Stress Conditions

2020


Emotional conditions cause changes in the speech production system, producing acoustic characteristics that differ from those of neutral conditions. The presence of emotion degrades the performance of a speaker verification system. In this paper, we propose a speaker modeling approach that accommodates the presence of emotion in speech segments by extracting a compact speaker representation. The speaker model is estimated by following a procedure similar to the i-vector technique, but it treats the emotional effect as the channel variability component. We name this method emotional variability analysis (EVA). EVA represents the emotion subspace separately from the speaker subspace, as in the joint factor analysis (JFA) model. The effectiveness of the proposed system is evaluated by comparing it with the standard i-vector system on the speaker verification task of the Speech Under Simulated and Actual Stress (SUSAS) dataset with three different scoring methods. The evaluation is reported in terms of the equal error rate (EER). In addition, we conducted an ablation study for a more comprehensive analysis of the EVA-based i-vector. Based on the experimental results, the proposed system outperformed the standard i-vector system and achieved state-of-the-art results in the verification task for speakers under stress.


“…The effectiveness of the Mahalanobis metric for speaker detection scoring has been proven by [37,38]. The score between two i-vectors w_target and w_test is proportional to the log-probability that both i-vectors belong to a unique class following the covariance matrix τ.…”

Section: Scoring Methods (mentioning)

confidence: 99%
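
A Mahalanobis distance score of the kind described in the statement above can be sketched as below. This is an illustration under stated assumptions rather than the method of the cited papers: the covariance matrix is assumed here to be a within-class (within-speaker) covariance estimated from labelled development i-vectors, the score is taken as the negative squared Mahalanobis distance, and the function names are hypothetical.

```python
import numpy as np


def within_class_covariance(ivectors, speaker_ids):
    """Pooled within-speaker covariance of development i-vectors (assumed estimator)."""
    ivectors = np.asarray(ivectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    dim = ivectors.shape[1]
    cov = np.zeros((dim, dim))
    for spk in np.unique(speaker_ids):
        x = ivectors[speaker_ids == spk]
        centered = x - x.mean(axis=0)  # remove the speaker mean
        cov += centered.T @ centered
    return cov / len(ivectors)


def mahalanobis_score(w_target, w_test, cov):
    """Negative squared Mahalanobis distance; larger means more similar."""
    diff = np.asarray(w_target, dtype=float) - np.asarray(w_test, dtype=float)
    # Solve cov @ y = diff instead of forming the explicit inverse.
    return float(-diff @ np.linalg.solve(cov, diff))
```

With an identity covariance the score reduces to the negative squared Euclidean distance, which is a convenient sanity check. In practice the covariance estimate may need regularisation when the development set is small relative to the i-vector dimension.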

Emotional Variability Analysis Based I-Vector for Speaker Verification in Under-Stress Conditions

2020


“…The development data with short speech segments can be used to train the metric, and subsequently the learned metric can be used for measuring the similarity of i-vectors in a pairwise manner. Previous studies in distance metric method in SV do not focus on short-utterance problem [116, 117]; however, the approach could be explored further to investigate the similarity between i-vectors of small speech segments. Sparse methods: The limited data conditions in SV lead to sparsity in sufficient statistics estimation which is successively used in i-vector estimation [48, 98]. Methods developed to handle the sparsity issue such as dictionary learning [118], sparse representation [119] can be investigated to effectively process and represent the speech data for short utterances.…”

Section: Future Research Directions (mentioning)

confidence: 99%


Speaker verification with short utterances: a review of challenges, trends and opportunities

2017


Automatic speaker verification (ASV) technology now reports a reasonable level of accuracy in its applications in voice-based biometric systems. However, it requires adequate amount of speech data for enrolment and verification; otherwise, the performance becomes considerably degraded. For this reason, the trade-off between the convenience and security is difficult to maintain in practical scenarios. The utterance duration remains a critical issue while deploying a voice biometric system in real-world applications. A large amount of research work has been carried out to address the limited data issue within the scope of SV. The advancements and research activities in mitigating the challenges due to short utterance have seen a significant rise in recent times. In this study, the authors present an extensive survey of SV with short utterances considering the studies from recent past and include latest research offering various solutions and analyses. The review also summarises the major findings of the studies of duration variability problem in ASV systems. Finally, they discuss a number of possible future directions promoting further research in this field.

2 Brief overview of ASV

An ASV system includes three fundamental modules [1, 2]: a feature extraction unit, which transforms the speech signal in a compact form, a statistical modelling unit to characterise the extracted features, and finally a classification module to classify a test speech.

2.1 Feature extraction approaches

The state-of-the-art ASV systems use three major types of feature extraction techniques: sub-segmental, segmental and suprasegmental analyses. Speech signals analysed using the frame size


Weighted X-Vectors for Robust Text-Independent Speaker Verification with Multiple Enrollment Utterances

Mohammadi

2022

Circuits Syst Signal Process
