Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks

The recently proposed VBx diarization method uses a Bayesian hidden Markov model to find speaker clusters in a sequence of x-vectors. In this work, we extensively compare VBx with other diarization approaches in the literature and show that it achieves superior performance on three of the most popular datasets for evaluating diarization: CALLHOME, AMI and DIHARD II. Further, we present for the first time the derivation and update formulae for the VBx model, focusing on the efficiency and simplicity of this model as compared to the previous, more complex BHMM model that works on standard frame-by-frame cepstral features. Together with this publication, we release the recipe for training the x-vector extractors used in our experiments on both wideband and narrowband data, and the VBx recipes that attain state-of-the-art performance on all three datasets. In addition, we point out the lack of a standardized evaluation protocol for the AMI dataset and propose a new protocol for both Beamformed and Mix-Headset audio based on the official AMI partitions and transcriptions.


Enhancement and Analysis of Conversational Speech: JSALT 2017

Ryant, Bergelson, Church et al. 2018


BUT VOiCES 2019 System Description

Zeinali, Matějka, Mošner et al. 2019 (preprint)


This is a description of our effort in the VOiCES 2019 Speaker Recognition challenge. All systems in the fixed condition are based on the x-vector paradigm with different features and DNN topologies. The single best system reaches 1.2% EER and a fusion of 3 systems yields 1.0% EER, which is a 15% relative improvement. The open condition allowed us to use external data, which we used for PLDA adaptation, achieving less than 10% relative improvement. In the submission to the open condition, we used 3 x-vector systems and one i-vector-based system.


Analysis of X-Vectors for Low-Resource Speech Recognition

Karafiát, Veselý, Černocký et al. 2021


The paper presents a study of the usability of x-vectors for adaptation of automatic speech recognition (ASR) systems. X-vectors are neural network (NN)-based speaker embeddings recently proposed in speaker recognition (SR). They quickly replaced the common i-vectors and became the new state-of-the-art technique. Here, the same approach is adopted for ASR in the hope of a similar outcome. All experiments were done on ASR for the latest IARPA MATERIAL evaluation, which targets the Pashto language. Over 1% absolute improvement was observed with x-vectors over traditional i-vectors, even when the x-vector extractor was not trained on target Pashto data.



Ján Profant

Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks
