Nik Vaessen scite author profile

Nik Vaessen

4 Publications

19 Citation Statements Received

88 Citation Statements Given

Affiliations

Radboud University Nijmegen

Publications

Fine-Tuning Wav2Vec2 for Speaker Recognition

Vaessen, Leeuwen, 2022

This paper explores applying the wav2vec2 framework to speaker recognition instead of speech recognition. We study the effectiveness of the pre-trained weights on the speaker recognition task, and how to pool the wav2vec2 output sequence into a fixed-length speaker embedding. To adapt the framework to speaker recognition, we propose a single-utterance classification variant with cross-entropy or additive angular softmax loss, and an utterance-pair classification variant with BCE loss. Our best performing variant achieves a 1.88% EER on the extended VoxCeleb1 test set compared to 1.69% EER with an ECAPA-TDNN baseline. Code is available at github.com/nikvaessen/w2v2-speaker.

Training speaker recognition systems with limited data

Vaessen, Leeuwen, 2022

This work considers training neural networks for speaker recognition with a much smaller dataset size compared to contemporary work. We artificially restrict the amount of data by proposing three subsets of the popular VoxCeleb2 dataset. These subsets are restricted to 50 k audio files (versus over 1 M files available), and vary on the axis of number of speakers and session variability. We train three speaker recognition systems on these subsets: the X-vector, ECAPA-TDNN, and wav2vec2 network architectures. We show that the self-supervised, pre-trained weights of wav2vec2 substantially improve performance when training data is limited. Code and data subsets are available at https://github.com/nikvaessen/w2v2-speaker-few-samples.

Beyond Neural-on-Neural Approaches to Speaker Gender Protection

Bemmel, Liu, Vaessen, et al., 2023

Recent research has proposed approaches that modify speech to defend against gender inference attacks. The goal of these protection algorithms is to control the availability of information about a speaker's gender, a privacy-sensitive attribute. Currently, the common practice for developing and testing gender protection algorithms is "neural-on-neural", i.e., perturbations are generated and tested with a neural network. In this paper, we propose to go beyond this practice to strengthen the study of gender protection. First, we demonstrate the importance of testing gender inference attacks that are based on speech features historically developed by speech scientists, alongside the conventionally used neural classifiers. Next, we argue that researchers should use speech features to gain insight into how protective modifications change the speech signal. Finally, we point out that gender-protection algorithms should be compared with novel "vocal adversaries", human-executed voice adaptations, in order to improve interpretability and enable before-the-mic protection.

Speaker and Language Change Detection using Wav2vec2 and Whisper

Berns, Vaessen, Leeuwen, 2023 (Preprint)

We investigate recent transformer networks pre-trained for automatic speech recognition for their ability to detect speaker and language changes in speech. We do this by simply adding speaker (change) or language targets to the labels. For Wav2vec2 pre-trained networks, we also investigate if the representation for the speaker change symbol can be conditioned to capture speaker identity characteristics. Using a number of constructed data sets we show that these capabilities are definitely there, with speaker recognition equal error rates of the order of 10 % and language detection error rates of a few percent. We will publish the code for reproducibility.
