Mufan Sang scite author profile

Mufan Sang

5Publications

19Citation Statements Received

53Citation Statements Given

How they've been cited

How they cite others

101

Affiliations

The University of Texas at Dallas, The University of Texas at Austin

Publications

Order By: Most citations

Open-Set Short Utterance Forensic Speaker Verification Using Teacher-Student Network with Explicit Inductive Bias

Sang

Xia

Hansen

2020

View full text Add to dashboard Cite

In forensic applications, it is very common that only small naturalistic datasets consisting of short utterances in complex or unknown acoustic environments are available. In this study, we propose a pipeline solution to improve speaker verification on a small actual forensic field dataset. By leveraging large-scale out-of-domain datasets, a knowledge distillation based objective function is proposed for teacher-student learning, which is applied for short utterance forensic speaker verification. The objective function collectively considers speaker classification loss, Kullback-Leibler divergence, and similarity of embeddings. In order to advance the trained deep speaker embedding network to be robust for a small target dataset, we introduce a novel strategy to fine-tune the pre-trained student model towards a forensic target domain by utilizing the model as a finetuning start point and a reference in regularization. The proposed approaches are evaluated on the 1 st 48-UTD forensic corpus, a newly established naturalistic dataset of actual homicide investigations consisting of short utterances recorded in uncontrolled conditions. We show that the proposed objective function can efficiently improve the performance of teacher-student learning on short utterances and that our fine-tuning strategy outperforms the commonly used weight decay method by providing an explicit inductive bias towards the pre-trained model.

show abstract

Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization

Sang

Liu

et al. 2022

View full text Add to dashboard Cite

DEAAN: Disentangled Embedding and Adversarial Adaptation Network for Robust Speaker Representation Learning

Sang¹,

Xia²,

Hansen³

2021

View full text Add to dashboard Cite

Despite speaker verification has achieved significant performance improvement with the development of deep neural networks, domain mismatch is still a challenging problem in this field. In this study, we propose a novel framework to disentangle speaker-related and domain-specific features and apply domain adaptation on the speaker-related feature space solely. Instead of performing domain adaptation directly on the feature space where domain information is not removed, using disentanglement can efficiently boost adaptation performance. To be specific, our model's input speech from the source and target domains is first encoded into different latent feature spaces. The adversarial domain adaptation is conducted on the shared speaker-related feature space to encourage the property of domain-invariance. Further, we minimize the mutual information between speaker-related and domain-specific features for both domains to enforce the disentanglement. Experimental results on the VOiCES dataset demonstrate that our proposed framework can effectively generate more speaker-discriminative and domain-invariant speaker representations with a relative 20.3% reduction of EER compared to the original ResNet-based system.

show abstract

Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization

Sang¹,

Li²,

Liu³

et al. 2021

Preprint

View full text Add to dashboard Cite

Improving Transformer-Based Networks with Locality for Automatic Speaker Verification

Sang

Zhao

Liu

et al. 2023

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Mufan Sang

Open-Set Short Utterance Forensic Speaker Verification Using Teacher-Student Network with Explicit Inductive Bias

Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization

DEAAN: Disentangled Embedding and Adversarial Adaptation Network for Robust Speaker Representation Learning

Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization

Improving Transformer-Based Networks with Locality for Automatic Speaker Verification

Contact Info

Product

Resources

About