Tyler Vuong scite author profile

Tyler Vuong

4 Publications

5 Citation Statements Received

93 Citation Statements Given

Affiliations

Carnegie Mellon University

Publications

Learnable Spectro-Temporal Receptive Fields for Robust Voice Type Discrimination

Vuong, Xia, Stern, 2020

Voice Type Discrimination (VTD) refers to discrimination between regions in a recording where speech was produced by speakers that are physically within proximity of the recording device ("Live Speech") from speech and other types of audio that were played back such as traffic noise and television broadcasts ("Distractor Audio"). In this work, we propose a deep-learning-based VTD system that features an initial layer of learnable spectro-temporal receptive fields (STRFs). Our approach is also shown to provide very strong performance on a similar spoofing detection task in the ASVspoof 2019 challenge. We evaluate our approach on a new standardized VTD database that was collected to support research in this area. In particular, we study the effect of using learnable STRFs compared to static STRFs or unconstrained kernels. We also show that our system consistently improves a competitive baseline system across a wide range of signal-to-noise ratios on spoofing detection in the presence of VTD distractor noise.

A Modulation-Domain Loss for Neural-Network-Based Real-Time Speech Enhancement

Vuong, Xia, Stern, 2021

We describe a modulation-domain loss function for deep-learning-based speech enhancement systems. Learnable spectro-temporal receptive fields (STRFs) were adapted to optimize for a speaker identification task. The learned STRFs were then used to calculate a weighted mean-squared error (MSE) in the modulation domain for training a speech enhancement system. Experiments showed that adding the modulation-domain MSE to the MSE in the spectro-temporal domain substantially improved the objective prediction of speech quality and intelligibility for real-time speech enhancement systems without incurring additional computation during inference.
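The combined loss described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The Gabor-like kernels stand in for the learned STRFs, and the function names (`gabor_strf`, `modulation_features`, `combined_loss`), the per-kernel `weights`, and the mixing factor `lam` are all hypothetical choices made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gabor_strf(rate, scale, size=(9, 9)):
    """Hypothetical stand-in for a learned STRF: a 2D Gabor-like kernel.

    rate  -- temporal modulation frequency (cycles per frame)
    scale -- spectral modulation frequency (cycles per frequency bin)
    """
    n_freq, n_time = size
    f = np.arange(n_freq) - n_freq // 2
    t = np.arange(n_time) - n_time // 2
    ff, tt = np.meshgrid(f, t, indexing="ij")
    envelope = np.exp(
        -(ff**2) / (2 * (n_freq / 4) ** 2) - (tt**2) / (2 * (n_time / 4) ** 2)
    )
    carrier = np.cos(2 * np.pi * (scale * ff + rate * tt))
    return envelope * carrier


def modulation_features(spec, kernels):
    """Correlate a (freq, time) spectrogram with each kernel (valid region only)."""
    windows = sliding_window_view(spec, kernels[0].shape)  # (F', T', kf, kt)
    return np.stack([np.einsum("ftij,ij->ft", windows, k) for k in kernels])


def combined_loss(enhanced, clean, kernels, weights, lam=1.0):
    """Spectro-temporal MSE plus a weighted MSE in the modulation domain."""
    spec_mse = np.mean((enhanced - clean) ** 2)
    diff = modulation_features(enhanced, kernels) - modulation_features(clean, kernels)
    per_kernel_mse = np.mean(diff**2, axis=(1, 2))  # one MSE per kernel
    mod_mse = np.sum(weights * per_kernel_mse) / np.sum(weights)
    return spec_mse + lam * mod_mse


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.random((40, 100))  # toy (freq, time) spectrogram
    noisy = clean + 0.1 * rng.standard_normal(clean.shape)
    kernels = [gabor_strf(r, s) for r in (0.05, 0.1) for s in (0.05, 0.1)]
    weights = np.ones(len(kernels))  # assumed uniform weighting
    print(combined_loss(noisy, clean, kernels, weights))
```

Because the kernels appear only inside the loss, a term of this kind affects training alone, which is consistent with the abstract's statement that no additional computation is incurred during inference.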

Generalized Spoofing Detection Inspired from Audio Generation Artifacts

Gao, Vuong, Elyasi, et al., 2021

A Modulation-Domain Loss for Neural-Network-based Real-time Speech Enhancement

Vuong, Xia, Stern, 2021 (preprint)


