In authentication scenarios, practical speaker verification systems usually require a person to read a dynamic authentication text. Previous studies played an audio adversarial example as a digital signal to perform physical attacks, which is easily rejected by audio replay detection modules. This work shows that by playing our crafted adversarial perturbation from a separate source while the adversary is speaking, a practical speaker verification system will misjudge the adversary as a target speaker. A two-step algorithm is proposed to optimize the universal adversarial perturbation so that it is text-independent and has little effect on authentication text recognition. We also estimate the room impulse response (RIR) in the algorithm, which allows the perturbation to remain effective after being played over the air. In physical experiments, we achieved targeted attacks with a success rate of 100%, while the word error rate (WER) of speech recognition increased by only 3.55%. The recorded audio could also pass replay detection, since a live person was speaking.
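The over-the-air setup described above (the adversary's speech mixed with a perturbation shaped by an estimated room impulse response) can be sketched as below. This is a minimal illustration, not the paper's actual optimization: the signal lengths, amplitudes and the 4-tap RIR are invented toy values, and only the playback path of the perturbation is simulated.

```python
import numpy as np

def apply_rir(signal, rir):
    """Simulate over-the-air playback by convolving with a room impulse response."""
    return np.convolve(signal, rir)[: len(signal)]

def mix_at_source(speech, perturbation, rir):
    """The adversary speaks while the perturbation plays from a separate source;
    only the perturbation's acoustic path is shaped by the estimated RIR here
    (a simplification of the physical setup)."""
    return speech + apply_rir(perturbation, rir)

# Toy signals (hypothetical values, 1 s at 16 kHz).
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)
perturbation = 0.01 * rng.standard_normal(16000)  # low-amplitude universal perturbation
rir = np.array([1.0, 0.6, 0.3, 0.1])              # toy 4-tap impulse response

mixed = mix_at_source(speech, perturbation, rir)
```

In the paper's algorithm, the perturbation itself would be optimized through this RIR-convolved path so that the mixed signal fools the verification system; here the mixing step alone is shown.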
Purpose
Given the large number of candidate content features, this paper investigates which content features of video and text ads contribute most to accurately predicting crowdfunding success by comparing prediction models.
Design/methodology/approach
Using 1,368 features extracted from 15,195 Kickstarter campaigns in the USA, the authors compare base models such as logistic regression (LR) with tree-based homogeneous ensembles such as eXtreme gradient boosting (XGBoost) and heterogeneous ensembles such as XGBoost + LR.
Findings
XGBoost shows higher prediction accuracy than LR (82% vs 69%), in contrast to the findings of a previous relevant study. Among important content features, humans (e.g. founders) matter more than visual objects (e.g. products). In both spoken and written language, words related to experience (e.g. eat) or perception (e.g. hear) are more important than cognitive words (e.g. causation). In addition, a focus on the future is more important than a present or past time orientation. Speech aids ("see" and "compare") that complement visual content are also effective, and positive tone matters in speech.
Research limitations/implications
This research makes theoretical contributions by identifying the more important visual (human) and language (experience, perception and future time orientation) features. In a multimodal context, complementary cues (e.g. speech aids) across different modalities help. Furthermore, noncontent aspects of speech, such as positive tone or pace, are important.
Practical implications
Founders are encouraged to assess and revise the content of their video or text ads, as well as their basic campaign features (e.g. goal, duration and reward), before launching their campaigns. Overly complex ensembles may suffer from overfitting; in practice, model validation on unseen data is recommended.
Originality/value
Rather than reducing the number of content feature dimensions (Kaminski and Hopp, 2020), enabling advanced prediction models to accommodate many content features raises prediction accuracy substantially.
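The LR-versus-tree-ensemble comparison described above can be illustrated on synthetic data. This is only a sketch: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, and the dataset shape is an arbitrary assumption rather than the paper's campaign data, so the resulting accuracies will not reproduce the reported 82% vs 69%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a campaign feature matrix (hypothetical sizes).
X, y = make_classification(
    n_samples=2000, n_features=50, n_informative=10, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)

# Linear baseline vs a gradient-boosted tree ensemble (XGBoost stand-in).
lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
gb = GradientBoostingClassifier(random_state=42).fit(X_tr, y_tr)

acc_lr = accuracy_score(y_te, lr.predict(X_te))
acc_gb = accuracy_score(y_te, gb.predict(X_te))
```

Held-out evaluation via `train_test_split` mirrors the abstract's practical recommendation to validate models on unseen data rather than trusting in-sample fit.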