Serum- and saliva-based testing methods have been crucial to slowing the COVID-19 pandemic, yet they are limited by slow throughput and high cost. A system able to determine COVID-19 status from cough sounds alone would provide a low-cost, rapid, and remote alternative to current testing methods. We explore the applicability of recent techniques such as pre-training and spectral augmentation to improving the performance of a neural cough classification system. We use Autoregressive Predictive Coding (APC) to pre-train a unidirectional LSTM on the COUGHVID dataset. We then produce our final model by fine-tuning added BLSTM layers on the DiCOVA challenge dataset. We perform ablation studies to measure how each component affects performance and improves generalization on a small dataset. Our final system achieves an AUC of 85.35 and places third out of 29 entries in the DiCOVA challenge.
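The APC pre-training step described above can be illustrated with a minimal sketch: a unidirectional LSTM reads a log-mel spectrogram and is trained to predict the frame a few steps ahead, so the learned representation must encode future-relevant acoustic structure. All layer sizes, the prediction shift, and the L1 reconstruction loss below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class APCPretrainer(nn.Module):
    """Unidirectional LSTM trained to predict a spectrogram frame
    `shift` steps into the future (Autoregressive Predictive Coding)."""
    def __init__(self, n_mels=80, hidden=512, shift=3):
        super().__init__()
        self.shift = shift
        self.lstm = nn.LSTM(n_mels, hidden, num_layers=3, batch_first=True)
        self.proj = nn.Linear(hidden, n_mels)

    def forward(self, x):
        # x: (batch, time, n_mels) log-mel spectrogram
        h, _ = self.lstm(x)
        pred = self.proj(h)
        # prediction at step t targets the frame at t + shift
        return pred[:, :-self.shift], x[:, self.shift:]

model = APCPretrainer()
x = torch.randn(4, 100, 80)                # toy batch of spectrograms
pred, target = model(x)
loss = nn.functional.l1_loss(pred, target) # reconstruction loss on future frames
loss.backward()
```

After pre-training, the LSTM weights would be kept and task-specific layers (here, BLSTMs and a classifier head) fine-tuned on the labeled cough data.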
The COVID-19 pandemic has fueled exponential growth in the adoption of remote delivery of primary, specialty, and urgent health care services. One major challenge is the lack of access to a physical exam, including accurate and inexpensive measurement of vital signs in remote settings. Here we present a novel method for machine-learning-based estimation of patient respiratory rate from audio. Non-learning methods exist, but their accuracy is limited, and the machine-learning work known to us is either not directly applicable or relies on non-public datasets. We are aware of only one publicly available dataset, which is small and which we use to evaluate our algorithm. To avoid overfitting, we expand its effective size with a new data augmentation method. Our algorithm operates on the spectrogram representation and requires labels for breathing cycles, which are used to train a recurrent neural network to recognize the cycles. Our augmentation method exploits the independence of the most periodic frequency components of the spectrogram and permutes their order to create multiple representations of each signal. Our experiments show that our method almost halves the errors obtained by the existing (non-learning) methods. Clinical relevance: We achieve a Mean Absolute Error (MAE) of 1.0 for the respiratory rate while relying only on an audio signal of a patient breathing. This signal can be collected with a smartphone, so physicians can automatically and reliably determine respiratory rate in a remote setting.
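The augmentation idea above can be sketched as follows: score each frequency bin of a (freq, time) spectrogram for periodicity, then generate new training examples by permuting the order of the most periodic bins. The periodicity score (normalized autocorrelation peak), the number of bins `k`, and the function names are all hypothetical choices for illustration, not the paper's exact procedure.

```python
import numpy as np

def periodicity_score(row):
    """Crude periodicity proxy: largest normalized autocorrelation
    value at a nonzero lag (hypothetical scoring choice)."""
    row = row - row.mean()
    ac = np.correlate(row, row, mode="full")[len(row) - 1:]
    if ac[0] == 0:
        return 0.0
    return ac[1:].max() / ac[0]

def permute_periodic_bins(spec, k=8, n_views=4, seed=0):
    """Create augmented copies of a (freq, time) spectrogram by
    permuting the order of its k most periodic frequency bins."""
    rng = np.random.default_rng(seed)
    scores = np.array([periodicity_score(r) for r in spec])
    top = np.argsort(scores)[-k:]       # indices of the most periodic bins
    views = []
    for _ in range(n_views):
        perm = rng.permutation(top)
        aug = spec.copy()
        aug[np.sort(top)] = spec[perm]  # reorder those bins among themselves
        views.append(aug)
    return views

spec = np.random.default_rng(1).random((16, 50))  # toy spectrogram
views = permute_periodic_bins(spec)
```

Each view contains exactly the same frequency rows as the original, only reordered, which is why the augmentation preserves the signal content while multiplying the effective dataset size.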
Recent years have seen a surge in the popularity of acoustics-enabled personal devices powered by machine learning. Yet machine learning has proven vulnerable to adversarial examples. A large number of modern systems protect themselves against such attacks by targeting their artificiality, i.e., they deploy mechanisms to detect the lack of human involvement in generating the adversarial examples. However, these defenses implicitly assume that humans are incapable of producing meaningful and targeted adversarial examples. In this paper, we show that this base assumption is wrong. In particular, we demonstrate that for tasks like speaker identification, a human is capable of producing analog adversarial examples directly, with little cost and supervision: by simply speaking through a tube, an adversary reliably impersonates other speakers in the eyes of ML models for speaker identification. Our findings extend to a range of other acoustic-biometric tasks, such as liveness detection, bringing into question their use in security-critical real-life settings such as phone banking.