Long-term pulmonary sequelae of the Wilson-Mikity syndrome

Speech enhancement for drone audition is made challenging by the strong ego-noise from the rotating motors and propellers, which leads to extremely low signal-to-noise ratios (e.g. SNR < -15 dB) at onboard microphones. In this paper, we extensively assess the ability of single-channel deep learning approaches to reduce ego-noise on drones. We train twelve representative deep neural network (DNN) models, covering three operation domains (time-frequency magnitude domain, time-frequency complex domain and end-to-end time domain) and three distinct architectures (sequential, encoder-decoder and generative). We critically discuss and compare the performance of these models in extremely low-SNR scenarios, ranging from -30 to 0 dB. We show that time-frequency complex domain and UNet encoder-decoder architectures outperform other approaches on speech enhancement measures while providing a good trade-off with other criteria, such as model size, computational complexity and context length. Specifically, the best-performing model is DCUNet, a UNet model operating in the time-frequency complex domain, which, at an input SNR of -15 dB, improves ESTOI from 0.1 to 0.4, PESQ from 1.0 to 1.9 and SI-SDR from -15 dB to 3.7 dB. Based on the insights drawn from these findings, we discuss future research in drone ego-noise reduction.
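
To make the SNR and SI-SDR figures quoted above more concrete, here is a minimal NumPy sketch of how a noisy mixture at a chosen input SNR might be built and how SI-SDR could be computed against the clean reference. This is not the authors' evaluation code: the helper names `mix_at_snr` and `si_sdr`, the 16 kHz sample rate, and the sine and white-noise stand-ins for speech and ego-noise are all assumptions made for illustration.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    scaled_noise = noise * np.sqrt(target_noise_power / noise_power)
    return speech + scaled_noise


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB (hypothetical helper)."""
    # Remove the mean so a constant offset does not count as distortion.
    estimate = estimate - np.mean(estimate)
    reference = reference - np.mean(reference)
    # Project the estimate onto the reference to find the optimally scaled target.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    residual = estimate - target
    return 10.0 * np.log10(
        (np.dot(target, target) + eps) / (np.dot(residual, residual) + eps)
    )


# Synthetic stand-ins (assumed): a 220 Hz tone for speech, white noise for ego-noise.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
speech = np.sin(2.0 * np.pi * 220.0 * t)
noise = np.random.default_rng(0).standard_normal(sample_rate)

noisy = mix_at_snr(speech, noise, snr_db=-15.0)
input_si_sdr = si_sdr(noisy, speech)
print(f"Input SI-SDR: {input_si_sdr:.1f} dB")
```

For an unprocessed mixture the SI-SDR lands close to the input SNR, which is why the abstract reports an SI-SDR of about -15 dB before enhancement; an enhancement model's output would be passed to `si_sdr` in place of `noisy` to measure the improvement.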



Dmitrii Mukhutdinov

Deep Learning Models for Single-Channel Speech Enhancement on Drones
