Sound source localization from a flying drone is a challenging task due to the strong ego-noise from the rotating motors and propellers, as well as the movement of the drone and of the sound sources. To address this challenge, we propose a deep-learning-based framework that integrates single-channel noise reduction with multi-channel source localization. In this framework, a single-channel deep neural network (DNN) estimates a time-frequency soft ratio mask that suppresses the ego-noise. We then design two downstream multi-channel source localization algorithms, based on steered response power (SRP-DNN) and on time-frequency spatial filtering (TFS-DNN). The main novelty lies in the proposed TFS-DNN approach, which estimates the presence probability of the target sound at individual time-frequency bins by combining the DNN-inferred soft ratio mask with the instantaneous direction of arrival (DOA) of the sound received by the microphone array. The resulting time-frequency presence probabilities are then used to design a set of spatial filters that construct a spatial likelihood map for source localization. By jointly exploiting spectral and spatial information, TFS-DNN robustly processes signals in short segments (e.g., 0.5 s) in dynamic and low signal-to-noise-ratio (SNR) scenarios (e.g., SNR of -20 dB). Results on real and simulated data in a variety of scenarios (static sources, moving sources, and moving drones) indicate the advantage of TFS-DNN over competing methods, including SRP-DNN and state-of-the-art time-frequency spatial filtering.
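
To make the TFS-DNN pipeline described above concrete, the following is a minimal numpy sketch of its core idea: weighting each time-frequency bin by the product of the DNN-inferred soft ratio mask and a DOA-consistency term, then accumulating a weighted steered-response power over a grid of candidate directions to form a spatial likelihood map. All function names, the von-Mises-shaped consistency weight, and the 2-D far-field delay-and-sum formulation are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

def tf_presence_probability(soft_mask, inst_doa, target_doa, kappa=5.0):
    """Per-bin target-presence weight from mask and DOA consistency.

    soft_mask  : (T, F) DNN-inferred soft ratio mask in [0, 1]
    inst_doa   : (T, F) instantaneous DOA per time-frequency bin (rad)
    target_doa : candidate source azimuth (rad)
    kappa      : concentration of the assumed von-Mises-shaped weight
    """
    # Assumed consistency weight: 1 when the bin's instantaneous DOA
    # matches the candidate direction, decaying smoothly otherwise.
    doa_weight = np.exp(kappa * (np.cos(inst_doa - target_doa) - 1.0))
    return soft_mask * doa_weight

def spatial_likelihood_map(stft, soft_mask, inst_doa, candidate_doas,
                           mic_pos, freqs, c=343.0):
    """Presence-weighted steered-response power over candidate azimuths.

    stft           : (M, T, F) multichannel STFT
    candidate_doas : (K,) grid of candidate azimuths (rad)
    mic_pos        : (M, 2) microphone coordinates (m), planar array
    freqs          : (F,) STFT bin frequencies (Hz)
    """
    likelihood = np.zeros(len(candidate_doas))
    for k, theta in enumerate(candidate_doas):
        # Per-bin weight combining the mask and DOA consistency.
        p = tf_presence_probability(soft_mask, inst_doa, theta)
        # Far-field plane-wave delays for this candidate direction.
        u = np.array([np.cos(theta), np.sin(theta)])
        tau = mic_pos @ u / c                                   # (M,)
        steer = np.exp(-2j * np.pi * freqs[None, :] * tau[:, None])  # (M, F)
        # Delay-and-sum across microphones, then weighted power sum.
        beamformed = np.einsum('mtf,mf->tf', stft, np.conj(steer))
        likelihood[k] = np.sum(p * np.abs(beamformed) ** 2)
    return likelihood

# Toy usage with random data (4 mics, 50 frames, 257 bins).
rng = np.random.default_rng(0)
M, T, F = 4, 50, 257
stft = rng.standard_normal((M, T, F)) + 1j * rng.standard_normal((M, T, F))
mask = rng.uniform(size=(T, F))
doas = rng.uniform(-np.pi, np.pi, size=(T, F))
mic_pos = rng.uniform(-0.1, 0.1, size=(M, 2))
freqs = np.linspace(0, 8000, F)
grid = np.linspace(-np.pi, np.pi, 72, endpoint=False)
lmap = spatial_likelihood_map(stft, mask, doas, grid, mic_pos, freqs)
print("estimated azimuth (rad):", grid[np.argmax(lmap)])
```

In this sketch the presence weights play the role the abstract assigns to the time-frequency presence probability: bins dominated by ego-noise (low mask value) or inconsistent with the candidate direction contribute little to that direction's likelihood, which is what makes short-segment, low-SNR operation plausible.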