2021
DOI: 10.1109/tetci.2020.3014934

Deep Learning Assisted Time-Frequency Processing for Speech Enhancement on Drones

Abstract: This article fills the gap between the growing interest in signal processing based on Deep Neural Networks (DNN) and the new application of enhancing speech captured by microphones on a drone. In this context, the quality of the target sound is degraded significantly by the strong ego-noise from the rotating motors and propellers. We present the first work that integrates single-channel and multi-channel DNN-based approaches for speech enhancement on drones. We employ a DNN to estimate the ideal ratio masks at…
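The abstract refers to ideal ratio masks estimated by a DNN. As a rough illustration of what such a mask is, the sketch below computes an ideal ratio mask for each time-frequency bin from known speech and noise magnitudes, using the commonly cited form (speech power divided by speech-plus-noise power, raised to an exponent). This is an assumption for illustration only: the abstract is truncated, so the paper's exact mask definition, features, and network are not shown here, and the names `ideal_ratio_mask`, `apply_mask`, `beta`, and `eps` are hypothetical.

```python
def ideal_ratio_mask(speech_mag, noise_mag, beta=0.5, eps=1e-12):
    """Compute an ideal ratio mask (IRM) per time-frequency bin.

    speech_mag, noise_mag: lists of frames, each a list of spectral
    magnitudes (one per frequency bin). Both must have the same shape.
    beta=0.5 is a common choice, assumed here, not taken from the paper.
    eps avoids division by zero in bins where both signals are silent.
    """
    mask = []
    for s_frame, n_frame in zip(speech_mag, noise_mag):
        row = []
        for s, n in zip(s_frame, n_frame):
            s_pow, n_pow = s * s, n * n
            row.append((s_pow / (s_pow + n_pow + eps)) ** beta)
        mask.append(row)
    return mask


def apply_mask(noisy_mag, mask):
    """Scale each noisy magnitude by its mask value (element-wise)."""
    return [
        [m * x for m, x in zip(m_row, x_row)]
        for m_row, x_row in zip(mask, noisy_mag)
    ]


# Toy example: 2 frames x 2 frequency bins (values are made up).
speech = [[3.0, 0.0], [1.0, 2.0]]
noise = [[4.0, 1.0], [1.0, 0.0]]
irm = ideal_ratio_mask(speech, noise)
```

In a supervised setting the mask is computed from clean speech and noise during training, and a network learns to predict it from the noisy input alone; at test time the predicted mask is applied to the noisy spectrogram as in `apply_mask`.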

Cited by 27 publications (28 citation statements)
References 54 publications
“…Wiener filter [15]-[17]; Kalman filter [18], [19]; Neural network [20]-[22]; Evolutionary algorithm [23], [24]; Non-adaptive Low-pass/high-pass/pass-band filter [25]; Wavelet transform [26]-[28]; Beam-forming [29]; Singular value decomposition Independent component analysis [30]; Principal component analysis [31], [32]; Singular spectrum analysis [33]-[35]; Other [36]-[38]. Moreover, the ego-noise is highly non-stationary, as it typically depends on the characteristics of the movements being performed, e.g., speeds and accelerations. The noise produced by a drone has three main components, namely, the mechanical noise generated by the rotation of the motors, the noise generated by the propellers cutting through the air, and the noise of the airflow generated by the propellers themselves.…”
Section: Adaptive
mentioning
confidence: 99%
“…Neural network-based approaches can achieve state-of-the-art results in application domains with great amounts of data. RNNoise [20], put forward by Zhang et al. [21], and the Wang et al. method [22] are both great alternatives in the voice denoising field. However, they are not as performant in domains with much less data.…”
Section: B Software-based Noise Reduction
mentioning
confidence: 99%
“…Additionally, the recent LA work of [76] does not aim to localize either, but to enhance the signal being captured from the ground via a technique based on deep learning. This work can have important implications for further localization or detection from a UAV.…”
Section: Auditory Perception From a UAV (Land To Air)
mentioning
confidence: 99%
“…Mavic Pro [74]; Quad [75]; Bebop 2 [16]; Matrice 100 [16]; Quad [76]; Quad [77] *; Phantom I [78]; Quad [79]; Quad [80]; Hexa [81]. Overall, there are some interesting tendencies that are worth pointing out at this moment. In Figure 5, we present a distribution of works revised in this review across recent years and differentiated by their category, i.e., AL, LA, and AA.…”
mentioning
confidence: 98%
“…This problem can be addressed using classical signal processing noise reduction algorithms, including frequency-spatial filtering techniques, effective in blind source separation problems [16]. More recently, methods based on Deep Neural Networks (DNN) [17] are also being used to enhance speech signals captured using drones. An example is the work presented in [17] that integrates single- and multi-channel DNN-based approaches for the enhancement of speech signals captured from drones.…”
Section: Introduction
mentioning
confidence: 99%