2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2019
DOI: 10.1109/iros40897.2019.8968183

Audio-visual sensing from a quadcopter: dataset and baselines for source localization and sound enhancement

Abstract: We present an audio-visual dataset recorded outdoors from a quadcopter and discuss baseline results for multiple applications. The dataset includes a scenario for source localization and sound enhancement with up to two static sources, and a scenario for source localization and tracking with a moving sound source. These sensing tasks are made challenging by the strong and time-varying ego-noise generated by the rotating motors and propellers. The dataset was collected using a small circular array with 8 microp…

Cited by 17 publications (20 citation statements)
References 27 publications
“…We use two multi-channel drone-sound datasets, AS [8] and AVQ [51], and a single-channel speech and noise dataset. Table I summarizes the details of the noise data used in this paper.…”
Section: A Experiments Setup (mentioning)
confidence: 99%
“…First, TFS requires the knowledge of the target sound direction and the location of the microphones to estimate the DOA of the sound at each time-frequency bin and calculate the correlation matrix of the target sound. Recently, several sound source localization algorithms were proposed for the drone platform [14], [15], [41]-[45]. Second, TFS is sensitive to the direction of the target sound.…”
Section: A Ego-noise Reduction (mentioning)
confidence: 99%
“…TFS requires the knowledge of the DOA of the target sound, which can be estimated with sound source localization algorithms [15], [44], or with an onboard camera and an object detector [14], [43]. While the proposed method considers one source only, it can be extended to multiple sources as long as their DOAs are known.…”
Section: G Remarks (mentioning)
confidence: 99%