2021
DOI: 10.48550/arxiv.2104.05499
Preprint

L3DAS21 Challenge: Machine Learning for 3D Audio Signal Processing

Abstract: The L3DAS21 Challenge aims to encourage and foster collaborative research on machine learning for 3D audio signal processing, with a particular focus on 3D speech enhancement (SE) and 3D sound localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour 3D audio corpus, accompanied by a Python API that facilitates data usage and the results submission stage. Usually, machine learning approaches to 3D audio tasks are based on single-perspective Ambisonic…

Cited by 2 publications (4 citation statements)
References 21 publications (30 reference statements)
“…This dataset has been used in a number of works to validate the effectiveness of a proposed method on "real-life" recordings, e.g., [63], [69], [72], [111], [129], [211]. Very recently, a SELD challenge focused on 3D sound has been announced [244], where a pair of FOA microphones was used to capture a large number of RIRs in an office room, from which the audio data has been generated.…”
Section: B Real Data (mentioning)
confidence: 99%
“…More recently, approaches around 3D SE are gaining interest with the signal processing community (GUIZZO; GRAMACCIONI, et al, 2021; MARINONI, et al, 2022). This scenario is more complex because both noises and reverberation must be handled.…”
Section: Review On Speech Enhancement Methods (mentioning)
confidence: 99%
“…The Filter and Sum Network (FaSNet) (LUO et al, 2019) is the proposed baseline (GUIZZO; GRAMACCIONI, et al, 2021) for the challenge. The FaSNet is a time-domain neural beamforming with high-performance for low-latency scenarios.…”
Section: IEEE MLSP 2021 Data Challenge (mentioning)
confidence: 99%