ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp39728.2021.9414774

What’s all the Fuss about Free Universal Sound Separation Data?

Abstract: We introduce the Free Universal Sound Separation (FUSS) dataset, a new corpus for experiments in separating mixtures of an unknown number of sounds from an open domain of sound types. The dataset consists of 23 hours of single-source audio data drawn from 357 classes, which are used to create mixtures of one to four sources. To simulate reverberation, an acoustic room simulator is used to generate impulse responses of box-shaped rooms with frequency-dependent reflective walls. Additional open-source data augmentation…
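The mixture-creation process the abstract describes (draw one to four single-source clips, reverberate each, and sum) can be sketched as below. This is a hedged illustration only, not the FUSS pipeline: `toy_rir` and `make_mixture` are hypothetical names, and the exponentially decaying noise stands in for the paper's image-method room simulator with frequency-dependent walls.

```python
import numpy as np

rng = np.random.default_rng(0)
SR = 16000  # assumed sample rate for this sketch

def toy_rir(length=4000, decay=4.0):
    """Stand-in for a room impulse response: exponentially decaying noise.
    FUSS itself simulates box-shaped rooms with frequency-dependent
    reflective walls; this placeholder only mimics the decay envelope."""
    t = np.linspace(0.0, 1.0, length)
    return rng.standard_normal(length) * np.exp(-decay * t)

def make_mixture(sources):
    """Convolve each dry source with its own RIR and sum into one mixture."""
    n = max(len(s) for s in sources)
    reverberant = []
    for s in sources:
        wet = np.convolve(s, toy_rir())[:n]          # truncate convolution tail
        wet = np.pad(wet, (0, n - len(wet)))         # align lengths
        reverberant.append(wet)
    return np.sum(reverberant, axis=0), reverberant

# FUSS mixtures contain one to four sources; draw a count at random.
k = int(rng.integers(1, 5))
sources = [rng.standard_normal(SR * 10) for _ in range(k)]  # 10-second clips
mix, refs = make_mixture(sources)
```

The reverberant per-source signals (`refs`) are what a supervised separation model would use as training references, while `mix` is its input.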

Cited by 44 publications (41 citation statements)
References 22 publications
“…We train separation networks using the same architecture as previous works [6,8,9,10], which separates sources by masking in a learned transform domain. The network is composed of a learnable encoder/decoder with 2.5 ms window and 1.25 ms hop, combined with a time-domain convolutional network (TDCN++).…”
Section: Methods (mentioning, confidence: 99%)
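The quoted architecture, masking in a learned transform domain with a 2.5 ms window and 1.25 ms hop, can be sketched as follows. This is a minimal illustration under stated assumptions: the encoder/decoder matrices are random stand-ins for bases that would be trained end-to-end, the basis size of 256 is assumed, and a random sigmoid mask replaces the TDCN++ network that would actually predict per-source masks.

```python
import numpy as np

rng = np.random.default_rng(0)
SR = 16000
WIN = int(0.0025 * SR)   # 2.5 ms window -> 40 samples at 16 kHz
HOP = int(0.00125 * SR)  # 1.25 ms hop  -> 20 samples
N_BASIS = 256            # size of the learned transform (assumed)

# Stand-ins for *trained* encoder/decoder bases; in the cited systems
# these are learned jointly with the separation network.
encoder = rng.standard_normal((WIN, N_BASIS)) / np.sqrt(WIN)
decoder = rng.standard_normal((N_BASIS, WIN)) / np.sqrt(N_BASIS)

def frame(x):
    """Slice a signal into overlapping frames of WIN samples, HOP apart."""
    n_frames = 1 + (len(x) - WIN) // HOP
    idx = np.arange(WIN)[None, :] + HOP * np.arange(n_frames)[:, None]
    return x[idx]  # shape (n_frames, WIN)

def overlap_add(frames, n_samples):
    """Reconstruct a signal by summing overlapping decoded frames."""
    out = np.zeros(n_samples)
    for i, f in enumerate(frames):
        out[i * HOP : i * HOP + WIN] += f
    return out

x = rng.standard_normal(SR)             # 1 s of audio
feats = frame(x) @ encoder              # encode: (n_frames, N_BASIS)
# A TDCN++ would predict one mask per output source from `feats`;
# a random sigmoid-valued mask stands in for the network here.
mask = 1.0 / (1.0 + np.exp(-rng.standard_normal(feats.shape)))
sep = overlap_add((feats * mask) @ decoder, len(x))  # masked decode
```

The design point the quote makes is that the transform is learned rather than fixed (as an STFT would be), so the encoder, mask network, and decoder can be optimized jointly for separation.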
“…Separation performance is evaluated using several supervised synthetic datasets containing reference sources. Since we train our separation models on a universal sound separation task, we primarily focus on evaluation with the FUSS dataset [9], which contains 10-second mixtures of one to four arbitrary sound sources drawn from […]. Since many practical separation applications focus on speech signals, we additionally evaluate how well unsupervised universal separation models can generalize to two specific task domains: speech separation, using mixtures of two overlapping speakers from Libri2Mix [22], and speech enhancement, using the same dataset as previous work [8] in which speech is drawn from librivox.org, and non-speech from freesound.org.…”
Section: Methods (mentioning, confidence: 99%)
“…These include AEs such as human sounds, object sounds, musical instruments, etc. [10]. We generated two test sets using FSD-Kaggle and a subset of the FSD50K data consisting of the AE sound samples from a single AE class provided in the FUSS dataset [24]. We used FSD50K to generate data with new AE classes unseen in the FSD-Kaggle training set.…”
Section: Dataset (mentioning, confidence: 99%)