Interspeech 2019
DOI: 10.21437/interspeech.2019-2169

Towards Joint Sound Scene and Polyphonic Sound Event Recognition

Abstract: Acoustic Scene Classification (ASC) and Sound Event Detection (SED) are two separate tasks in the field of computational sound scene analysis. In this work, we present a new dataset with both sound scene and sound event labels and use this to demonstrate a novel method for jointly classifying sound scenes and recognizing sound events. We show that by taking a joint approach, learning is more efficient and whilst improvements are still needed for sound event detection, SED results are robust in a dataset where …

Cited by 30 publications (35 citation statements)
References: 15 publications
“…Conversely, multi-label classification is the prediction of up to N labels for each test sample [10]. This has been used before with joint-task work (see [11]) which strived to undertake ASC jointly with SED…”
[Figure residue from the citing paper, layer listing: 2DConv(32, (7,7)), Batch Norm, MaxPool2D((5,5)), Dropout(0.3), 2DConv(64, (7,7)), Batch Norm, MaxPool2D((5,5)), Dropout(0.3) …]
Section: Methods (mentioning)
Confidence: 99%
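The distinction drawn in the statement above, between predicting one label and predicting up to N labels per sample, can be sketched in a few lines. This is only an illustration under stated assumptions: the label names, the logit values, the 0.5 threshold, and the helper names `predict_multilabel` and `predict_single_label` are all hypothetical and do not come from the cited papers.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_multilabel(logits, labels, threshold=0.5):
    """Multi-label decision: keep every label whose independent
    sigmoid score reaches the threshold, so 0..N labels may be returned."""
    return [lab for lab, z in zip(labels, logits) if sigmoid(z) >= threshold]


def predict_single_label(logits, labels):
    """Single-label decision: always return exactly one label (arg-max)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best]


# Hypothetical label set mixing a scene ("street") with sound events.
LABELS = ["street", "car", "speech", "birdsong"]
logits = [2.0, 1.2, -0.7, -3.0]  # made-up network outputs for one clip

print(predict_multilabel(logits, LABELS))
print(predict_single_label(logits, LABELS))
```

With independent sigmoid scores, a scene label and several event labels can be active for the same clip, which is what makes a multi-label output a plausible fit for joint scene and event recognition.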
“…A natural approach is to train one model to perform ASC and AED in a joint manner [76] as acoustic events are the building blocks of acoustic scenes. Sound events and acoustic scenes naturally follow a hierarchical relationship.…”
Section: Multitask Learning (mentioning)
Confidence: 99%
“…However, some sound events and scenes are closely related, and the knowledge of sound events and scenes can help in estimating them mutually. Considering this idea, joint analysis of sound events and acoustic scenes based on MTL of SED and ASC has been proposed [18,19]. As shown in Fig.…”
Section: Joint Analysis Of Sound Events and Scenes Based On Multitask (mentioning)
Confidence: 99%
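The multitask learning (MTL) idea mentioned in the statements above, training one model on both ASC and SED, is commonly expressed as a weighted sum of a scene loss and an event loss. The sketch below assumes categorical cross-entropy for the scene, mean binary cross-entropy for the events, and a mixing weight `alpha`; these choices and all function names are illustrative assumptions, not the formulation used in the cited works.

```python
import math

EPS = 1e-12  # assumed clamp to keep log() away from 0


def _clamp(p: float) -> float:
    return min(max(p, EPS), 1.0 - EPS)


def scene_loss(scene_probs, true_scene):
    """Categorical cross-entropy for one clip (ASC branch)."""
    return -math.log(_clamp(scene_probs[true_scene]))


def event_loss(event_probs, event_targets):
    """Mean binary cross-entropy over event classes (SED branch)."""
    total = 0.0
    for p, t in zip(event_probs, event_targets):
        p = _clamp(p)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(event_probs)


def joint_loss(scene_probs, true_scene, event_probs, event_targets, alpha=0.5):
    """Weighted sum of the two task losses; alpha is a hypothetical
    hyperparameter trading off scene accuracy against event accuracy."""
    return (alpha * scene_loss(scene_probs, true_scene)
            + (1.0 - alpha) * event_loss(event_probs, event_targets))


# Made-up predictions for one clip: 3 scene classes, 2 event classes.
loss = joint_loss([0.7, 0.2, 0.1], 0, [0.9, 0.1], [1, 0])
print(round(loss, 4))
```

Setting `alpha` to 1.0 reduces this to scene classification alone, and 0.0 to event detection alone, which makes the single-task baselines easy to compare against the joint setting.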