2021
DOI: 10.1587/transinf.2020edp7036

Joint Analysis of Sound Events and Acoustic Scenes Using Multitask Learning

Abstract: Sound event detection (SED) and acoustic scene classification (ASC) are important research topics in environmental sound analysis. Many research groups have addressed SED and ASC using neural-network-based methods, such as the convolutional neural network (CNN), recurrent neural network (RNN), and convolutional recurrent neural network (CRNN). The conventional methods address SED and ASC separately even though sound events and acoustic scenes are closely related to each other. For example, in the acoustic scen…

Cited by 16 publications (33 citation statements)
References 31 publications (35 reference statements)

“…Most conventional works address ASC and SED separately; however, many acoustic scenes and sound events are related mutually. Considering that knowledge of acoustic scenes and sound events can mutually aid in their estimation, the joint analysis of acoustic scenes and sound events based on multitask learning has been proposed [18,19,23].…”
Section: Joint Analysis Of Acoustic Scenes and Sound Events Based On ... (mentioning)
confidence: 99%
“…Imoto and co-workers proposed ASC methods based on Bayesian generative models, in which information on sound events is considered [16,17]. Bear et al [18], Tonami et al [19], and Jung et al [20] presented methods of jointly analyzing acoustic scenes and sound events based on the multitask learning (MTL) of ASC and SED. These works have revealed that utilizing the relationship between acoustic scenes and sound events improves the performance of each downstream task.…”
Section: Introduction (mentioning)
confidence: 99%
“…We thus manually annotated the audio clips with sound event labels by the procedure described in [24,25]. The resulting audio clips contained 25 types of sound event label [17]. The event label annotations for our experiments are available in [26].…”
Section: Experiments, 4.1 Experimental Conditions (mentioning)
confidence: 99%
“…Furthermore, some studies have revealed that the contexts of scenes (e.g., "home," "office," and "cooking"), which are defined by locations, activities, and time, help increase the accuracy of SED [11][12][13][14][15][16][17][18][19]. For example, Heittola et al [12] have proposed a cascade method for SED using results of acoustic scene classification (ASC).…”
Section: Introduction (mentioning)
confidence: 99%