2019
DOI: 10.1109/taslp.2019.2895254

Sound Event Detection and Time–Frequency Segmentation from Weakly Labelled Data

Abstract: Sound event detection (SED) aims to detect when and recognize what sound events happen in an audio clip. Many supervised SED algorithms rely on strongly labelled data which contains the onset and offset annotations of sound events. However, many audio tagging datasets are weakly labelled, that is, only the presence of the sound events is known, without knowing their onset and offset annotations. In this paper, we propose a time-frequency (T-F) segmentation framework trained on weakly labelled data to tackle th…

Citations: cited by 97 publications (91 citation statements)
References: 44 publications
“…Lms128 40 20 128 Fbank64 25 10 64 Regarding the AED features, 128 dimensional logmelspectra [14,15,16,17] were extracted. Here, a single frame is extracted every 20ms with a window size of 40ms (Table 4).…”
Section: Featurename Window Shift Dimension
citation type: mentioning
confidence: 99%
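
The framing described in the statement above (one frame every 20 ms, 40 ms window) can be sketched in plain Python. This is only an illustration: the 16 kHz sample rate and the helper name `frame_signal` are assumptions, since the quoted statement gives only the window and shift, and the mel filterbank step is left out.

```python
def frame_signal(samples, sample_rate, window_ms=40, shift_ms=20):
    """Split a 1-D sequence into overlapping frames (no padding).

    Hypothetical helper: window_ms and shift_ms follow the quoted
    statement; everything else is an illustrative assumption.
    """
    window = int(sample_rate * window_ms / 1000)  # samples per frame
    shift = int(sample_rate * shift_ms / 1000)    # samples between frame starts
    if len(samples) < window:
        return []
    count = 1 + (len(samples) - window) // shift
    return [samples[i * shift : i * shift + window] for i in range(count)]


# One second of dummy audio at an assumed 16 kHz sample rate.
frames = frame_signal(list(range(16000)), sample_rate=16000)
print(len(frames), len(frames[0]))
```

Under these assumptions each frame holds 640 samples and consecutive frames overlap by half, so one second of audio yields 49 frames.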
“…To evaluate the results of audio tagging, we follow the metrics proposed in [17]. The results are evaluated by precision, recall, F-score [19] and Area Under Curve (AUC) [20].…”
Section: Dataset Experiments Setup and Evaluation Metrics
citation type: mentioning
confidence: 99%
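
For readers unfamiliar with the metrics named in the statement above, here is a minimal sketch of precision, recall, F-score and AUC for a single binary tag. The function names, the 0.5 threshold and the toy labels are assumptions for illustration; the exact protocol of [17] (for example, how scores are averaged over classes) is not given in the quoted text.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = event present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy clip-level labels and scores for one tag (made up for illustration).
y_true = [1, 0, 1, 1, 0]
scores = [0.9, 0.6, 0.4, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(precision, recall, f1, auc)
```

Precision, recall and F-score depend on the chosen threshold, whereas AUC is computed from the raw scores and does not.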
“…Humans have an inherent ability to match sound events based on acoustic similarity and the relationship between them [1]. Previous studies mainly focus on sound event detection (SED), investigating which sound events happen in an audio recording and when they occur [2]. In contrast, Sound event retrieval (SER) is retrieving audio recordings that are similar to a given input audio query [3,4].…”
Section: Introduction
citation type: mentioning
confidence: 99%