Proceedings of the Detection and Classification of Acoustic Scenes And Events 2019 Workshop (DCASE2019) 2019
DOI: 10.33682/57xx-t679

Convolutional Recurrent Neural Network and Data Augmentation for Audio Tagging with Noisy Labels and Minimal Supervision

Cited by 12 publications (15 citation statements)
References: 0 publications
“…This is supported by the outcome of the contest-like comparison as well as the significance testing. This result reflects recent developments in the DCASE community, in which virtually all competition systems contain convolutional stages [21], [22], [25], [26], [28], [43]-[46]. However the paralinguistics community still relies primarily on pure recurrent networks [8], [9], [18], [19].…”
Section: Discussion (supporting)
confidence: 64%
“…We chose this representation type because (1) it allows to use neural networks with two-dimensional inputs such as CNNs adapted from computer vision which currently prevail in audio recognition [32], and (2) it is currently the most prevalent spectrogram derivate for audio recognition, i.e. in the DCASE community [22], [24]-[28], [32] as well as paralinguistics [9], [16], [33].…”
Section: B Data Representation (mentioning)
confidence: 99%
“…During training various data augmentation techniques are used, namely random scaling, mixup, frequency warping, blurring, time masking, frequency masking and random noise. Random scaling and mixup [18] are performed on the waveform similar as in [19] by shifting and superposing signals as follows:…”
Section: Feature Extraction and Data Augmentation (mentioning)
confidence: 99%
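
The last citation statement names mixup, time masking and frequency masking, but its formula is cut off in the excerpt. The sketch below therefore does not reproduce the citing paper's method. It only illustrates the generic form of these augmentations, assuming NumPy. The function names (`mixup_waveforms`, `mask_spectrogram`), the Beta parameter `alpha`, and the mask widths are hypothetical choices made for illustration.

```python
import numpy as np


def mixup_waveforms(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Generic mixup: convex combination of two waveforms and their label vectors.

    alpha is an assumed Beta-distribution parameter, not a value from the source.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)           # mixing weight in [0, 1]
    n = min(len(x1), len(x2))              # truncate to the shorter signal
    x = lam * x1[:n] + (1.0 - lam) * x2[:n]
    y = lam * y1 + (1.0 - lam) * y2        # labels are mixed with the same weight
    return x, y, lam


def mask_spectrogram(spec, max_time=20, max_freq=8, rng=None):
    """Zero out one random time band and one random frequency band.

    spec has shape (n_freq, n_time); the mask widths are assumed values.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = spec.copy()                      # leave the input untouched
    n_freq, n_time = out.shape

    t = int(rng.integers(0, min(max_time, n_time) + 1))   # time-mask width
    t0 = int(rng.integers(0, n_time - t + 1))             # time-mask start
    out[:, t0:t0 + t] = 0.0

    f = int(rng.integers(0, min(max_freq, n_freq) + 1))   # frequency-mask width
    f0 = int(rng.integers(0, n_freq - f + 1))             # frequency-mask start
    out[f0:f0 + f, :] = 0.0
    return out
```

In a typical pipeline, mixup of this kind would be applied to pairs of training clips before feature extraction and masking to the resulting spectrograms. Whether the cited system orders or parameterizes the steps this way cannot be determined from the excerpt.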