2019
DOI: 10.1121/1.5111059

General audio tagging with ensembling convolutional neural networks and statistical features

Abstract: Audio tagging aims to infer descriptive labels from audio clips. Audio tagging is challenging due to the limited size of data and noisy labels. In this paper, we describe our solution for the DCASE 2018 Task 2 general audio tagging challenge. The contributions of our solution include: We investigated a variety of convolutional neural network architectures to solve the audio tagging task. Statistical features are applied to capture statistical patterns of audio features to improve the classification performance…

Citations: cited by 23 publications (11 citation statements)
References: 21 publications
“…Yan Xiong Li et al [22] used the BLSTM Network on Acoustic Scenes to get a better result. Kele Xu et al [23] purpose a novel ensemble-learning system consists of CNN that gets a superior classification performance.…”
Section: Related Work (mentioning)
confidence: 99%
“…Most of the author [20,21,23] used CNN that performs well on image classification also RNN [24] performs well on a series of data. Image data gives the data as a 2D array that consists of the pixel value.…”
Section: Related Work (mentioning)
confidence: 99%
“…Subsequent use of this level of anxiety to adapt to the scenario of VR exposure therapy can be facilitated by classifying it into several intensity categories according to an evaluation scale. Among the algorithms that obtained a high classification accuracy are the classical machine learning methods and the convolutional neural networks (CNNs) [90,91].…”
Section: Classification (mentioning)
confidence: 99%
“…After the feature extraction component, the relationship between extracted features and the popularity score can be built using many off-the-shelf regression models, which including Support Vector Machine (SVM), Random Forest [19] and Gradient boosting trees [6,43]. In our framework, we employ deep neural network as the regression model, by using the features extracted from different modalities.…”
Section: Neural Network-based Regression Model With Attention Mechanism (mentioning)
confidence: 99%