2017
DOI: 10.1109/taslp.2017.2690561

Supervised Representation Learning for Audio Scene Classification

Abstract: This paper investigates the use of supervised feature learning approaches for extracting relevant and discriminative features from acoustic scene recordings. Owing to the recent release of open datasets for acoustic scene classification (ASC) problems, representation learning techniques can now be envisioned for solving the problem of feature extraction. This paper makes a step towards this goal by first studying models based on convolutional neural networks (ConvNet). Because the scale of the dataset…

Cited by 33 publications (24 citation statements)
References 42 publications
“…The deep learning performance can vary significantly with different feature representation and architecture and a large amount of data is required to train the feature learners [25]. In [9], [26] and [27], it has been demonstrated that the acoustic scenes can be learned from a TFR using matrix factorization techniques. Matrix factorization methods using Principal Component Analysis (PCA) and Nonnegative Matrix Factorization (NMF) have been explored with different variants and tuning strategies to further improve the classification accuracy.…”
Section: Previous Work (mentioning)
confidence: 99%
“…Therefore, the first step of an ASC system is either to compute a time-frequency representation or hand-crafted features in order to work with more compact and interpretable data. Time frequency representations are mostly used as inputs of matrix factorization or CNN systems [12,13,16]. They are usually based on perceptually motivated time-frequency representations with log-scaled frequency bands such as constant-Q transforms (CQT) or Mel spectrograms.…”
Section: Representations for ASC (mentioning)
confidence: 99%
“…The use of other divergences have not shown to provide any notable increase in performance for the task [12] while augmenting the computation time. Supervised NMF models have been applied to ASC with the goal of taking into account the knowledge about the class labels in order to learn better decompositions [12,13,20]. We choose to use the Task-driven NMF (TNMF) formulation [12], a supervised NMF approach adapted from the Task-driven dictionary learning framework [21].…”
Section: Suppose We Have a Nonnegative Data Matrix (mentioning)
confidence: 99%