TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON)
DOI: 10.1109/tencon.2019.8929406
Music Genre Recognition Using Residual Neural Networks

Cited by 16 publications (11 citation statements). References 12 publications.
“…Deep learning solutions have been employed by researchers for music genre recognition and music recommender systems. Using the GTZAN dataset, the author [17] created a residual neural network to train on audio snippets of 3 seconds duration. Certain qualities that overlapped were taken into account for various genres, and the author was able to obtain a 94% accuracy rate.…”
Section: Related Study (mentioning)
confidence: 99%
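The setup this excerpt describes, a residual network trained on 3-second GTZAN snippets, can be sketched as below. This is a minimal illustration assuming PyTorch and torchaudio; the layer sizes, hyperparameters, and file path are chosen for brevity, not taken from the cited paper.

```python
# Minimal sketch (assumed PyTorch + torchaudio): split a GTZAN track into
# 3-second snippets and classify them with a small residual CNN over
# log-Mel spectrograms. Layer sizes and hyperparameters are illustrative,
# not the cited paper's exact configuration.
import torch
import torch.nn as nn
import torchaudio

SAMPLE_RATE = 22050                       # GTZAN audio is 22.05 kHz
SNIPPET_SAMPLES = SAMPLE_RATE * 3         # 3-second snippets, as in the paper

mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=SAMPLE_RATE, n_fft=1024, hop_length=512, n_mels=128)

def snippets(waveform: torch.Tensor) -> torch.Tensor:
    """Chop a mono (1, samples) waveform into non-overlapping 3 s snippets."""
    n = waveform.shape[-1] // SNIPPET_SAMPLES
    return waveform[..., :n * SNIPPET_SAMPLES].reshape(n, 1, SNIPPET_SAMPLES)

class ResidualBlock(nn.Module):
    """conv-BN-ReLU-conv-BN plus an identity skip connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = torch.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return torch.relu(x + y)          # the residual shortcut

class GenreNet(nn.Module):
    """Tiny residual classifier over log-Mel spectrograms (10 GTZAN genres)."""
    def __init__(self, n_genres: int = 10):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU())
        self.blocks = nn.Sequential(ResidualBlock(32), ResidualBlock(32))
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_genres))

    def forward(self, wave_snippets):
        spec = torch.log(mel(wave_snippets) + 1e-6)   # (N, 1, 128, T) log-Mel
        return self.head(self.blocks(self.stem(spec)))

# Hypothetical usage: load one track, average to mono, score its snippets.
# waveform, sr = torchaudio.load("blues.00000.wav")
# logits = GenreNet()(snippets(waveform.mean(0, keepdim=True)))
```

Splitting each 30-second GTZAN track into 3-second snippets multiplies the training examples tenfold, which is the usual motivation for the snippet-level setup the citation describes.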
“…As shown in the table, the SOTA methods on the GTZAN dataset (%):

Bisharad et al. [7]          85.36
Bisharad et al. [8]          82.00
Raissi et al. [42]           91.00
Sugianto et al. [45]         71.87
Ashraf et al. [3]            87.79
Ng et al. [39] (FusionNet)   96.50
Liu et al. [30]              93.90
Nanni et al. [37]            90.60
Ours (MS-SincResNet)         91.49…”
Section: Ablation Study (mentioning)
confidence: 99%
“…In recent years, with the remarkable success of deep learning techniques in computer vision applications, deep neural networks (DNNs) have also shown great success in speech/music classification or recognition tasks, such as speaker recognition [36,43], music genre classification [6,39], speech emotion recognition [49], etc. In these tasks, deep learning provides a new way to extract discriminative embeddings from those famous hand-crafted acoustic features, called i-vector content, for classification/recognition [8]. Specifically, ResNet-18 is used to extract time-frequency features from the Melspectrogram of each 3-second music clip.…”
Section: Introduction (mentioning)
confidence: 99%
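The pipeline quoted above, ResNet-18 extracting time-frequency features from the Mel-spectrogram of each 3-second clip, can be sketched as below. Adapting torchvision's stock ResNet-18 to a 1-channel input and a 10-class head is our assumption of a typical configuration, not the cited implementation.

```python
# Minimal sketch (assumed PyTorch + torchaudio + torchvision): ResNet-18
# consumes the log-Mel spectrogram of a 3-second clip. The 1-channel stem
# and 10-class head are typical adaptations assumed here.
import torch
import torch.nn as nn
import torchaudio
from torchvision.models import resnet18

mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=22050, n_fft=1024, hop_length=512, n_mels=128)

model = resnet18(weights=None)
# Spectrograms have one channel (not RGB); GTZAN has ten genres.
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.fc = nn.Linear(model.fc.in_features, 10)

clip = torch.randn(1, 22050 * 3)                  # stand-in 3-second mono clip
spec = torch.log(mel(clip) + 1e-6).unsqueeze(1)   # (1, 1, 128, T) "image"
logits = model(spec)                              # (1, 10) genre scores
```

Treating the spectrogram as a single-channel image is what lets a stock vision backbone such as ResNet-18 serve directly as the time-frequency feature extractor.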
“…al. utilized the complex Residual Neural Network (RNN) models on 3-second intervals from the GTZAN dataset to achieve a genre classification accuracy of 94% [4].…”
Section: Introduction (mentioning)
confidence: 99%