2019
DOI: 10.1111/exsy.12429
|View full text |Cite
|
Sign up to set email alerts
|

Music genre recognition using convolutional recurrent neural network architecture

Abstract: The genre is an abstract feature, but still, it is considered to be one of the important characteristics of music. Genre recognition forms an essential component for a large number of commercial music applications. Most of the existing music genre recognition algorithms are based on manual feature extraction techniques. These extracted features are used to develop a classifier model to identify the genre. However, in many cases, it has been observed that a set of features giving excellent accuracy fails to exp… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
1
1
1

Citation Types

0
16
0

Year Published

2020
2020
2023
2023

Publication Types

Select...
5
2

Relationship

0
7

Authors

Journals

citations
Cited by 29 publications
(18 citation statements)
references
References 14 publications
0
16
0
Order By: Relevance
“…They suggested convolutional recurrent neural network (CRNN) models for the music auto-tagging and genre classification. The hybrid nature of neural networks has also been used in [3][5] for music recognition tasks. They used batch normalization on the input feature map, but fail to perform well in the situation where statistics change for various time steps, as mentioned in [12].…”
Section: Results and Analysismentioning
confidence: 99%
See 2 more Smart Citations
“…They suggested convolutional recurrent neural network (CRNN) models for the music auto-tagging and genre classification. The hybrid nature of neural networks has also been used in [3][5] for music recognition tasks. They used batch normalization on the input feature map, but fail to perform well in the situation where statistics change for various time steps, as mentioned in [12].…”
Section: Results and Analysismentioning
confidence: 99%
“…Each of the datasets has some common and distinct features with associated labels. The first dataset is GTZAN which consists of 1000 samples with 10 different genres, each having 100 songs as "blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, and rock" with a duration of 30 sec each as in [3] [5]. All files are mp3 and encoded with the sample rate of 22050 Hz with a size of 16 bit with the mono channel.…”
Section: A Dataset Descriptionmentioning
confidence: 99%
See 1 more Smart Citation
“…For methods based on representation learning, a bidirectional recurrent neural network with serial attention and parallelized attention is proposed to focus on details of the target area [15]. A CNN-RNN-cascaded deep learning model that uses almost no handcrafted features is proposed in [14]. In [12], a CNN-based architecture with multi-level and multiscale features [11] is extended by transfer learning.…”
Section: ) Gtzan With Artist Filtermentioning
confidence: 99%
“…The scatter transform and the transfer learning have been widely used for image and audio classification [3]- [6] but rarely used for MGR in combination. Various methods for building music genre classifiers have been studied, including support vector machines (SVM) [7]- [9], Gaussian processes [10], convolutional neural network (CNN) [11] [12] [13], recurrent neural network (RNN) [14], and long short-term memory (LSTM) [15]. It is proven that CNNs learning representation features yield better performance in comparison to LSTMs and traditional methods which extract handcrafted features [16].…”
Section: Introductionmentioning
confidence: 99%