TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON)
DOI: 10.1109/tencon.2019.8929406
Music Genre Recognition Using Residual Neural Networks

Cited by 16 publications (11 citation statements). References 12 publications.
“…Deep learning solutions have been employed by researchers for music genre recognition and music recommender systems. Using the GTZAN dataset, the author [17] created a residual neural network to train on audio snippets of 3 seconds duration. Certain qualities that overlapped were taken into account for various genres, and the author was able to obtain a 94% accuracy rate.…”
Section: Related Study (mentioning)
confidence: 99%
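The setup this excerpt describes, a residual network trained on 3-second GTZAN snippets, can be sketched as below. This is a minimal illustration assuming PyTorch and torchaudio; the layer sizes, hyperparameters, and file path are chosen for brevity, not taken from the cited paper.

```python
# Minimal sketch (assumed PyTorch + torchaudio): split a GTZAN track into
# 3-second snippets and classify them with a small residual CNN over
# log-Mel spectrograms. Layer sizes and hyperparameters are illustrative,
# not the cited paper's exact configuration.
import torch
import torch.nn as nn
import torchaudio

SAMPLE_RATE = 22050                       # GTZAN audio is 22.05 kHz
SNIPPET_SAMPLES = SAMPLE_RATE * 3         # 3-second snippets, as in the paper

mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=SAMPLE_RATE, n_fft=1024, hop_length=512, n_mels=128)

def snippets(waveform: torch.Tensor) -> torch.Tensor:
    """Chop a mono (1, samples) waveform into non-overlapping 3 s snippets."""
    n = waveform.shape[-1] // SNIPPET_SAMPLES
    return waveform[..., :n * SNIPPET_SAMPLES].reshape(n, 1, SNIPPET_SAMPLES)

class ResidualBlock(nn.Module):
    """conv-BN-ReLU-conv-BN plus an identity skip connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = torch.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return torch.relu(x + y)          # the residual shortcut

class GenreNet(nn.Module):
    """Tiny residual classifier over log-Mel spectrograms (10 GTZAN genres)."""
    def __init__(self, n_genres: int = 10):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU())
        self.blocks = nn.Sequential(ResidualBlock(32), ResidualBlock(32))
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_genres))

    def forward(self, wave_snippets):
        spec = torch.log(mel(wave_snippets) + 1e-6)   # (N, 1, 128, T) log-Mel
        return self.head(self.blocks(self.stem(spec)))

# Hypothetical usage: load one track, average to mono, score its snippets.
# waveform, sr = torchaudio.load("blues.00000.wav")
# logits = GenreNet()(snippets(waveform.mean(0, keepdim=True)))
```

Splitting each 30-second GTZAN track into 3-second snippets multiplies the training examples tenfold, which is the usual motivation for the snippet-level setup the citation describes.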
“…As shown in the table, the SOTA methods on the GTZAN dataset (%):

Bisharad et al. [7]          85.36
Bisharad et al. [8]          82.00
Raissi et al. [42]           91.00
Sugianto et al. [45]         71.87
Ashraf et al. [3]            87.79
Ng et al. [39] (FusionNet)   96.50
Liu et al. [30]              93.90
Nanni et al. [37]            90.60
Ours (MS-SincResNet)         91.49…”
Section: Ablation Study (mentioning)
confidence: 99%
“…In recent years, with the remarkable success of deep learning techniques in computer vision applications, deep neural networks (DNNs) have also shown great success in speech/music classification or recognition tasks, such as speaker recognition [36,43], music genre classification [6,39], speech emotion recognition [49], etc. In these tasks, deep learning provides a new way to extract discriminative embeddings from those famous hand-crafted acoustic features, called i-vector content, for classification/recognition [8]. Specifically, ResNet-18 is used to extract time-frequency features from the Melspectrogram of each 3-second music clip.…”
Section: Introduction (mentioning)
confidence: 99%
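The pipeline quoted above, ResNet-18 extracting time-frequency features from the Mel-spectrogram of each 3-second clip, can be sketched as below. Adapting torchvision's stock ResNet-18 to a 1-channel input and a 10-class head is our assumption of a typical configuration, not the cited implementation.

```python
# Minimal sketch (assumed PyTorch + torchaudio + torchvision): ResNet-18
# consumes the log-Mel spectrogram of a 3-second clip. The 1-channel stem
# and 10-class head are typical adaptations assumed here.
import torch
import torch.nn as nn
import torchaudio
from torchvision.models import resnet18

mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=22050, n_fft=1024, hop_length=512, n_mels=128)

model = resnet18(weights=None)
# Spectrograms have one channel (not RGB); GTZAN has ten genres.
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.fc = nn.Linear(model.fc.in_features, 10)

clip = torch.randn(1, 22050 * 3)                  # stand-in 3-second mono clip
spec = torch.log(mel(clip) + 1e-6).unsqueeze(1)   # (1, 1, 128, T) "image"
logits = model(spec)                              # (1, 10) genre scores
```

Treating the spectrogram as a single-channel image is what lets a stock vision backbone such as ResNet-18 serve directly as the time-frequency feature extractor.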
“…al. utilized the complex Residual Neural Network (RNN) models on 3-second intervals from the GTZAN dataset to achieve a genre classification accuracy of 94% [4].…”
Section: Introduction (mentioning)
confidence: 99%