ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9053839

Learning a Representation for Cover Song Identification Using Convolutional Neural Network

Abstract: Cover song identification represents a challenging task in the field of Music Information Retrieval (MIR) due to complex musical variations between query tracks and cover versions. Previous works typically utilize hand-crafted features and alignment algorithms for the task. More recently, further breakthroughs are achieved employing neural network approaches. In this paper, we propose a novel Convolutional Neural Network (CNN) architecture based on the characteristics of the cover song task. We first train the…

Cited by 17 publications (32 citation statements)
References 11 publications (23 reference statements)
“…Our proposed network employs a feature extraction module to convert an input CQT spectrogram of a track to a fixed length embedding, similar to [14]. PiCKINet, outlined graphically in Fig.…”
Section: PiCKINet (mentioning)
confidence: 99%
“…Deep Learning (DL) methods have recently become the focus of CSI research, with improved performance recorded. While some DL-CSI methods have employed CCMs [8,9], embedding based approaches are more popular [10][11][12][13][14][15][16]. Pro loss functions [12,13,15], or classification based learning with the embedding extracted from the penultimate network layer at test time [10,11,14,16].…”
Section: Introduction (mentioning)
confidence: 99%
“…Roughly speaking, existing deep learning methods to CSI can be classified into two categories. The first category of methods, e.g., [8][9][10], treats CSI as a multi-class classification problem, where each version group is considered as an unique class. Convolutional neural networks (CNNs) are trained to classify music tracks in the training set and during retrieval, the network's penultimate layer is used to generate feature for audio matching.…”
Section: Introduction (mentioning)
confidence: 99%
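
The classification-based scheme described in the statement above (train a network to classify version groups, then reuse the penultimate layer as a fixed-length embedding for retrieval) can be illustrated with a minimal sketch. This is not the architecture of the cited paper: a time-pooling step plus one dense layer stands in for the CNN, the weights are random rather than trained, and all names and sizes (`embed`, `classify`, `rank`, `N_BINS`, `EMB_DIM`, `N_CLASSES`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: frequency bins, embedding width, number of version groups.
N_BINS, EMB_DIM, N_CLASSES = 84, 16, 5

# Random, untrained weights standing in for a trained network (assumption).
W_feat = rng.standard_normal((N_BINS, EMB_DIM))
W_cls = rng.standard_normal((EMB_DIM, N_CLASSES))

def embed(spec):
    """Map a (N_BINS, n_frames) spectrogram to a fixed-length embedding."""
    pooled = spec.mean(axis=1)        # pool over time, so any length works
    return np.tanh(pooled @ W_feat)   # "penultimate layer" output, shape (EMB_DIM,)

def classify(spec):
    """Training-time head: one logit per version group."""
    return embed(spec) @ W_cls

def rank(query, references):
    """Rank references by cosine similarity of embeddings to the query."""
    q = embed(query)
    refs = np.stack([embed(r) for r in references])
    sims = refs @ q / (np.linalg.norm(refs, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims), sims

# Toy data: the third reference is the query played at half tempo
# (every frame repeated twice); the others are unrelated noise.
query = rng.standard_normal((N_BINS, 100))
references = [
    rng.standard_normal((N_BINS, 120)),
    rng.standard_normal((N_BINS, 80)),
    np.repeat(query, 2, axis=1),
]
order, sims = rank(query, references)
```

At retrieval time only `embed` is used; the classification head `classify` matters during training alone. In this toy setup the tempo-stretched copy ranks first because mean pooling over time discards duration, a property that a real system would have to learn or design for rather than get for free.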