Non-Negative Multilinear Principal Component Analysis of Auditory Temporal Modulations for Music Genre Classification

Panagakis, Yannis; Kotropoulos, Constantine; Arce, Gonzalo R.

doi:10.1109/tasl.2009.2036813

Cited by 95 publications

(59 citation statements)

References 32 publications


“…STFT involves extracting several frames from the signals and analyzing them using a time-sliding window such that the relation between the variation of the frequency and the time can be identified. We used STFT to transform the original signals into a spatial-spectral-temporal domain as high-dimensional third-order tensors [17]. Given a signal that varies over time, STFT was used to determine the sinusoidal frequency and phase content of the local sections.…”

Section: Short-time Fourier Transform (mentioning)

confidence: 99%

Cardiology knowledge free ECG feature extraction using generalized tensor rank one discriminant analysis

Huang

Zhang

2014

EURASIP J. Adv. Signal Process.


Applications based on electrocardiogram (ECG) signal feature extraction and classification are of major importance to the autodiagnosis of heart diseases. Most studies on ECG classification methods have targeted only 1- or 2-lead ECG signals. This limitation results from the unavailability of real clinical 12-lead ECG data, which would help train the classification models. In this study, we propose a new tensor-based scheme, which is motivated by the lack of effective feature extraction methods for direct tensor data input. In this scheme, an ECG signal is represented by third-order tensors in the spatial-spectral-temporal domain after using short-time Fourier transform on the raw ECG data. To overcome the limitations of tensor rank one discriminant analysis (TR1DA) inherited from linear discriminant analysis, we introduced a generalized tensor rank one discriminant analysis (GTR1DA). This approach involves considering the distribution of the data points near the classification boundary to calculate better projection tensors. The experimental results showed that the proposed method achieves greater classification accuracy than other vector- and tensor-based methods. Finally, GTR1DA features a better convergence property than the original TR1DA.
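The tensor construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Hann window, the window length, the hop size, the synthetic sinusoidal "leads", and the function names `stft_magnitudes` and `leads_to_tensor` are all assumptions chosen for illustration. Each lead (spatial mode) is turned into a magnitude spectrogram, and the spectrograms are stacked so that the result is indexed as lead × frequency bin × time frame.

```python
import cmath
import math


def stft_magnitudes(signal, win_len, hop):
    """Return one magnitude spectrum (bins 0..win_len//2) per sliding frame."""
    # Hann window: an assumed choice, the cited text does not name a window.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_len)
              for n in range(win_len)]
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(win_len)]
        spectrum = []
        for k in range(win_len // 2 + 1):
            # Direct DFT of the windowed segment at bin k.
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                      for n in range(win_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def leads_to_tensor(leads, win_len=32, hop=16):
    """Stack per-lead spectrograms as tensor[lead][freq_bin][frame]."""
    tensor = []
    for lead in leads:
        frames = stft_magnitudes(lead, win_len, hop)
        # Transpose frames x bins into bins x frames.
        tensor.append([[frame[k] for frame in frames]
                       for k in range(win_len // 2 + 1)])
    return tensor


# Hypothetical data: three synthetic "leads", each a pure tone sampled at 128 Hz.
fs = 128
leads = [[math.sin(2 * math.pi * f * t / fs) for t in range(256)]
         for f in (8, 16, 24)]
tensor = leads_to_tensor(leads)
shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
# Strongest frequency bin of each lead in the first time frame.
peak_bins = [max(range(len(lead)), key=lambda k: lead[k][0]) for lead in tensor]
```

With a 32-sample window at 128 Hz each bin spans 4 Hz, so the strongest bin of each synthetic lead should sit at its tone frequency divided by 4. Real ECG data would need a window and hop matched to the sampling rate and heartbeat duration.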


“…This point of view, however, is not evident in much of the MGR literature, e.g., the three reviews devoted specifically to MGR (Aucouturier and Pachet 2003; Scaringella et al 2006; Fu et al 2011), the work of Tzanetakis and Cook (2002), Barbedo and Lopes (2008), Bergstra et al (2006a), Holzapfel and Stylianou (2008), Marques et al (2011b), Panagakis et al (2010a), Benetos and Kotropoulos (2010), and so on. It is thus not idiosyncratic to claim that one purpose of MGR could be to identify, discriminate between, and learn the criteria of music genres in order to produce genre labels that are indistinguishable from those humans would produce.…”

Section: Arguments (mentioning)

confidence: 99%

Classification accuracy is not enough

Sturm

2013

J Intell Inf Syst


We argue that an evaluation of system behavior at the level of the music is required to usefully address the fundamental problems of music genre recognition (MGR), and indeed other tasks of music information retrieval, such as autotagging. A recent review of works in MGR since 1995 shows that most (82 %) measure the capacity of a system to recognize genre by its classification accuracy. After reviewing evaluation in MGR, we show that neither classification accuracy, nor recall and precision, nor confusion tables, necessarily reflect the capacity of a system to recognize genre in musical signals. Hence, such figures of merit cannot be used to reliably rank, promote or discount the genre recognition performance of MGR systems if genre recognition (rather than identification by irrelevant confounding factors) is the objective. This motivates the development of a richer experimental toolbox for evaluating any system designed to intelligently extract information from music signals.


“…Although much more elaborated music representations have been proposed in the literature, the just mentioned features perform quite well in practice [14, 22-24]. Most importantly, song-level representations are suitable for large-scale music classification problems since the space complexity for audio processing and analysis is reduced and the database overflow is prevented [3].…”

Section: Audio Feature Extraction (mentioning)

confidence: 99%

“…First, to be able to compare the performance of the LRSMs with that of the state-of-the-art music classification methods, standard evaluation protocols were applied to the seven datasets. In particular, following [16,17,20,22,56,57], stratified 10-fold cross-validation was applied to the GTZAN dataset. According to [15,16,54], the same protocol was also applied to the Homburg, Unique, 1517-Artists, and MTV datasets.…”

Section: Datasets and Evaluation Procedures (mentioning)

confidence: 99%
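The stratified 10-fold protocol named in the statement above can be sketched as follows. This is only an illustration of the general technique, not the evaluation code of any cited work: the function name `stratified_k_fold`, the fixed seed, and the toy label list (10 classes with 100 items each, mirroring the commonly described GTZAN layout) are assumptions.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's items round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i
                       for idx in fold)
        yield train, test


# Hypothetical label list: 10 genres x 100 recordings.
labels = [genre for genre in range(10) for _ in range(100)]
splits = list(stratified_k_fold(labels))
```

Each of the 10 test folds then holds 100 recordings, 10 per genre, and every recording is tested exactly once across the folds.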

“…Such features include timbral texture features, rhythmic features, pitch content, or their combinations, yielding a bag-of-features (BOF) representation [1, 2, 6-18]. Furthermore, spectral, cepstral, and auditory modulation-based features have been recently employed either in BOF approaches or as autonomous music representations in order to capture both the timbral and the temporal structure of music [19-22]. At the machine learning stage, music genre and mood classification are treated as single-label multi-class classification problems.…”

Section: Introduction (mentioning)

confidence: 99%


Music classification by low-rank semantic mappings

Panagakis

Kotropoulos

2013

J AUDIO SPEECH MUSIC PROC.


A challenging open question in music classification is which music representation (i.e., audio features) and which machine learning algorithm is appropriate for a specific music classification task. To address this challenge, given a number of audio feature vectors for each training music recording that capture the different aspects of music (i.e., timbre, harmony, etc.), the goal is to find a set of linear mappings from several feature spaces to the semantic space spanned by the class indicator vectors. These mappings should reveal the common latent variables, which characterize a given set of classes and simultaneously define a multi-class linear classifier that classifies the extracted latent common features. Such a set of mappings is obtained, building on the notion of the maximum margin matrix factorization, by minimizing a weighted sum of nuclear norms. Since the nuclear norm imposes rank constraints to the learnt mappings, the proposed method is referred to as low-rank semantic mappings (LRSMs). The performance of the LRSMs in music genre, mood, and multi-label classification is assessed by conducting extensive experiments on seven manually annotated benchmark datasets. The reported experimental results demonstrate the superiority of the LRSMs over the classifiers that are compared to. Furthermore, the best reported classification results are comparable with or slightly superior to those obtained by the state-of-the-art task-specific music classification methods.

