A Simple Method to Determine if a Music Information Retrieval System is a “Horse”

Sturm, Bob L.

doi:10.1109/tmm.2014.2330697

Cited by 99 publications

(91 citation statements)

References 31 publications


“…This differs from applications such as audio archive analysis, for which a system must be robust to signal modifications induced by variation of microphones and preprocessing across the dataset [36]. For embodied machine listening, aspects such as the microphone frequency response will be constant factors rather than random factors.…”

Section: A Requirements Gathering (mentioning)

confidence: 99%

Detection and Classification of Acoustic Scenes and Events

Stowell

Giannoulis

Benetos

et al. 2015

IEEE Trans. Multimedia



For intelligent systems to make best use of the audio modality, it is important that they can recognize not just speech and music, which have been researched as specific tasks, but also general sounds in everyday environments. To stimulate research in this field we conducted a public research challenge: the IEEE Audio and Acoustic Signal Processing Technical Committee challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). In this paper, we report on the state of the art in automatically classifying audio scenes, and automatically detecting and classifying audio events. We survey prior work as well as the state of the art represented by the submissions to the challenge from various research groups. We also provide detail on the organization of the challenge, so that our experience as challenge hosts may be useful to those organizing challenges in similar domains. We created new audio datasets and baseline systems for the challenge; these, as well as some submitted systems, are publicly available under open licenses, to serve as benchmarks for further research in general-purpose machine listening.



“…This motivates the second contribution of our work: even with very good performance in these "proxy" evaluations, caution must be taken when discussing what these systems have actually learned to do. Even though a model may appear to be doing the right things, it may be working with concepts that are not very general (Sturm, 2014; Sturm and Ben-Tal, 2017). For instance, the folk-rnn models seem to be able to count time and repeat and vary material in ways that are stylistically plausible, but these abilities disappears when the models are pushed even a little outside of its training material.…”

Section: Informing the Research Pursuit Of Machine Learning (mentioning)

confidence: 99%

Machine learning research that matters for music creation: A case study

Sturm

Ben-Tal

Monaghan

et al. 2018

Journal of New Music Research

Self Cite


Research applying machine learning to music modeling and generation typically proposes model architectures, training methods and datasets, and gauges system performance using quantitative measures like sequence likelihoods and/or qualitative listening tests. Rarely does such work explicitly question and analyse its usefulness for and impact on real-world practitioners, and then build on those outcomes to inform the development and application of machine learning. This article attempts to do these things for machine learning applied to music creation. Together with practitioners, we develop and use several applications of machine learning for music creation, and present a public concert of the results. We reflect on the entire experience to arrive at several ways of advancing these and similar applications of machine learning to music creation.


“…temporal or spectral features) to high-level semantic labels using manually pre-labeled training samples [8]-[16]. The task, however, remains challenging due to the following three issues: the scarcity of well-labeled training data [17], [18], the complexity involved in formalizing and evaluating the task while taking care of possible confounds [18], [19], and the difficulty of extracting good audio features that capture the characteristics of each tag [20]-[24]. Good feature design is hard to come by, for example for tags that are social and cultural constructs (e.g.…”

Section: Introduction (mentioning)

confidence: 99%

Music Annotation and Retrieval using Unlabeled Exemplars: Correlation and Sparse Codes

Jao

Yang

2015

IEEE Signal Process. Lett.


Tagging music signals with semantic labels such as genres, moods and instruments is important for content-based music retrieval and recommendation. While considerable effort has been made, automatic music annotation is still considered challenging due to the difficulty of extracting good audio features that capture the characteristics of different tags. To address this issue, we present in this letter two exemplar-based approaches that represent the content of a music clip by referring to a large set of unlabeled audio exemplars. The first approach represents a music clip by the set of audio exemplars that is highly correlated with the short-time feature vectors of the clip, whereas the second approach represents a music clip as sparse linear combinations of its short-time feature vectors over the audio exemplars. Music annotation is then performed by learning the relevance of the audio exemplars to different tags using labeled data. These two approaches effectively capitalize on the availability of unlabeled data to explore the commonality of music signals and find tag-specific acoustic patterns, without domain knowledge or feature design. Evaluation on the CAL10k music genre tagging dataset for tag-based music retrieval shows that, with thousands of unlabeled audio exemplars randomly drawn from the Million Song Dataset, the proposed approaches lead to remarkably higher precision rates than existing approaches.

