Hendrik Schreiber scite author profile

With the advent of deep learning, global tempo estimation accuracy has reached a new peak, which presents a great opportunity to evaluate our evaluation practices. In this article, we discuss presumed and actual applications, the pros and cons of commonly used metrics, and the suitability of popular datasets. To guide future research, we present results of a survey among domain experts that investigates today's applications, their requirements, and the usefulness of currently employed metrics. To aid future evaluations, we present a public repository containing evaluation code as well as estimates by many different systems and different ground truths for popular datasets.

Exploiting global features for tempo octave correction

Schreiber, Müller (2014)

Tempo estimation is a fundamental problem in music information retrieval. Most approaches attempt to solve two problems: first, finding a dominant pulse, and second, correcting the metrical level of this pulse. The latter has also been dubbed fixing the octave error. We propose an algorithm for tempo estimation that addresses both problems mostly independently. We use a standard pulse detection technique; for octave error correction, we exploit a simple relationship between a single global feature, average spectral novelty, and listener perception of musical tempo. The proposed method is extremely simple. Nevertheless, it outperforms most existing tempo estimation methods and is on par with the best-performing ones. It thus exemplifies that a global feature-based approach can significantly improve tempo estimation.

Accelerating Index-Based Audio Identification

Schreiber, Müller (2014). IEEE Trans. Multimedia

Musical Tempo and Key Estimation using Convolutional Neural Networks with Directional Filters

Schreiber, Müller (2019). Preprint

In this article, we explore how the different semantics of spectrograms' time and frequency axes can be exploited for musical tempo and key estimation using Convolutional Neural Networks (CNNs). By addressing both tasks with the same network architectures, ranging from shallow, domain-specific approaches to deep variants with directional filters, we show that axis-aligned architectures perform about as well as common VGG-style networks developed for computer vision, while being less vulnerable to confounding factors and requiring fewer model parameters.

MediaEval AcousticBrainz Genre AllMusic

Bogdanov, Porter, Urbano, et al. (2018)

The AcousticBrainz Genre Dataset: Multi-Source, Multi-Level, Multi-Label, and Large-Scale

Bogdanov, Porter, Schreiber, et al. (2019)
