Automatic Chord Estimation Based on a Frame-wise Convolutional Recurrent Neural Network with Non-Aligned Annotations

Wu, Yiming; Carsault, Tristan; Yoshii, Kazuyoshi

doi:10.23919/eusipco.2019.8902741

Cited by 8 publications

(3 citation statements)

References 9 publications


“…Because typical DNN-based methods estimate the posterior probabilities of chord labels at the frame level, some smoothing technique is often used for estimating temporally coherent chord labels. An HMM [16] or a conditional random field (CRF) [15], for example, can be used for estimating the optimal path of chord labels from the estimated posterior probabilities. Recurrent neural networks (RNNs) have recently been used as a language model that represents the long-term dependency of chord labels [17], [18].…”

Section: B Discriminative Approach (mentioning)

confidence: 99%

Semi-supervised Neural Chord Estimation Based on a Variational Autoencoder with Latent Chord Labels and Features

Carsault, Nakamura, et al. 2020

Preprint

Self Cite


This paper describes a statistically principled semi-supervised method of automatic chord estimation (ACE) that can make effective use of any music signals regardless of the availability of chord annotations. The typical approach to ACE is to train a deep classification model (neural chord estimator) in a supervised manner by using only a limited amount of annotated music signals. In this discriminative approach, prior knowledge about chord label sequences (characteristics of model output) has scarcely been taken into account. In contrast, we propose a unified generative and discriminative approach in the framework of amortized variational inference. More specifically, we formulate a deep generative model that represents the complex generative process of chroma vectors (observed variables) from the discrete labels and continuous textures of chords (latent variables). Chord labels and textures are assumed to follow a Markov model favoring self-transitions and a standard Gaussian distribution, respectively. Given chroma vectors as observed data, the posterior distributions of latent chord labels and textures are computed approximately by using deep classification and recognition models, respectively. These three models are combined to form a variational autoencoder and trained jointly in a semi-supervised manner. The experimental results show that the performance of the classification model can be improved by additionally using non-annotated music signals and/or by regularizing the classification model with the Markov model of chord labels and the generative model of chroma vectors even in the fully supervised condition.
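The citation statement and abstract above both refer to smoothing frame-wise chord posteriors with a Markov model that favors self-transitions. The following is a minimal sketch of that general idea using Viterbi decoding, not the implementation from either paper. The function name `viterbi_smooth`, the `self_prob` parameter, the uniform off-diagonal transition probabilities, and the use of posteriors directly as emission scores are all assumptions made for illustration.

```python
import math


def viterbi_smooth(posteriors, self_prob=0.9):
    """Return the most likely label path for frame-wise posteriors.

    posteriors: list of frames, each a list of K >= 2 label probabilities.
    self_prob:  assumed probability of staying on the same label
                (hypothetical value; the remaining mass is spread
                uniformly over the other labels).
    """
    n_labels = len(posteriors[0])
    stay = math.log(self_prob)
    move = math.log((1.0 - self_prob) / (n_labels - 1))
    eps = 1e-12  # guards against log(0)

    # score[k] is the best log-score of any path ending in label k.
    # Assumption: posteriors stand in for emission likelihoods.
    score = [math.log(p + eps) for p in posteriors[0]]
    backptr = []
    for frame in posteriors[1:]:
        new_score, pointers = [], []
        for k in range(n_labels):
            best_prev = max(
                range(n_labels),
                key=lambda j: score[j] + (stay if j == k else move),
            )
            trans = stay if best_prev == k else move
            new_score.append(score[best_prev] + trans + math.log(frame[k] + eps))
            pointers.append(best_prev)
        score = new_score
        backptr.append(pointers)

    # Trace the best path back from the final frame.
    path = [max(range(n_labels), key=lambda k: score[k])]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# A one-frame flip to label 1 is smoothed away.
noisy = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.85, 0.15]]
print(viterbi_smooth(noisy))
```

With a strong self-transition prior, an isolated one-frame change is absorbed into the surrounding label, while a sustained change over several frames is preserved.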


“…Recent work has successfully integrated both stages into a single system that is capable of learning musically meaningful features from a spectrogram-like representation and modeling temporal relations between frames. Generally, they combine CNN and RNN (McFee and Bello 2017; Jiang et al 2019; Wu, Carsault, and Yoshii 2019), although the work presented by Korzeniowski and Widmer (2016) implemented conditional random fields for sequence decoding.…”

Section: Literature Review (mentioning)

confidence: 99%

“…Modern approaches to ACT tend to be based on deep learning, which can effectively combine the extraction of musically relevant features and a sequence analysis that provides temporal coherence to the chord predictions (McFee and Bello 2017; Jiang et al 2019; Wu et al 2019). In comparison with systems that are not integrated, performing each step as an independent process, this kind of architecture allows us to input data with little preprocessing and to directly output class probability for each chord (or each chord component).…”

Section: Chord Transcription (mentioning)

confidence: 99%

Transcribing Lead Sheet-Like Chord Progressions of Jazz Recordings

Durán, Cuadra 2020

Computer Music Journal


The vast majority of research on automatic chord transcription has been developed and tested on databases mainly focused on genres like pop and rock. Jazz is strongly based on improvisation, however, and the way harmony is interpreted is different from many other genres, causing state-of-the-art chord transcription systems to achieve poor performance. This article presents a computational system that transcribes chords from jazz recordings, addressing the specific challenges they present and considering their inherent musical aspects. Taking the raw audio and minor manually obtained inputs from the user, the system can jointly transcribe chords and detect the beat of a recording, allowing a lead sheet–like rendering as output. The analysis is implemented in two parts. First, all segments with a repeating chord progression (the chorus) are aligned based on their musical content using dynamic time warping. Second, the aligned segments are mixed and a convolutional recurrent neural network is used to simultaneously detect beats and transcribe chords. This automatic chord transcription system is trained and tested on jazz recordings only, and achieves better performance than other systems trained on larger databases that are not jazz specific. Additionally, it combines the beat-detection and chord transcription tasks, allowing the creation of a lead sheet–like representation that is easy to interpret by both researchers and musicians.
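The abstract above mentions aligning repeated chorus segments with dynamic time warping. As a rough illustration of that technique only, here is a textbook DTW over two sequences of feature vectors. It is not the authors' system: the function name `dtw`, the Euclidean frame distance, and the unconstrained step pattern are assumptions chosen for brevity.

```python
import math


def dtw(a, b):
    """Align two sequences of feature vectors.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs matching frame i of `a` to frame j of `b`.
    Assumption: Euclidean distance between frames (requires Python 3.8+).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance only `a`
                cost[i][j - 1],      # advance only `b`
            )

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy example: the second sequence holds its middle frame one step longer.
first = [[0.0], [1.0], [2.0]]
second = [[0.0], [1.0], [1.0], [2.0]]
total, path = dtw(first, second)
print(total, path)
```

In the toy example the middle frame of the first sequence is matched to both repeated frames of the second, which is the kind of tempo variation between choruses that such an alignment is meant to absorb.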


RS-pCloud: A Peer-to-Peer Based Edge-Cloud System for Fast Remote Sensing Image Processing

Sun, Xiong, Wang, et al. 2020

2020 IEEE International Conference on Edge Computing (EDGE)

