2019 27th European Signal Processing Conference (EUSIPCO) 2019
DOI: 10.23919/eusipco.2019.8902741
|View full text |Cite
|
Sign up to set email alerts
|

Automatic Chord Estimation Based on a Frame-wise Convolutional Recurrent Neural Network with Non-Aligned Annotations

Abstract: This paper describes a weakly-supervised approach to Automatic Chord Estimation (ACE) task that aims to estimate a sequence of chords from a given music audio signal at the frame level, under a realistic condition that only non-aligned chord annotations are available. In conventional studies assuming the availability of time-aligned chord annotations, Deep Neural Networks (DNNs) that learn frame-wise mappings from acoustic features to chords have attained excellent performance. The major drawback of such frame… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1

Citation Types

0
3
0

Year Published

2020
2020
2023
2023

Publication Types

Select...
4
3

Relationship

1
6

Authors

Journals

citations
Cited by 8 publications
(3 citation statements)
references
References 9 publications
0
3
0
Order By: Relevance
“…Because typical DNN-based methods estimate the posterior probabilities of chord labels at the frame level, some smoothing technique is often used for estimating temporallycoherent chord labels. An HMM [16] or a conditional random field (CRF) [15], for example, can be used for estimating the optimal path of chord labels from the estimated posterior probabilities. Recurrent neural networks (RNNs) have recently been used as a language model that represents the long-term dependency of chord labels [17], [18].…”
Section: B Discriminative Approachmentioning
confidence: 99%
“…Because typical DNN-based methods estimate the posterior probabilities of chord labels at the frame level, some smoothing technique is often used for estimating temporallycoherent chord labels. An HMM [16] or a conditional random field (CRF) [15], for example, can be used for estimating the optimal path of chord labels from the estimated posterior probabilities. Recurrent neural networks (RNNs) have recently been used as a language model that represents the long-term dependency of chord labels [17], [18].…”
Section: B Discriminative Approachmentioning
confidence: 99%
“…Recent work has successfully integrated both stages into a single system that is capable of learning musically meaningful features from a spectrogram-like representation and modeling temporal relations between frames. Generally, they combine CNN and RNN (McFee and Bello 2017;Jiang et al 2019;Wu, Carsault, and Yoshii 2019), although the work presented by Korzeniowski and Widmer (2016) implemented conditional random fields for sequence decoding.…”
Section: Literature Reviewmentioning
confidence: 99%
“…Modern approaches to ACT tend to be based on deep learning, which can effectively combine the extraction of musically relevant features and a sequence analysis that provides temporal coherence to the chord predictions (McFee and Bello 2017;Jiang et al 2019;Wu et al 2019). In comparison with systems that are not integrated, performing each step as an independent process, this kind of architecture allows us to input data with little preprocessing and to directly output class probability for each chord (or each chord component).…”
Section: Chord Transcriptionmentioning
confidence: 99%