2012
DOI: 10.1109/tasl.2012.2188516

An End-to-End Machine Learning System for Harmonic Analysis of Music

Abstract: We present a new system for simultaneous estimation of keys, chords, and bass notes from music audio. It makes use of a novel chromagram representation of audio that takes perception of loudness into account. Furthermore, it is fully based on machine learning (instead of expert knowledge), such that it is potentially applicable to a wider range of genres as long as training data is available. As compared to other models, the proposed system is fast and memory efficient, while achieving state-of-the-art perform…

Cited by 47 publications (37 citation statements)
References 19 publications (37 reference statements)
“…Many features used, such as non-negative least squares(NNLS) [7], chroma DCT-reduced log pitch(CRP) [8], loudness based chromagram(LBC) [9], Mel PCP(MPCP) [10]. For auto chord analysis, the most popular feature is a chromagram, also known as chroma vectors or Pitch Class Profile (PCP).…”
Section: Related Work (mentioning)
confidence: 99%
“…Besides templates-fitting methods, it is widely used machine-learning methods such as hidden Markov Model (HMM) [16][17][18][19][20] and DBNs(Dynamic Bayesian Networks) [7,9] for this recognition process.…”
Section: Related Work (mentioning)
confidence: 99%
“…The first one has been used in editions 2010, 2011 and 2012 and is named "Isophonics" [3]. It consists for the main part of Beatles songs (180) with some additional songs by Queen (19) and Zweieck (18). The second one has only been used in 2012 and is called "Billboard" [4].…”
Section: A New Look At Existing Data (mentioning)
confidence: 99%
“…The features x that we use for the calculation of the acoustic probabilities P (xt|st, kt, ct) are the Loudness Based Chromagrams as developed by Ni et al [7]. These are 24-dimensional vectors that represent the loudness of each of the 12 pitch classes in both the treble and the bass spectrum.…”
Section: A Probabilistic Framework For the Joint Estimation Of Struct (mentioning)
confidence: 99%
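
The 24-dimensional layout described in the last statement (12 pitch classes each for the treble and bass ranges) can be illustrated with a minimal sketch. This is not the loudness-based chromagram of the paper: the perceptual loudness weighting is omitted, and the names `chroma24`, `pitch_class`, the 220 Hz split point `SPLIT_HZ`, and the treble-first index order are all assumptions made for illustration only.

```python
import math

A4_HZ = 440.0     # reference tuning (assumption)
SPLIT_HZ = 220.0  # hypothetical bass/treble boundary, not taken from the source


def pitch_class(freq_hz):
    """Map a frequency to a pitch class 0..11 (0 = C) via its nearest MIDI note."""
    midi = round(69 + 12 * math.log2(freq_hz / A4_HZ))
    return midi % 12


def chroma24(partials):
    """Fold (frequency_hz, loudness) pairs into a 24-dimensional vector.

    Assumed layout: indices 0..11 hold the treble pitch classes,
    indices 12..23 hold the bass pitch classes.
    """
    vec = [0.0] * 24
    for freq_hz, loudness in partials:
        offset = 12 if freq_hz < SPLIT_HZ else 0
        vec[offset + pitch_class(freq_hz)] += loudness
    return vec


# One frame: a low A in the bass plus C, E and G in the treble.
frame = chroma24([(110.0, 0.8), (261.63, 0.5), (329.63, 0.4), (392.0, 0.3)])
```

Here the low A lands in the bass half of the vector and the C, E, G partials land in the treble half, which is the kind of separation that lets a model estimate bass notes and chords from the same feature.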