2021
DOI: 10.48550/arxiv.2110.09000
Preprint

Supervised Metric Learning for Music Structure Features

Abstract: Music structure analysis (MSA) methods traditionally search for musically meaningful patterns in audio: homogeneity, repetition, novelty, and segment-length regularity. Hand-crafted audio features such as MFCCs or chromagrams are often used to elicit these patterns. However, with more annotations of section labels (e.g., verse, chorus, bridge) becoming available, one can use supervised feature learning to make these patterns even clearer and improve MSA performance. To this end, we take a supervised metric lea…

Citations: cited by 1 publication (3 citation statements)
References: 18 publications (29 reference statements)
“…Segmentation is usually based on criteria such as homogeneity, novelty, repetition and regularity [1]. When performed algorithmically, MSA often relies on similarity criteria within passages of a song summarized in an autosimilarity matrix [2][3][4][5][6][7][8], in which each coefficient represents an estimation of the similarity between two musical fragments.…”
Section: Introduction (mentioning)
confidence: 99%
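
As a rough illustration of the autosimilarity matrix described in the statement above, the sketch below computes the similarity between every pair of frame-level feature vectors. The function name `autosimilarity`, the toy feature values, and the choice of cosine similarity are assumptions made for illustration; the cited works may use other features and other similarity measures.

```python
import numpy as np


def autosimilarity(features: np.ndarray) -> np.ndarray:
    """Return an (n_frames, n_frames) cosine autosimilarity matrix.

    `features` has shape (n_frames, n_dims); each row describes one
    musical fragment (hypothetical toy features here, not real audio).
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    # Guard against division by zero for silent (all-zero) frames.
    unit = features / np.maximum(norms, 1e-12)
    # Entry (i, j) is the cosine similarity between fragments i and j.
    return unit @ unit.T


# Made-up input: frames 0 and 1 are alike, frame 2 is different.
toy = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
S = autosimilarity(toy)
```

In this toy case `S[0, 1]` is larger than `S[0, 2]`, which is the kind of contrast a segmentation method would look for in the matrix.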
“…While similarity between two frames can be obtained from the feature representation of the signal, such as the STFT of the song [2], recent works try to design new representations of the original music, able to capture the similarity between two frames while maintaining a high level of dissimilarity between dissimilar frames [1][3][4][5][6][7][8]. This generally consists in projecting the data in a new feature space and computing the similarity in the feature space.…”
Section: Introduction (mentioning)
confidence: 99%
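
To make the idea of "projecting the data in a new feature space and computing the similarity in the feature space" concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the three toy frames are made up, and the projection matrix `W` is hand-picked to discard a nuisance dimension, standing in for a projection that a supervised method would learn from section labels.

```python
import numpy as np


def cosine_similarity_matrix(x: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `x`."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.maximum(norms, 1e-12)
    return unit @ unit.T


# Made-up frames: 0 and 1 belong to the same section, 2 to another one.
# The third dimension is a nuisance factor unrelated to the section.
frames = np.array([
    [1.0, 0.0, 5.0],
    [1.0, 0.0, -5.0],
    [0.0, 1.0, 5.0],
])

# Hypothetical projection (3 dims -> 2 dims) that drops the nuisance
# dimension; a learned embedding would play this role in practice.
W = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
])

raw_sim = cosine_similarity_matrix(frames)            # similarity in the input space
projected_sim = cosine_similarity_matrix(frames @ W)  # similarity in the new space
```

In the input space the same-section pair (0, 1) looks less similar than the cross-section pair (0, 2); after projection the ordering is reversed, so frames from the same section become similar and frames from different sections stay dissimilar.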