2020
DOI: 10.5334/tismir.41

Unveiling the Hierarchical Structure of Music by Multi-Resolution Community Detection

Abstract: Human perception of musical structure is supposed to depend on the generation of hierarchies, which is inherently related to the actual organisation of sounds in music. Musical structures are indeed best retained by listeners when they form hierarchical patterns, with consequent implications on the appreciation of music and its performance. The automatic detection of musical structure in audio recordings is one of the most challenging problems in the field of music information retrieval, since even human exper…

Cited by 13 publications
(16 citation statements)
References 29 publications
(47 reference statements)
“…where $x_i \in X$ is a column vector; $n, m \in [0 : d - 1]$; 13 is a normalisation factor (the maximum TPS value); and the subtraction from 1 is used to obtain a similarity score from a distance measure. Self-similarity matrices have been extensively used for structure analysis, due to their ability to reveal nested structural elements [27,17]. As can be seen from Figure 2, block-like structures are observed when the underlying sequence shows homogeneous features over the duration of the corresponding segment. Often, such a homogeneous segment is followed by another homogeneous segment that stands in contrast to the previous one.…”
Section: S(𝑛
Type: mentioning
confidence: 99%
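
The citation statement above describes turning a distance into a similarity by normalising and subtracting from 1. The quoted formula itself is cut off, so the following is only a minimal sketch under the assumption that it has the form S(n, m) = 1 - dist(x_n, x_m) / 13. The names `self_similarity` and `toy_distance` are hypothetical, and `toy_distance` is a stand-in (a capped L1 distance), not the TPS distance used by the citing paper.

```python
from typing import Callable, List, Sequence

# Normalisation factor quoted in the citation statement (the maximum TPS value).
MAX_TPS = 13.0


def self_similarity(
    frames: Sequence[Sequence[float]],
    distance: Callable[[Sequence[float], Sequence[float]], float],
    max_distance: float = MAX_TPS,
) -> List[List[float]]:
    """Return S with S[n][m] = 1 - distance(x_n, x_m) / max_distance."""
    d = len(frames)
    return [
        [1.0 - distance(frames[n], frames[m]) / max_distance for m in range(d)]
        for n in range(d)
    ]


def toy_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical stand-in for the TPS distance: L1 distance capped at 13."""
    return min(MAX_TPS, sum(abs(p - q) for p, q in zip(a, b)))


# Two homogeneous segments of two frames each, in contrast with one another.
frames = [[0.0, 0.0], [0.0, 0.0], [6.0, 7.0], [6.0, 7.0]]
S = self_similarity(frames, toy_distance)

for row in S:
    print(row)
```

With this toy input, frames inside the same segment score 1.0 against each other and frames from different segments score 0.0, which gives the block-like pattern along the diagonal that the statement mentions.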
“…Classical music form analysis is a difficult task well-known to musicians and audio researchers alike, given the overall complexity of musical form design and the theoretical knowledge required to analyze pieces of music both at the phrase- and part-level, and large form classification. While there have been several various attempts utilizing novelty methods on audio features [6], community detection algorithms [1], and neural networks [7], none have proven to be sufficient for truly complex analysis tasks. This issue is especially true for more advanced musical forms such as sonata-allegro form where repetition of musical features is not nearly as clear as simpler Recurrent Neural Networks (RNNs) [7], resulting in an upper limit to the accuracy a model could achieve regardless of the training material [5].…”
Section: Motivation
Type: mentioning
confidence: 99%
“…This article presents a discussion on the challenge of automatic musical structure detection in audio recordings and the issue of most current algorithms being only able to produce flat segmentations which lack the ability to segment the music across multiple levels to reveal the piece's hierarchical structure [1]. The authors propose a new approach based on multi-resolution community detection and graph theory to create a new unsupervised learning task, which can perform both boundary detection and structural grouping without the need for specified constraints that limit the output segmentation.…”
Section: De Berardinis Vamvakaris Cangelosi and Coutinho 2020
Type: mentioning
confidence: 99%
“…Devising a suitable feature for MSA is challenging, since so many aspects of music-including pitch, timbre, rhythm, and dynamics-impact the perception of structure [4]. Some methods have aimed to combine input from multiple features [5], but this requires care: MSA researchers have long been aware that structure at different timescales can be reflected best by different features (see, e.g., [6]).…”
Section: Introduction
Type: mentioning
confidence: 99%