2019
DOI: 10.1109/msp.2018.2868887

Cross-Modal Music Retrieval and Applications: An Overview of Key Methodologies

Cited by 51 publications (31 citation statements)
References 21 publications (42 reference statements)

“…Because data of different modalities can be treated as identical data in a joint-embedding space and trained under a common metric, deep metric learning and joint-embedding techniques perform well together. In MIR-related tasks, deep metric learning succeeds in learning joint representations over several modalities such as a vocal and mix [23], vocal imitation and sound recording [24], [25], animal sounds [26], sheet music and audio spectrograms [27], music and image [28]-[31], and music and video [21], [22]. The target pair for the metric learning described in this paper consists of a vocal track and an accompaniment track.…”
Section: B Self-supervised and Joint-embedding Techniques (mentioning)
confidence: 99%
“…The Erkomaishvili dataset can be used to address a wide range of research questions including technical as well as musicological ones. For example, a cappella vocal music is a challenging scenario for various MIR tasks such as F0 estimation (Salamon et al., 2014), onset detection (Böck et al., 2012), and score-to-audio alignment (Thomas et al., 2012; Arzt, 2016; Müller et al., 2019). In particular, the not equal-tempered nature of the Georgian songs and the characteristic pitch slides in traditional Georgian singing constitute challenging test scenarios for MIR algorithms.…”
Section: Applications for MIR and Musicology (mentioning)
confidence: 99%
“…Content-based systems can be further categorized according to the modalities involved. For an overview of multi-modal music retrieval scenarios, we refer to a survey by Müller et al [34]. In our contribution, we focus on retrieval scenarios, where both query and database documents are audio recordings.…”
Section: Related Work (mentioning)
confidence: 99%