2022
DOI: 10.48550/arxiv.2206.10805
Preprint

Jointist: Joint Learning for Multi-instrument Transcription and Its Applications

Abstract: In this paper, we introduce Jointist, an instrument-aware multi-instrument framework that is capable of transcribing, recognizing, and separating multiple musical instruments from an audio clip. Jointist consists of the instrument recognition module that conditions the other modules: the transcription module that outputs instrument-specific piano rolls, and the source separation module that utilizes instrument information and transcription results. The instrument conditioning is designed for an explicit multii…
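
The module arrangement described in the abstract can be sketched as a simple data flow. This is only an illustrative outline, not the authors' implementation: every function name, the 88-pitch piano-roll height, the hop size, and the hard-coded instrument labels are assumptions, and each model is replaced by a placeholder that returns empty output of a plausible shape.

```python
import numpy as np

N_PITCHES = 88   # assumed piano-roll height (not stated in the abstract)
HOP = 512        # assumed hop size in samples (not stated in the abstract)


def recognize_instruments(audio):
    """Placeholder for the instrument recognition module (hypothetical labels)."""
    return ["piano", "bass"]


def transcribe(audio, instrument, n_frames):
    """Placeholder transcription module: one piano roll per conditioning instrument."""
    return np.zeros((N_PITCHES, n_frames), dtype=bool)


def separate(audio, instrument, piano_roll):
    """Placeholder separation module: uses the instrument label and its piano roll."""
    return np.zeros_like(audio)


def jointist_like_pipeline(audio):
    """Run recognition first, then condition the other two modules on each label."""
    n_frames = 1 + len(audio) // HOP
    results = {}
    for instrument in recognize_instruments(audio):
        roll = transcribe(audio, instrument, n_frames)
        source = separate(audio, instrument, roll)
        results[instrument] = (roll, source)
    return results
```

The point of the sketch is the ordering: recognition produces the labels that condition transcription, and separation receives both the label and the transcription result for that instrument.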

Cited by 2 publications (4 citation statements)
References 47 publications (71 reference statements)

“…The objective is to separate and extract the individual musical lines of each instrument, capturing their respective pitches, timings, and contributions to the overall musical texture. Recently, deep learning-based multiinstrument transcription [73][74][75][76][77] has been widely used and is crucial for analyzing and understanding the interactions between different instruments in a polyphonic musical piece. A recently published paper [73] jointly considers the instrument recognition module, the transcription module, and the source separation module, which is capable of transcribing, recognizing, and separating multiple musical instruments from the audio signal.…”
Section: Multi-instrument Transcription (mentioning)
confidence: 99%
“…Recently, deep learning-based multiinstrument transcription [73][74][75][76][77] has been widely used and is crucial for analyzing and understanding the interactions between different instruments in a polyphonic musical piece. A recently published paper [73] jointly considers the instrument recognition module, the transcription module, and the source separation module, which is capable of transcribing, recognizing, and separating multiple musical instruments from the audio signal. Similarly, the work in [74] adapts the concept of computer vision methods like multi-object detection and instance segmentation for multi-instrument note tracking.…”
Section: Multi-instrument Transcription (mentioning)
confidence: 99%
“…[17] MIT 128 SelfAtt Self-supervised In-house dataset, Cerberus4 1 , etc. 2022 Cheuk et al [18] MIT+SS 128 CRNN Supervised Slakh 1…”
Section: Related Work (mentioning)
confidence: 99%
“…For example, Manilow et al trained a model on both MIT and audio source separation (SS) and found that it performed better on both tasks than the respective single-task models [13]. Cheuk et al used a similar approach and also showed that "jointly trained music transcription and music source separation models are beneficial to each other" [18]. Conversely, Cartwright et al performed both DTM and beat detection on datasets suited for only one of these tasks in order to expand the total amount of training data [9].…”
Section: Tasks and Vocabulary (mentioning)
confidence: 99%