2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018
DOI: 10.1109/icassp.2018.8461914
Towards Complete Polyphonic Music Transcription: Integrating Multi-Pitch Detection and Rhythm Quantization

Abstract: Most work on automatic transcription produces "piano roll" data with no musical interpretation of the rhythm or pitches. We present a polyphonic transcription method that converts a music audio signal into a human-readable musical score, by integrating multi-pitch detection and rhythm quantization methods. This integration is made difficult by the fact that the multi-pitch detection produces erroneous notes such as extra notes and introduces timing errors that are added to temporal deviations due to musical ex…

Cited by 43 publications (44 citation statements)
References 17 publications
“…A dataset that best fits this task is the recently published ASAP dataset [18], which will be investigated as future work. We collect scores in MusicXML format, convert them to MIDI files, and synthesize audio files with four piano models using the Native Instruments Kontakt Player 4 . The scores we collect cover various key and time signatures, tempos, modes and polyphony levels, but do not contain grace notes, triplets, arpeggios, trios or other complex playing techniques.…”
Section: Data
confidence: 99%
“…The recent literature has mainly focused on two approaches for complete transcription: 1) traditional methods transcribe music audio step by step in the order of subtasks [4,5], and 2) end-to-end …” (Footnote: L. Liu is a research student at the UKRI Centre for Doctoral Training in Artificial Intelligence and Music, supported jointly by the China Scholarship Council and Queen Mary University of London.)
Section: Introduction
confidence: 99%
“…To evaluate the proposed method, we calculated the pitch error rate Ep, the extra note rate Ee, the missing note rate Em, the onset-time error rate Eon, the offset-time error rate Eoff, and the overall error rate Eall [17] by comparing transcribed and corrected sequences with the ground-truth sequences. The musical naturalness was evaluated in terms of the rate of diatonic notes Rdn, because the majority of notes should be on a scale.…”
Section: Experimental Conditions
confidence: 99%
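The quoted passage lists note-level error rates (pitch, extra, missing) computed by aligning transcribed notes against ground truth. A minimal sketch of such rates is below; the greedy onset matching, the `onset_tol` value, and the exact normalization are assumptions for illustration, not the definitions used in [17].

```python
# Simplified note-level error rates, loosely in the spirit of the quoted
# evaluation (NOT the exact definitions in [17]).
# A note is a pair (onset_seconds, midi_pitch).

def note_error_rates(est, ref, onset_tol=0.05):
    """Greedily match each estimated note to the nearest unmatched
    reference note within `onset_tol` seconds, then compute error rates
    normalized by the number of reference notes."""
    ref_free = set(range(len(ref)))
    matched = []  # (est_index, ref_index) pairs
    for i, (on_e, _p_e) in enumerate(est):
        cand = [j for j in ref_free if abs(ref[j][0] - on_e) <= onset_tol]
        if cand:
            j = min(cand, key=lambda j: abs(ref[j][0] - on_e))
            ref_free.remove(j)
            matched.append((i, j))
    n_ref = len(ref)
    pitch_err = sum(1 for i, j in matched if est[i][1] != ref[j][1])
    extra = len(est) - len(matched)   # estimated notes with no match
    missing = len(ref_free)           # reference notes left unmatched
    return {
        "E_p": pitch_err / n_ref,  # pitch error rate
        "E_e": extra / n_ref,      # extra note rate
        "E_m": missing / n_ref,    # missing note rate
    }

ref = [(0.00, 60), (0.50, 64), (1.00, 67)]
est = [(0.01, 60), (0.52, 63), (1.30, 72)]  # wrong pitch, extra, miss
print(note_error_rates(est, ref))
```

Onset- and offset-time error rates (Eon, Eoff) would be computed analogously over the matched pairs, counting matches whose timing deviation exceeds a tolerance.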
“…[23,24]. Especially in piano transcription, results of multi-pitch detection contain a significant amount of spurious notes (false positives), which often make the transcription results unplayable [25]. By integrating the present piano-score model and an acoustic model (instead of the edit model) and applying the method for optimization developed in this study, one can impose constraints on performance difficulty of transcription results and reduce these spurious notes.…”
Section: Conclusion
confidence: 99%
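The quoted conclusion suggests pruning spurious (false-positive) notes by imposing performance-difficulty constraints on the transcription. A hypothetical, much-simplified illustration of that idea: discard notes that would stretch a simultaneous chord beyond what two hands can cover. The 14-semitone hand span and the greedy pruning rule are invented for this sketch, not taken from the cited work.

```python
# Hypothetical playability filter: prune pitches from a chord until it
# can be split into a left-hand and a right-hand group, each spanning
# at most `span` semitones (assumed hand span: 14 semitones).

def prune_unplayable(chord, span=14):
    """chord: list of MIDI pitches sounding simultaneously.
    Returns a playable subset (a crude stand-in for a real
    performance-difficulty model)."""
    chord = sorted(chord)

    def playable(c):
        # try every split point k: c[:k] is left hand, c[k:] is right hand
        return any(
            (c[k - 1] - c[0] <= span if k else True)
            and (c[-1] - c[k] <= span if k < len(c) else True)
            for k in range(len(c) + 1)
        )

    while len(chord) > 1 and not playable(chord):
        # drop whichever extreme pitch lies farther from the middle
        mid = chord[len(chord) // 2]
        if mid - chord[0] >= chord[-1] - mid:
            chord = chord[1:]   # drop lowest
        else:
            chord = chord[:-1]  # drop highest
    return chord

# A stray very high note (100) is pruned; the rest fits in two hands.
print(prune_unplayable([40, 60, 64, 67, 100]))
```

In the cited approach the constraint would come from a learned piano-score model rather than a fixed span, but the effect is the same: spurious detections that make the result unplayable are suppressed.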