2020
DOI: 10.48550/arxiv.2010.01815
Preprint

High-resolution Piano Transcription with Pedals by Regressing Onset and Offset Times

Abstract: Automatic music transcription (AMT) is the task of transcribing audio recordings into symbolic representations such as the Musical Instrument Digital Interface (MIDI). Recently, neural network-based methods have been applied to AMT and have achieved state-of-the-art results. However, most previous AMT systems predict only the presence or absence of notes in the frames of audio recordings, so their transcription resolution is limited to the hop-size time between adjacent frames. In addition, previous AM…
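The resolution limit described in the abstract can be made concrete with a small sketch. This is illustrative code, not the authors' implementation: the sample rate and hop size are assumed values, and the regression target shown is a simplified stand-in for the paper's continuous onset/offset regression.

```python
# Illustrative sketch (assumed parameters, not the authors' code):
# frame-wise prediction quantizes an onset to the nearest frame boundary,
# so its error is bounded by half the hop-size time; regressing a
# continuous within-frame residual removes that bound.

SAMPLE_RATE = 16000   # Hz, assumed for illustration
HOP_SIZE = 512        # samples between adjacent frames, assumed

def frame_quantized_onset(onset_sec: float) -> float:
    """Round an onset to the nearest frame boundary (frame-wise prediction)."""
    frame = round(onset_sec * SAMPLE_RATE / HOP_SIZE)
    return frame * HOP_SIZE / SAMPLE_RATE

def regression_target(onset_sec: float) -> tuple[int, float]:
    """Return the containing frame index plus the continuous within-frame
    residual (seconds) that a regression head would be trained to predict."""
    frame_len = HOP_SIZE / SAMPLE_RATE
    frame = int(onset_sec / frame_len)
    residual = onset_sec - frame * frame_len
    return frame, residual

onset = 1.2345
# Frame-wise error is at most half a frame (here 512 / 16000 / 2 = 16 ms).
print(abs(frame_quantized_onset(onset) - onset))
# With the residual, the exact time is recoverable: frame * frame_len + residual.
frame, res = regression_target(onset)
print(frame, res)
```

The design point is simply that the classification head discards the sub-frame timing information, while the regression head keeps it as a second, continuous output.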

Cited by 13 publications (12 citation statements). References 29 publications.
“…5 ms in Handel [34]). We leave open the possibility that our results could be improved further with finer event resolution, for example by predicting continuous times as in Kong et al [19].…”
Section: Inputs and Outputs
confidence: 92%
“…Kong et al [19] achieve higher transcription accuracy by using regression to predict precise continuous onset/offset times, using a similar network architecture to Hawthorne et al [3]. Kim & Bello [20] use an adversarial loss on the transcription output to encourage a transcription model to output more plausible piano rolls.…”
Section: Related Work 2.1 Piano Transcription
confidence: 99%
“…We use the GiantMIDI-Piano dataset [17], which includes 10,854 piano performances by 2,786 composers, transcribed from live recordings using [18] and encoded in the MIDI format. We use a 90/10/0 (train/validation/test) split, assigning every tenth file to the validation set.…”
Section: Dataset
confidence: 99%
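The split described in the excerpt above ("one file every ten" to validation) can be sketched in a few lines. This is a hypothetical reconstruction, not the citing authors' code; the file names are placeholders.

```python
# Hypothetical sketch of the 90/10/0 split described above: every tenth
# file is assigned to the validation set, the rest to training, and the
# test set is left empty. Which offset counts as "every tenth" is an
# assumption; here we take indices 9, 19, 29, ...

def split_every_tenth(files: list[str]) -> tuple[list[str], list[str]]:
    train, val = [], []
    for i, f in enumerate(files):
        (val if i % 10 == 9 else train).append(f)
    return train, val

files = [f"perf_{i:05d}.mid" for i in range(100)]  # placeholder names
train, val = split_every_tenth(files)
print(len(train), len(val))  # 90 10
```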
“…As a result, a large number of transcription models rely on hand-designed representations for piano transcription. For instance, the Onsets & Frames model (Hawthorne et al, 2017) uses dedicated outputs for detecting piano onsets and the note being played; Kelz et al (2019) represent the entire amplitude envelope of a piano note; and Kong et al (2020) additionally model piano foot pedal events (a piano-specific way of controlling a note's sustain). Single-instrument transcription models have also been developed for other instruments such as guitar (Xi et al, 2018) and drums (Cartwright & Bello, 2018; Callender et al, 2020), though these instruments have received less attention than piano.…”
Section: Music Transcription
confidence: 99%