2012
DOI: 10.48550/arxiv.1206.6392
Preprint
Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription

Cited by 52 publications (78 citation statements); references 0 publications.
“…In this section, we evaluate the performance of our approach and compare it to state-of-the-art unitary recurrent models such as uRNN [2], euRNN [4], fcuRNN [3], expRNN [10], nnRNN [12], and RNN [52]. We focus on four learning tasks that are commonly used for benchmarking: the copy task [53], the polyphonic music task on the JSB and MuseData datasets [54,55], the TIMIT speech prediction problem [56], and the character-level prediction task on the PTB dataset [57]. We chose these tasks because they demand long-term memory capabilities and relatively high expressivity from the modeling architecture.…”
Section: Methods
confidence: 99%
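The copy task mentioned in the excerpt above is simple to reproduce. Below is a minimal sketch of the usual data layout from the unitary-RNN literature (the function name, 8-symbol vocabulary, and exact marker position are illustrative assumptions, not the cited papers' code): the input holds a short symbol prefix, a long run of blanks, and a recall marker; the target asks the model to reproduce the prefix after the delay.

```python
import numpy as np

def make_copy_task(batch, seq_len=10, delay=100, n_symbols=8, seed=0):
    """Generate input/target pairs for the copy memory task.

    Input:  [s_1..s_seq_len, blank * (delay - 1), marker, blank * seq_len]
    Target: [blank * (seq_len + delay), s_1..s_seq_len]
    where blank = n_symbols and marker = n_symbols + 1.
    """
    rng = np.random.default_rng(seed)
    blank, marker = n_symbols, n_symbols + 1
    total = seq_len + delay + seq_len
    x = np.full((batch, total), blank, dtype=np.int64)
    y = np.full((batch, total), blank, dtype=np.int64)
    symbols = rng.integers(0, n_symbols, size=(batch, seq_len))
    x[:, :seq_len] = symbols                   # prefix to memorize
    x[:, seq_len + delay - 1] = marker         # recall cue after the delay
    y[:, seq_len + delay:] = symbols           # model must emit the prefix here
    return x, y

x, y = make_copy_task(4)
print(x.shape, y.shape)  # (4, 120) (4, 120)
```

Accuracy on the final `seq_len` positions is what distinguishes long-memory architectures: a memoryless baseline can only predict blanks.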
“…However, our proposed R-Transformer, which leverages LocalRNN to incorporate local information, has achieved better performance than TCN (Bai et al., 2018). Next, we evaluate R-Transformer on the task of polyphonic music modeling with the Nottingham dataset (Boulanger-Lewandowski et al., 2012). This dataset collects British and American folk tunes and has been commonly used in previous works to investigate a model's ability for polyphonic music modeling (Boulanger-Lewandowski et al., 2012; Chung et al., 2014; Bai et al., 2018).…”
Section: Pixel-by-pixel MNIST: Sequence Classification
confidence: 99%
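Polyphonic music modeling on these datasets is conventionally scored as frame-level negative log-likelihood over a binary piano-roll: at each time step the model predicts, independently per pitch, whether that note is active. A minimal sketch of that loss (the 88-pitch range, function name, and chance-level example are illustrative assumptions, not the benchmark's exact code):

```python
import numpy as np

def pianoroll_nll(probs, roll, eps=1e-7):
    """Mean per-timestep negative log-likelihood of a binary piano-roll.

    probs: (T, P) predicted note-on probabilities in (0, 1)
    roll:  (T, P) ground-truth 0/1 piano-roll
    Each pitch is treated as an independent Bernoulli at every frame.
    """
    probs = np.clip(probs, eps, 1 - eps)
    ll = roll * np.log(probs) + (1 - roll) * np.log(1 - probs)
    return -ll.sum(axis=1).mean()  # sum over pitches, average over time

# Chance-level predictor on a random roll with ~5% active notes:
rng = np.random.default_rng(0)
roll = (rng.random((64, 88)) < 0.05).astype(float)
print(pianoroll_nll(np.full((64, 88), 0.05), roll))
```

Lower values indicate better modeling of which notes co-occur and how they evolve over time, which is why this benchmark stresses both memory and expressivity.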
“…More generally, in both music and speech, various combinations of recurrent and convolutional neural networks have been successfully adopted in audio signal processing and Music Information Retrieval (MIR) applications. [15] applies RNNs coupled with restricted Boltzmann machines to polyphonic pitch transcription. In [16], a Convolutional Gated Recurrent Unit (CGRU), built on the GRU [17], an adaptation of RNNs that addresses the vanishing-gradient problem, estimates the main melody in polyphonic audio signals pre-processed using the Constant-Q Transform (CQT) followed by Non-negative Matrix Factorization [18].…”
Section: Deep Learning Research in Music Signal Processing
confidence: 99%