Proceedings of the 28th ACM International Conference on Multimedia 2020
DOI: 10.1145/3394171.3413721

PopMAG: Pop Music Accompaniment Generation

Abstract: In pop music, accompaniments are usually played by multiple instruments (tracks) such as drums, bass, strings and guitar, and can make a song more expressive and infectious when arranged together with its melody. Previous works usually generate the tracks separately, and the music notes from different tracks do not explicitly depend on each other, which hurts harmony modeling. To improve harmony, in this paper we propose a novel MUlti-track MIDI representation (MuMIDI), which enables simultaneous multi-t…
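The abstract's central idea is to encode all tracks in one token sequence so that notes from different tracks can directly condition on each other. The sketch below is a minimal, illustrative guess at what such a single-sequence multi-track encoding could look like; the token names (Bar, Pos, Track, Pitch, Dur), the ordering, and the toy note list are assumptions made for illustration, not the paper's actual MuMIDI vocabulary.

```python
# Illustrative sketch (not the paper's implementation): interleave notes from
# several tracks into one ordered token sequence.
notes = [
    # (bar, position_in_bar, track, pitch, duration) -- toy data
    (0, 0, "bass",   36, 4),
    (0, 0, "guitar", 60, 2),
    (0, 2, "guitar", 64, 2),
    (1, 0, "drums",  38, 1),
]

def encode_multitrack(notes):
    """Flatten multi-track notes into a single token sequence, emitting a Bar
    token on every bar change and a Track token on every track change, so that
    later tokens can attend to earlier tokens from any track."""
    tokens = []
    current_bar, current_track = None, None
    for bar, pos, track, pitch, dur in sorted(notes):
        if bar != current_bar:
            tokens.append(f"Bar_{bar}")
            current_bar, current_track = bar, None
        tokens.append(f"Pos_{pos}")
        if track != current_track:
            tokens.append(f"Track_{track}")
            current_track = track
        tokens.extend([f"Pitch_{pitch}", f"Dur_{dur}"])
    return tokens

print(encode_multitrack(notes))
# ['Bar_0', 'Pos_0', 'Track_bass', 'Pitch_36', 'Dur_4',
#  'Pos_0', 'Track_guitar', 'Pitch_60', 'Dur_2',
#  'Pos_2', 'Pitch_64', 'Dur_2',
#  'Bar_1', 'Pos_0', 'Track_drums', 'Pitch_38', 'Dur_1']
```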

Cited by 52 publications (5 citation statements) · References 12 publications
“…An effective musical representation is essential for learning different music-related tasks, such as music classification [7,27,36,48,55,56], cover song identification [56,58,62,63], and music generation [17,18,28,40]. Most of them rely on large amounts of labeled data to learn music representations.…”
Section: Music Representation Learning (mentioning)
confidence: 99%
“…• embedding pooling, such as Compound Word (Hsiao et al., 2021), Octuple (Zeng et al., 2021), PopMAG (Ren et al., 2020), SymphonyNet (Liu et al., 2022) or MMT (Dong et al., 2023). Embeddings of several tokens are merged with a pooling operation.…”
Section: Sequence Length Reduction Strategies (mentioning)
confidence: 99%
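As a rough illustration of the "embedding pooling" strategy described in the excerpt above, the following sketch embeds the sub-tokens of a compound musical event separately and merges them into a single vector, so the model sees one sequence position per event rather than per sub-token. The class name, vocabulary sizes, and the choice of summation as the pooling operation are assumptions for illustration, not taken from any of the cited papers.

```python
import torch
import torch.nn as nn

class PooledTokenEmbedding(nn.Module):
    """Embed each sub-token (e.g. pitch, duration, velocity) of a compound
    event with its own table, then pool the embeddings into one vector."""

    def __init__(self, vocab_sizes, d_model=512):
        super().__init__()
        self.embeddings = nn.ModuleList(
            nn.Embedding(v, d_model) for v in vocab_sizes
        )

    def forward(self, compound_tokens):
        # compound_tokens: (batch, seq_len, num_subtokens) integer ids
        pooled = sum(
            emb(compound_tokens[..., i]) for i, emb in enumerate(self.embeddings)
        )
        return pooled  # (batch, seq_len, d_model): one vector per event

# Toy usage: three sub-token vocabularies (e.g. pitch, duration, velocity).
layer = PooledTokenEmbedding(vocab_sizes=[128, 64, 32], d_model=512)
events = torch.randint(0, 32, (2, 16, 3))  # ids kept below the smallest vocab
print(layer(events).shape)  # torch.Size([2, 16, 512])
```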
“…the quality of generated music or the accuracy of classification tasks, and 2) the efficiency of the models. The former is tackled with more expressive representations (Huang and Yang, 2020; Kermarec et al., 2022; von Rütte et al., 2023; Fradet et al., 2021), and the latter by representations based on either token combinations (Payne, 2019; Donahue et al., 2019) or embedding pooling (Hsiao et al., 2021; Zeng et al., 2021; Ren et al., 2020; Dong et al., 2023), which reduce the overall sequence length.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, with recent advancements in the field of speech synthesis, deep learning-based approaches have gained significant traction. This means that instead of relying on the conventional approach comprising multiple subprocesses, there has been a notable shift toward the development of end-to-end TTS technology (Ren et al., 2020), which is supported by trained models.…”
Section: Introduction (mentioning)
confidence: 99%