2020
DOI: 10.48550/arxiv.2004.01525
Preprint

Towards democratizing music production with AI - Design of Variational Autoencoder-based Rhythm Generator as a DAW plugin

Nao Tokui

Abstract: There has been significant progress in music generation techniques that use deep learning. However, it is still hard for musicians and artists to use these techniques in their daily music-making practice. This paper proposes a Variational Autoencoder (VAE) [8] based rhythm generation system, in which musicians can train a deep learning model simply by selecting target MIDI files and then generate various rhythms with the model. The author has implemented the system as plugin software for a DAW (Digital Audio Work…

Citations: cited by 1 publication (3 citation statements)
References: 5 publications
“…Several added two more instruments (usually mapped to cymbals), and a few incorporated three more instruments (in general mapped to toms, cowbell, and percussion), adding up to a total of nine instruments. This number of instruments is the same found in previous research on drum sound classification (Herrera, Yeterian, & Gouyon, 2002), and used in implementations by Tokui (2020) and Gillick et al (2019). As a result, our chosen data representation for the encoding of one bar of 4/4 time comprises three vectors (for onsets, velocities, and microtimings) of size 864.…”
Section: Neural Network Architecture (mentioning)
confidence: 80%
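The representation described in the citation statement above (one bar of 4/4 encoded as three vectors of size 864, for onsets, velocities, and microtimings) can be sketched as follows. This is a minimal illustration and not the cited authors' code. The statement gives only the total size and the nine instruments, so the split into 9 instruments × 96 time steps per bar, the step-major flattening order, and all names (`empty_bar`, `flat_index`, `add_hit`) are assumptions made for the example.

```python
# Assumption: 864 = 9 instruments x 96 time steps per 4/4 bar.
# The source states only the vector size (864) and the instrument count (9).
NUM_INSTRUMENTS = 9
STEPS_PER_BAR = 96
VECTOR_SIZE = NUM_INSTRUMENTS * STEPS_PER_BAR


def empty_bar():
    """Return one silent bar as three flat vectors of equal length."""
    return {
        "onsets": [0.0] * VECTOR_SIZE,
        "velocities": [0.0] * VECTOR_SIZE,
        "microtimings": [0.0] * VECTOR_SIZE,
    }


def flat_index(instrument, step):
    """Map (instrument, step) to a flat position (hypothetical step-major layout)."""
    if not (0 <= instrument < NUM_INSTRUMENTS and 0 <= step < STEPS_PER_BAR):
        raise IndexError("instrument or step out of range")
    return step * NUM_INSTRUMENTS + instrument


def add_hit(bar, instrument, step, velocity, microtiming=0.0):
    """Write one drum hit into all three vectors at the same flat position."""
    i = flat_index(instrument, step)
    bar["onsets"][i] = 1.0
    bar["velocities"][i] = velocity
    bar["microtimings"][i] = microtiming


bar = empty_bar()
add_hit(bar, instrument=0, step=0, velocity=0.8)                      # hit on the downbeat
add_hit(bar, instrument=1, step=24, velocity=0.6, microtiming=0.05)   # slightly late hit
```

Keeping the three vectors index-aligned means a single flat position carries the onset flag, the velocity, and the timing offset of the same hit.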
“…The application is still being tested and is not open to the public. Finally, M4L.RhythmVAE (Tokui, 2020) is a rhythm generation system that encodes onsets, velocities, and microtimings. It is based on GrooVAE but has a much simpler network architecture for faster training.…”
Section: Rhythm and Latent Spaces (mentioning)
confidence: 99%