The goal of music highlight extraction, or thumbnailing, is to extract a short consecutive segment of a piece of music that is representative of the whole piece. In previous work, we introduced an attention-based convolutional recurrent neural network that uses music emotion classification as a surrogate task for music highlight extraction, assuming that the most emotional part of a song usually corresponds to the highlight. This paper extends our previous work in two respects. First, methodology-wise, we experiment with a new architecture that does not need any recurrent layers, making the training process faster. Moreover, we compare a late-fusion variant and an early-fusion variant to study which one better exploits the attention mechanism. Second, we conduct and report an extensive set of experiments comparing the proposed attention-based methods to a heuristic energy-based method, a structural repetition-based method, and three other simple feature-based methods. Due to the lack of public-domain labeled data for highlight extraction, following our previous work we use the RWC-Pop 100-song data set to evaluate how the detected highlights overlap with any chorus sections of the songs. The experiments demonstrate superior effectiveness of our methods over the competing methods. For reproducibility, we share the code and the pre-trained model at https://github.com/remyhuang/pop-music-highlighter/.
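To make the energy-based baseline and the chorus-overlap evaluation concrete, here is a minimal sketch. It is not taken from the paper's code: the function names `energy_highlight` and `chorus_overlap_ratio`, the fixed window length, and the exact overlap definition (fraction of the highlight covered by chorus sections) are all assumptions for illustration.

```python
from itertools import accumulate


def energy_highlight(frame_energy, win_frames):
    """Pick the fixed-length window with the largest total energy.

    Hypothetical baseline: the abstract only says "heuristic energy-based
    method", so the window rule used here is an assumption.
    Returns (start, end) frame indices, end exclusive.
    """
    n = len(frame_energy)
    if not 0 < win_frames <= n:
        raise ValueError("win_frames must be between 1 and len(frame_energy)")
    # Prefix sums let us score every window in O(1).
    csum = [0.0] + list(accumulate(float(e) for e in frame_energy))
    best_start = max(
        range(n - win_frames + 1),
        key=lambda s: csum[s + win_frames] - csum[s],
    )
    return best_start, best_start + win_frames


def chorus_overlap_ratio(highlight, chorus_sections):
    """Fraction of the highlight covered by chorus sections.

    Assumes the chorus sections do not overlap each other and that all
    intervals are (start, end) pairs in the same unit as the highlight.
    """
    h_start, h_end = highlight
    covered = 0.0
    for c_start, c_end in chorus_sections:
        covered += max(0.0, min(h_end, c_end) - max(h_start, c_start))
    return covered / (h_end - h_start)
```

A typical use would be to compute per-frame energy for a song, call `energy_highlight` with a window matching the desired highlight length, and then score the result against annotated chorus intervals with `chorus_overlap_ratio`.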
Generating music medleys is about finding an optimal permutation of a given set of music clips. Toward this goal, we propose a self-supervised learning task, called the music puzzle game, to train neural network models to learn the sequential patterns in music. In essence, such a game requires machines to correctly sort a few multisecond music fragments. In the training stage, we learn the model by sampling multiple non-overlapping fragment pairs from the same songs and seeking to predict whether a given pair is consecutive and in the correct chronological order. For testing, we design a number of puzzle games with different difficulty levels, the most difficult one being music medley, which requires sorting fragments from different songs. Building on a state-of-the-art Siamese convolutional network, we propose an improved architecture that learns to embed frame-level similarity scores computed from the input fragment pairs into a common space, where fragment pairs in the correct order can be more easily identified. Our results show that the resulting model, dubbed the similarity embedding network (SEN), performs better than competing models across different games, including music jigsaw puzzle, music sequencing, and music medley. Example results can be found at our project website, https://remyhuang.github.io/DJnet.
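The pair-sampling step of the training stage can be sketched as follows. This is an illustrative assumption rather than the authors' procedure: the song is taken to be already cut into non-overlapping fragments indexed 0, 1, 2, ..., a pair (i, j) is labeled positive only when j immediately follows i, and the helper name `make_puzzle_pairs` is hypothetical.

```python
import random


def make_puzzle_pairs(num_fragments, num_negatives, rng=None):
    """Build labeled fragment-index pairs for one song.

    Label 1: the pair is consecutive and in chronological order (i, i + 1).
    Label 0: anything else, i.e. reversed or non-adjacent pairs.
    The negative sampling rule is an assumption for illustration.
    """
    if num_fragments < 2:
        raise ValueError("need at least two fragments to form a pair")
    rng = rng or random.Random(0)

    positives = [((i, i + 1), 1) for i in range(num_fragments - 1)]

    # Every ordered pair of distinct fragments that is not a positive.
    candidates = [
        (i, j)
        for i in range(num_fragments)
        for j in range(num_fragments)
        if i != j and j != i + 1
    ]
    chosen = rng.sample(candidates, min(num_negatives, len(candidates)))
    negatives = [(pair, 0) for pair in chosen]

    return positives + negatives
```

Each index pair would then be mapped to the corresponding audio fragments and fed to the pairwise model as a binary classification example.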
This study investigates the bandgap tunability of a metashaft with periodic shunted piezoelectric rings. An analytical model based on the Bloch theory and the transfer matrix technique is developed to predict bandgap characteristics and the transmission of torsional vibration in the proposed structure. The location and width of the bandgap can be easily tailored by altering the electric parameters of the shunt circuits. Compared with other shunt assemblies, the use of the negative capacitance resonant shunt enables the creation of a lower bandgap with a relatively wide bandwidth. Bandgap frequencies depend strongly on the inductance and capacitance but not on the resistance. Moreover, the frequency with an unbounded loss factor coincides with the frequency of maximum attenuation. Multiple bandgaps can be achieved by adding extra sets of shunt circuits to the metashaft. Theoretical results have been validated by comparing them with finite element results. Our findings provide feasible guidelines for the design of torsional active control systems.
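The combination of Bloch theory and transfer matrices can be illustrated with a much simpler, purely passive model. The sketch below assumes a unit cell made of uniform shaft segments with state vector (twist angle, torque) and omits the piezoelectric rings and shunt circuits entirely; in the study's setting the shunt would presumably enter through the properties of the piezoelectric segment, which is not modeled here. All function names and the dimensionless parameter values are hypothetical. For a 2x2 cell matrix with unit determinant, the Bloch wavenumber q satisfies cos(qL) = tr(T)/2, so a frequency lies in a bandgap when |tr(T)/2| > 1.

```python
import math


def segment_matrix(omega, length, shear_modulus, density, polar_moment):
    """Transfer matrix of a uniform shaft segment for torsional waves.

    Maps (angle, torque) at the left end to (angle, torque) at the right
    end. Requires omega > 0.
    """
    k = omega * math.sqrt(density / shear_modulus)  # torsional wavenumber
    z = shear_modulus * polar_moment * k            # G * J * k
    c, s = math.cos(k * length), math.sin(k * length)
    return ((c, s / z), (-z * s, c))


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def half_trace(omega, segments):
    """Return tr(T)/2 for the unit cell; equals cos(q * L) by Bloch theory.

    `segments` is a list of (length, shear_modulus, density, polar_moment)
    tuples, ordered from left to right.
    """
    cell = ((1.0, 0.0), (0.0, 1.0))
    for seg in segments:
        cell = matmul2(segment_matrix(omega, *seg), cell)
    return 0.5 * (cell[0][0] + cell[1][1])


def in_bandgap(omega, segments):
    """True when no real Bloch wavenumber exists at this frequency."""
    return abs(half_trace(omega, segments)) > 1.0
```

Sweeping `omega` over a frequency range and recording where `in_bandgap` is true gives the bandgap edges of this simplified cell; a uniform shaft (identical segments) has no bandgaps, which is a useful sanity check.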
A novel cardiokymograph system is introduced. The new system features a capacitance transducer with increased sensitivity and can be used in multichannel measurements. The novelty of the technique lies in injecting a current into the patient, combined with the use of a capacitive displacement transducer and the possibility of multichannel monitoring. It also makes it possible to remove breath noise when a signal processing technique, such as adaptive filtering, is used. Further investigation is needed to demonstrate its clinical significance and the pathologies to which it applies.