A Novel Query-by-Singing/Humming Method by Estimating Matching Positions Based on Multi-layered Perceptron
Abstract:The increase in the number of music files in smart phone and MP3 player makes it difficult to find the music files which people want. So, Query-by-Singing/Humming (QbSH) systems have been developed to retrieve music from a user's humming or singing without having to know detailed information about the title or singer of song. Most previous researches on QbSH have been conducted using musical instrument digital interface (MIDI) files as reference songs. However, the production of MIDI files is a time-consuming … Show more
“…A fast and scalable baseline system has been designed to demonstrate the efficiency of the proposed retrieval method in juxtaposition. The architecture for the baseline is a simple Multi Layer Perceptron (MLP), composed of 5 layers with each layer fully connected to the previous layer [10]. The last layer of the network is a softmax classifier, fully connected to the output of its previous layer.…”
A Query-By-Humming (QBH) system constitutes a particular case of music information retrieval where the input is a user-hummed melody and the output is the original song which contains that melody. A typical QBH system consists of melody extraction and candidate melody retrieval.For melody extraction, accurate note transcription is the key enabling technology. However, current transcription methods are unable to definitively capture the melody and address inaccuracies in user-hummed queries. In this paper, we incorporate Total Variation Regularization (TVR) to denoise queries. This approach accounts for user error in humming without loss of meaningful data and reliably captures the underlying melody.For candidate melody retrieval, we employ a deep learning approach to time series classification using a Fully Convolutional Neural Network. The trained network classifies the incoming query as belonging to one of the target songs.For our experiments, we use Roger Jang's MIR-QBSH dataset which is the standard MIREX dataset. We demonstrate that inclusion of TVR denoised queries in the training set enhances the overall accuracy of the system to 93% which is higher than other state-ofthe-art QBH systems.
“…A fast and scalable baseline system has been designed to demonstrate the efficiency of the proposed retrieval method in juxtaposition. The architecture for the baseline is a simple Multi Layer Perceptron (MLP), composed of 5 layers with each layer fully connected to the previous layer [10]. The last layer of the network is a softmax classifier, fully connected to the output of its previous layer.…”
A Query-By-Humming (QBH) system constitutes a particular case of music information retrieval where the input is a user-hummed melody and the output is the original song which contains that melody. A typical QBH system consists of melody extraction and candidate melody retrieval.For melody extraction, accurate note transcription is the key enabling technology. However, current transcription methods are unable to definitively capture the melody and address inaccuracies in user-hummed queries. In this paper, we incorporate Total Variation Regularization (TVR) to denoise queries. This approach accounts for user error in humming without loss of meaningful data and reliably captures the underlying melody.For candidate melody retrieval, we employ a deep learning approach to time series classification using a Fully Convolutional Neural Network. The trained network classifies the incoming query as belonging to one of the target songs.For our experiments, we use Roger Jang's MIR-QBSH dataset which is the standard MIREX dataset. We demonstrate that inclusion of TVR denoised queries in the training set enhances the overall accuracy of the system to 93% which is higher than other state-ofthe-art QBH systems.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.