“…To achieve scalability and high-speed similarity retrieval, we built a search engine with distributed indices, which we call Keren [7]. All queries are processed in parallel with multiple threads that are generated when Keren is initialized.…”
Section: Retrieval Engine and Similarity Measurement
mentioning
confidence: 99%
“…The previous version of SoundCompass [7,8] needed a metronome to time-normalize the humming data. The SoundCompass does not need one because song data and hummings are time-normalized automatically.…”
This paper describes our practical query-by-humming system, SoundCompass, which is being used as a karaoke song selection system in Japan. First, we describe the fundamental techniques employed by SoundCompass such as normalization in a time-wise sense of music data, time-scalable and tone-shiftable time-series data, and making subsequences for efficient matching. Second, we describe techniques to make effective feature vectors based on real music data and do matching with them to develop accurate query-by-humming. Third, we share valuable knowledge that has been obtained through months of practical use of SoundCompass. Fourth, we describe the latest version of the SoundCompass system that incorporates these new techniques and knowledge, as well as describe quantitative evaluations that prove the practicality of SoundCompass. The new system provides flexible and accurate similarity retrieval based on k-nearest neighbor searches with multi-dimensional spatial indices structured with multi-dimensional feature vectors.
“…When users want to locate specific music data from the huge volumes contained in music databases, they usually input bibliographic keywords, such as title and artist. When they do not have any bibliographic keywords, they can use content-based music-retrieval systems that enable them to find data by singing the song, typing parts of the lyrics, or humming the tune [1], [2], [3], [4]. However, these systems are ineffective if they do not specify the exact music data they want to find.…”
Impression-based music retrieval is the best way to find pieces of music that suit the preferences, senses, or mental states of users. A natural language interface (NLI) is more useful and effective than a graphical user interface for impression-based music retrieval since an NLI interprets users' spontaneous input sentences to represent musical impressions and generates query vectors for music retrieval. Existing impression-based music retrieval systems, however, have no dialogue capabilities for modifying the most recently used query vector. We evaluated a natural language dialogue system we developed that deals not only with 164 impression words but also with 14 comparative expressions, such as "a little more" and "more and more," and, if necessary, modifies the most recently used query vector through a dialogue. We also evaluated performance using 35 participants to determine the effectiveness of our dialogue system.