The general goal of music synchronization is to automatically align multiple information sources, such as audio recordings, MIDI files, or digitized sheet music, that relate to a given musical work. In computing such alignments, one typically has to face a delicate tradeoff between robustness and accuracy. In this paper, we introduce novel audio features that combine the high temporal accuracy of onset features with the robustness of chroma features. We show how previous synchronization methods can be extended to make use of these new features. We report on experiments with polyphonic Western music that demonstrate the improvements achieved by our proposed synchronization framework.
The extraction of local tempo and beat information from audio recordings constitutes a challenging task, particularly for music that exhibits significant tempo variations. Furthermore, the existence of various pulse levels such as measure, tactus, and tatum often makes the determination of absolute tempo problematic. In this paper, we present a robust mid-level representation that encodes local tempo information. Similar to the well-known concept of cyclic chroma features, where pitches differing by octaves are identified, we introduce the concept of cyclic tempograms, where tempi differing by a power of two are identified. We then describe how to derive cyclic tempograms from music signals using two different methods for periodicity analysis, and finally sketch some applications to tempo-based audio segmentation.
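The identification of tempi that differ by a power of two can be illustrated with a minimal sketch. It assumes that a tempogram has already been computed on a known tempo axis; the function name `cyclic_tempogram`, the reference tempo of 60 BPM, the choice of 12 bins per tempo octave, and the averaging of rows that share a bin are all illustrative assumptions and not necessarily the parameters or the aggregation used in the paper.

```python
import numpy as np

def cyclic_tempogram(tempogram, tempi_bpm, ref_bpm=60.0, n_bins=12):
    """Fold a tempogram so that tempi differing by a power of two coincide.

    tempogram : array of shape (n_tempi, n_frames), one row per tempo
    tempi_bpm : array of shape (n_tempi,), tempo of each row in BPM
    ref_bpm   : reference tempo (assumed value, chosen for illustration)
    n_bins    : number of bins per tempo octave (assumed value)
    """
    tempogram = np.asarray(tempogram, dtype=float)
    tempi_bpm = np.asarray(tempi_bpm, dtype=float)

    # Position of each tempo inside its tempo octave, in [0, 1).
    # Tempi tau and 2 * tau yield the same position.
    octave_pos = np.mod(np.log2(tempi_bpm / ref_bpm), 1.0)
    bin_idx = np.floor(octave_pos * n_bins).astype(int) % n_bins

    # Average all tempogram rows that fall into the same cyclic bin.
    cyclic = np.zeros((n_bins, tempogram.shape[1]))
    for b in range(n_bins):
        rows = tempogram[bin_idx == b]
        if rows.size:
            cyclic[b] = rows.mean(axis=0)
    return cyclic
```

With this folding, rows for 30, 60, 120, and 240 BPM all contribute to the same bin, in the same way that pitches an octave apart contribute to the same chroma bin.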