This paper discusses two sets of automatic musical genre classification experiments. Promising research directions are then proposed based on the results of these experiments. The first set of experiments was designed to examine the utility of combining features extracted from separate and independent audio, symbolic and cultural sources of musical information. The results from this set of experiments indicate that combining feature types can indeed substantively improve classification accuracy as well as reduce the seriousness of those misclassifications that do occur. The second set of experiments examined which high-level features were most important in successfully classifying symbolic data. It was found that features associated with instrumentation were particularly effective. The paper also presents the jMIR toolset, which was used to carry out these experiments and which is particularly well suited to combining information extracted from different types of data sources. jMIR is a free and open-source software suite designed for applications related to automatic music classification of various kinds.
Every type of musical data (audio, symbolic, lyrics, etc.) has its limitations, and cannot always capture all relevant properties of a particular musical category. In contrast to more typical MIR setups where supervised classification models are trained on only one or two types of data, we propose a more diversified approach to music classification and analysis based on six modalities: audio signals, semantic tags inferred from the audio, symbolic MIDI representations, album cover images, playlist co-occurrences, and lyric texts. Some of the descriptors we extract from these data are low-level, while others encapsulate interpretable semantic knowledge that describes melodic, rhythmic, instrumental, and other properties of music. With the intent of measuring the individual impact of different feature groups on different categories, we propose two evaluation criteria based on "non-dominated hypervolumes": multi-group feature "importance" and "redundancy". Both of these are calculated after the application of a multi-objective feature selection strategy using evolutionary algorithms, with a novel approach to optimizing trade-offs between both "pure" and "mixed" feature subsets. These techniques permit an exploration of how different modalities and feature types contribute to class discrimination. We use genre classification as a sample research domain to which these techniques can be applied, and present exploratory experiments on two disjoint datasets of different sizes, involving three genre ontologies of varied class similarity. Our results highlight the potential of combining features extracted from different modalities, and can provide insight on the relative significance of different modalities and features in different contexts.
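As a rough illustration of the quantity behind the "non-dominated hypervolume" criteria mentioned above, the following minimal sketch computes the hypervolume of a two-objective non-dominated front. It is not the authors' implementation: the choice of two minimized objectives (for example, classification error and number of selected features), the reference point, and the function names `non_dominated` and `hypervolume_2d` are all assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def non_dominated(points: List[Point]) -> List[Point]:
    """Return the points not dominated by any other point.

    Assumption: both objectives are minimized. A point q dominates p
    when q is no worse in both objectives and differs from p.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    # Remove duplicates and sort by the first objective (ascending);
    # on a non-dominated front the second objective then descends.
    return sorted(set(front))


def hypervolume_2d(points: List[Point], reference: Point) -> float:
    """Area dominated by the front and bounded by a reference point."""
    front = [
        p for p in non_dominated(points)
        if p[0] < reference[0] and p[1] < reference[1]
    ]
    volume = 0.0
    prev_y = reference[1]
    for x, y in front:
        # Add the horizontal strip between this point and the previous one.
        volume += (reference[0] - x) * (prev_y - y)
        prev_y = y
    return volume


# Hypothetical (objective 1, objective 2) values for four feature subsets.
candidates = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
front = non_dominated(candidates)
area = hypervolume_2d(candidates, reference=(5.0, 5.0))
```

A larger hypervolume means the front covers more of the objective space relative to the reference point. One plausible use, again an assumption rather than the paper's definition, is to compare the hypervolume obtained with and without a given feature group.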
This chapter includes a critical review of existing file formats that have been used in MIR research. This is followed by a set of design priorities that are proposed for use in developing new formats and improving existing ones. The details of the ACE XML specification are then described in this context. Finally, research priorities for the future are discussed, as well as possible uses for ACE XML outside the specific domain of MIR.