A natural way of handling imbalanced data is to attempt to equalise the class frequencies and train the classifier of choice on balanced data. For two-class imbalanced problems, the classification success is typically measured by the geometric mean (GM) of the true positive and true negative rates. Here we prove that GM can be improved upon by instance selection, and give the theoretical conditions for such an improvement. We demonstrate that GM is non-monotonic with respect to the number of retained instances, which discourages systematic instance selection. We also show that balancing the distribution frequencies is inferior to a direct maximisation of GM. To verify our theoretical findings, we carried out an experimental study of 12 instance selection methods for imbalanced data, using 66 standard benchmark data sets. The results reveal possible room for new instance selection methods for imbalanced data.
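The GM measure named in this abstract is the square root of the product of the true positive rate and the true negative rate. The sketch below is an illustration only, not code from the study: the function name `geometric_mean`, the label encoding (1 for the positive class) and the toy labels are all assumptions. It shows why GM is preferred to accuracy on imbalanced data: a classifier that always predicts the majority class scores well on accuracy but gets a GM of zero.

```python
import math

def geometric_mean(y_true, y_pred, positive=1):
    """Return sqrt(TPR * TNR) for a two-class problem.

    `positive` is the label treated as the positive (minority) class;
    every other label is treated as negative. Hypothetical helper.
    """
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # Guard against an empty class so the rates are always defined.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)

# Toy data (assumed): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Always predicting the majority class: accuracy 0.8, but TPR = 0.
all_negative = [0] * 10

# A classifier that finds one positive at the cost of two false alarms:
# TPR = 1/2, TNR = 6/8, accuracy 0.7.
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]

gm_all_negative = geometric_mean(y_true, all_negative)
gm_mixed = geometric_mean(y_true, mixed)
```

Here the second classifier has lower accuracy than the first but a strictly higher GM, since GM rewards performance on both classes at once.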
This study brings together systematised views of two related areas: data editing for the nearest neighbour classifier and adaptive learning in the presence of concept drift. The growing number of studies in the intersection of these areas warrants a closer look. We revise and update the taxonomies of the two areas proposed in the literature and argue that they are not sufficiently discriminative with respect to methods for prototype selection and prototype generation in the presence of concept drift. We proceed to create a bespoke taxonomy of these methods and illustrate it with ten examples from the literature. The new taxonomy can serve as a road-map for researching the intersection area and inform the development of new methods.
Large numbers of data streams are today generated in many fields. A key challenge when learning from such streams is the problem of concept drift. Many methods, including many prototype methods, have been proposed in recent years to address this problem. This paper presents a refined taxonomy of instance selection and generation methods for the classification of data streams subject to concept drift. The taxonomy allows discrimination among a large number of methods which pre-existing taxonomies for offline instance selection methods did not distinguish. This makes possible a valuable new perspective on experimental results, and provides a framework for discussion of the concepts behind different algorithm-design approaches. We review a selection of modern algorithms for the purpose of illustrating the distinctions made by the taxonomy. We present the results of a numerical experiment which examined the performance of a number of representative methods on both synthetic and real-world data sets with and without concept drift, and discuss the implications for the directions of future research in light of the taxonomy. On the basis of the experimental results, we are able to give recommendations for the experimental evaluation of algorithms which may be proposed in the future.
This article is aimed at two groups of readers. First, we present an interactive guide to pitch on the pedal harp for anyone wishing to teach or learn about harp pedaling and its associated pitch possibilities. We originally created this in response to a pedagogical need for such a resource in the teaching of composition and orchestration. Secondly, for composers and theorists seeking a more comprehensive understanding of what can be done on this unique instrument, we present a range of empirical-theoretical observations about the properties and prevalence of pitch structures on the pedal harp and the routes among them. This is particularly relevant to those interested in extended-tonal and atonal repertoires. A concluding section discusses prospective theoretical developments and analytical applications.