A natural way of handling imbalanced data is to attempt to equalise the class frequencies and train the classifier of choice on balanced data. For two-class imbalanced problems, the classification success is typically measured by the geometric mean (GM) of the true positive and true negative rates. Here we prove that GM can be improved upon by instance selection, and give the theoretical conditions for such an improvement. We demonstrate that GM is non-monotonic with respect to the number of retained instances, which discourages systematic instance selection. We also show that balancing the distribution frequencies is inferior to a direct maximisation of GM. To verify our theoretical findings, we carried out an experimental study of 12 instance selection methods for imbalanced data, using 66 standard benchmark data sets. The results reveal possible room for new instance selection methods for imbalanced data.
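As a minimal sketch of the measure discussed above, the snippet below computes GM as the square root of the product of the true positive rate and the true negative rate. The function name `geometric_mean`, the label encoding (1 for the positive class), and the toy labels are assumptions made for illustration; they are not taken from the paper.

```python
import math


def geometric_mean(y_true, y_pred, positive=1):
    """Geometric mean of the true positive and true negative rates."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against an empty class so the rate is defined as 0.0.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)


# Hypothetical imbalanced toy problem: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Always predicting the majority class gives 80% accuracy but GM = 0.
majority_only = [0] * 10
gm_majority = geometric_mean(y_true, majority_only)

# A classifier that finds one positive at the cost of two false alarms.
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
gm_mixed = geometric_mean(y_true, mixed)

print(gm_majority, gm_mixed)
```

The toy case shows why GM is preferred to accuracy for imbalanced problems: a classifier that ignores the minority class scores zero, however large the majority class is.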
The theoretical background to automata and formal languages is a complex learning area for students. Computer tools for interacting with an algorithm, and interfaces that visualize its different steps, can assist the learning process and make it more attractive. In this paper, we present a web application for learning some of the most common algorithms in an appealing way. These algorithms are specifically linked to the recognition of regular languages, which are taught in classes on both automata theory and compiler design. Although several simulators are available to students, they usually serve only to validate grammars, automata, and languages, rather than helping students to learn the internal processes that an algorithm performs. The resource presented here can execute and display each algorithm's process step by step, providing explanations at each step that assist student comprehension. Additionally, as a web-based resource, it can be used on any device with no need for specific software installation.
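To illustrate what a step-by-step display of a recognition algorithm might look like, here is a minimal sketch that runs a deterministic finite automaton over an input word and records every transition. The abstract does not say which algorithms the application covers, so the choice of DFA simulation, the function `trace_dfa`, and the example automaton `ENDS_WITH_AB` are all assumptions made for illustration.

```python
def trace_dfa(transitions, start, accepting, word):
    """Run a DFA on `word`, recording each step.

    Returns (accepted, steps), where each step is a tuple
    (position, current_state, symbol, next_state).
    """
    state = start
    steps = []
    for position, symbol in enumerate(word):
        next_state = transitions.get((state, symbol))
        steps.append((position, state, symbol, next_state))
        if next_state is None:
            # No transition defined: the word is rejected immediately.
            return False, steps
        state = next_state
    return state in accepting, steps


# Hypothetical DFA over {a, b} accepting words that end in "ab".
ENDS_WITH_AB = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q1", ("q2", "b"): "q0",
}

accepted, steps = trace_dfa(ENDS_WITH_AB, "q0", {"q2"}, "aab")
for position, state, symbol, next_state in steps:
    print(f"step {position}: {state} --{symbol}--> {next_state}")
print("accepted" if accepted else "rejected")
```

Returning the list of steps, rather than only the final verdict, is what lets a front end explain each transition to the student instead of merely validating the input.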