Assessing the predictive accuracy of black box classifiers is challenging in the absence of labeled test datasets. In these scenarios we may need to rely on a human oracle to evaluate individual predictions, which raises the challenge of designing query algorithms that guide the search toward the points most informative about the classifier's predictive characteristics. Previous works have focused on developing utility models and query algorithms for discovering unknown unknowns: misclassifications with a predictive confidence above some arbitrary threshold. However, if misclassifications occur at the rate reflected by the confidence values, then these search methods reveal nothing more than a proper assessment of predictive certainty. It is when the model's confidence in its predictions exceeds its actual accuracy that we are unable to properly mitigate the risks of model deficiency. We propose a facility location utility model and a corresponding greedy query algorithm that instead searches for overconfident unknown unknowns. Through robust empirical experiments we demonstrate that the greedy query algorithm with the facility location utility model consistently outperforms previous methods at discovering overconfident unknown unknowns.
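The abstract does not specify the algorithm's details, but a greedy search under a confidence-weighted facility location objective can be sketched roughly as follows. This is a minimal illustration, assuming a precomputed pairwise similarity matrix and per-point model confidences; the function name, weighting scheme, and objective form are assumptions, not the authors' exact method.

```python
import numpy as np

def greedy_facility_location(similarity, confidence, k):
    """Greedily pick k query points maximizing a confidence-weighted
    facility-location objective: sum_i c_i * max_{j in S} sim(i, j).

    similarity : (n, n) array of pairwise similarities
    confidence : (n,) array of model confidences (used as weights)
    k          : number of oracle queries to select
    """
    n = similarity.shape[0]
    selected = []
    coverage = np.zeros(n)  # best similarity of each point to the selected set
    for _ in range(k):
        # total weighted coverage if candidate j were added to the set
        totals = (confidence[:, None]
                  * np.maximum(similarity, coverage[:, None])).sum(axis=0)
        gains = totals - (confidence * coverage).sum()  # marginal gains
        gains[selected] = -np.inf                       # never re-select
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, similarity[:, best])
    return selected
```

Because the facility location objective is monotone submodular, this greedy loop carries the standard (1 - 1/e) approximation guarantee, which is presumably why a greedy query algorithm is a natural fit here.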
Data subset selection is a crucial task in deploying machine learning algorithms under strict constraints on memory and computation resources. Despite extensive research in this area, a practical difficulty is the lack of rigorous strategies for identifying the optimal size of the reduced data to regulate trade-offs between accuracy and efficiency. Furthermore, existing methods are often built around specific machine learning models, and translating existing theoretical results into practice is challenging for practitioners. To address these problems, we propose two adaptive compression algorithms for classification problems by formulating data subset selection as interactive teaching. The user interacts with the learning task to adapt to the unique structure of the problem at hand, yielding an iterative importance sampling scheme. We also propose to couple importance sampling with a diversity criterion to further control the evolution of the data summary over the rounds of interaction. We conduct extensive experiments on several datasets, including imbalanced and multiclass data, and various classification algorithms, such as ensemble learning and neural networks. Our results demonstrate the performance, efficiency, and ease of implementation of the underlying framework.
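One round of the importance sampling coupled with a diversity criterion could be sketched as below. This is a hypothetical illustration, not the paper's algorithm: the function name, the use of per-point losses as importance scores, and the distance-based diversity discount are all assumptions introduced for the example.

```python
import numpy as np

def adaptive_sample(X, losses, summary_idx, m, diversity_weight=0.5, rng=None):
    """One interaction round: sample m new points for the data summary.

    Sampling probability is proportional to the (assumed positive) loss of
    each point, discounted by its distance to the nearest point already in
    the summary, so redundant points near existing ones are down-weighted.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    summary_idx = np.asarray(summary_idx, dtype=int)
    if len(summary_idx):
        # distance of every point to its nearest already-selected point
        d = np.min(np.linalg.norm(X[:, None, :] - X[None, summary_idx, :],
                                  axis=2), axis=1)
    else:
        d = np.ones(len(X))
    scores = losses * (d ** diversity_weight)
    scores[summary_idx] = 0.0          # never resample chosen points
    p = scores / scores.sum()
    new = rng.choice(len(X), size=m, replace=False, p=p)
    return np.concatenate([summary_idx, new])
```

Repeating this round while the downstream classifier's validation accuracy still improves gives one plausible stopping rule for choosing the summary size adaptively, in the spirit of the trade-off the abstract describes.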