Teaching people clever heuristics is a promising approach to improving decision-making under uncertainty. The theory of resource rationality makes it possible to leverage machine learning to discover optimal heuristics automatically. One bottleneck of this approach is that the resulting decision strategies are only as good as the model of the decision problem to which the machine learning methods were applied. This is problematic because even domain experts cannot give complete and fully accurate descriptions of the decisions they face. To address this problem, we develop strategy discovery methods that are robust to potential inaccuracies in the description of the scenarios in which people will use the discovered decision strategies. The basic idea is to derive the strategy that will perform best in expectation across all possible real-world problems that could have given rise to the potentially erroneous description a domain expert provided. To achieve this, our method uses a probabilistic model of how the description of a decision problem might be corrupted by biases in human judgment and memory. Our method uses this model to perform Bayesian inference on which real-world scenarios might have given rise to the provided descriptions. We applied our Bayesian approach to robust strategy discovery in two domains: planning and risky choice. In both applications, we find that our approach is more robust to errors in the description of the decision problem, and that teaching the strategies it discovers significantly improves human decision-making in scenarios where approaches that ignore the risk of an incorrect description are ineffective or even harmful. The methods developed in this article are an important step towards leveraging machine learning to improve human decision-making in the real world because they tackle the problem that descriptions of real-world decision problems are fundamentally uncertain.
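The core idea above can be made concrete with a minimal sketch. This is not the paper's implementation: the scenarios, the noise model (an expert who sometimes misreports a high-risk problem as low-risk), and the utility values are all hypothetical, chosen only to illustrate how posterior-expected utility under Bayes' rule can favor a different strategy than naively trusting the description would.

```python
# Illustrative sketch: robust strategy selection via Bayesian inference
# over which true scenario produced a noisy expert description.
# All scenarios, probabilities, and utilities are hypothetical.

scenarios = ["low_risk", "high_risk"]
prior = {"low_risk": 0.5, "high_risk": 0.5}

# p(description = "low_risk" | true scenario): the expert's report
# can be corrupted by biases in judgment and memory
likelihood = {
    "low_risk": 0.8,   # correct report
    "high_risk": 0.3,  # high-risk problem misremembered as low-risk
}

# Expected payoff of each candidate strategy in each true scenario
utility = {
    "aggressive":   {"low_risk": 10.0, "high_risk": -20.0},
    "conservative": {"low_risk": 4.0,  "high_risk": 3.0},
}

# Posterior over true scenarios given the description (Bayes' rule)
evidence = sum(prior[s] * likelihood[s] for s in scenarios)
posterior = {s: prior[s] * likelihood[s] / evidence for s in scenarios}

# Robust choice: maximize posterior-expected utility
def expected_utility(strategy):
    return sum(posterior[s] * utility[strategy][s] for s in scenarios)

best = max(utility, key=expected_utility)
```

Under these numbers, taking the description at face value would favor the aggressive strategy, while the posterior-weighted expectation favors the conservative one, which is the kind of robustness the abstract describes.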
Machine learning (ML) is becoming a standard tool in neuroscience and neuroimaging research. Yet, because it is such a powerful tool, the appropriate application of ML requires a sound understanding of its subtleties and limitations. In particular, applying ML to datasets with imbalanced classes, which are very common in neuroscience, can have severe consequences if not adequately addressed. With the neuroscience machine-learning user in mind, this technical note provides a didactic overview of the class imbalance problem and illustrates its impact through systematic manipulation of class imbalance ratios in both simulated data and real electroencephalography (EEG) and magnetoencephalography (MEG) brain data. Our results illustrate how, in highly imbalanced data, the commonly used Accuracy (Acc) metric yields misleadingly high performance by preferentially predicting the majority class, while other evaluation metrics (e.g., Balanced Accuracy (BAcc) and the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC)) may still provide reliable performance evaluations. In terms of classifiers and cross-validation schemes, our data highlight the higher robustness of Random Forest (RF) and Stratified K-Fold cross-validation compared to the other approaches tested. Critically, for neuroscience ML applications that seek to minimize overall classification error (not preferentially that of a single class), we recommend the routine use of BAcc rather than the simpler and more commonly used Acc metric. Importantly, we provide a best-practices list of recommendations for dealing with imbalanced data, and open-source code to allow the neuroscience community to replicate our observations and further explore best practices in handling imbalanced data.
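The Acc-versus-BAcc pitfall described above can be demonstrated in a few lines with scikit-learn. This is a minimal sketch, not the note's own code: the 95:5 imbalance ratio and the degenerate majority-class predictor are illustrative assumptions used to show how Acc rewards ignoring the minority class while BAcc exposes chance-level performance.

```python
# Minimal sketch: Accuracy vs. Balanced Accuracy on imbalanced classes.
# The 95:5 ratio and the majority-class predictor are illustrative.
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# 950 samples of the majority class (0), 50 of the minority class (1)
y_true = np.concatenate([np.zeros(950, dtype=int), np.ones(50, dtype=int)])

# A degenerate "classifier" that always predicts the majority class
y_pred = np.zeros_like(y_true)

acc = accuracy_score(y_true, y_pred)             # 0.95: looks excellent
bacc = balanced_accuracy_score(y_true, y_pred)   # 0.50: chance level
print(f"Acc = {acc:.2f}, BAcc = {bacc:.2f}")
```

Because BAcc averages the per-class recalls (here 1.0 for the majority class and 0.0 for the minority class), it reports the chance-level 0.50 that Acc hides.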