Classification models are often used to make decisions that affect humans: whether to approve a loan application, extend a job offer, or provide insurance. In such applications, individuals should have the ability to change the decision of the model. When a person is denied a loan by a credit scoring model, for example, they should be able to change the input variables of the model in a way that will guarantee approval. Otherwise, this person will be denied the loan so long as the model is deployed, and -more importantlywill lack agency over a decision that affects their livelihood.In this paper, we propose to audit a linear classification model in terms of recourse, which we define as the ability of a person to change the decision of the model through actionable input variables (e.g., income vs. gender, age, or marital status). We present an integer programming toolkit to: (i) measure the feasibility and difficulty of recourse in a target population; and (ii) generate a list of actionable changes for an individual to obtain a desired outcome. We demonstrate how our tools can inform practitioners, policymakers, and consumers by auditing credit scoring models built using real-world datasets. Our results illustrate how recourse can be significantly impacted by common modeling practices, and motivate the need to guarantee recourse as a policy objective for regulation in algorithmic decision-making. arXiv:1809.06514v1 [stat.ML]
The new ADHD screening scale is short, easily scored, detects the vast majority of general population cases at a threshold that also has high specificity and PPV, and could be used as a screening tool in specialty treatment settings.
Scoring systems are linear classification models that only require users to add, subtract and multiply a few small numbers in order to make a prediction. These models are in widespread use by the medical community, but are difficult to learn from data because they need to be accurate and sparse, have coprime integer coefficients, and satisfy multiple operational constraints. We present a new method for creating data-driven scoring systems called a Supersparse Linear Integer Model (SLIM). SLIM scoring systems are built by using an integer programming problem that directly encodes measures of accuracy (the 0-1 loss) and sparsity (the 0 -seminorm) while restricting coefficients to coprime integers. SLIM can seamlessly incorporate a wide range of operational constraints related to accuracy and sparsity, and can produce acceptable models without parameter tuning because of the direct control provided over these quantities. We provide bounds on the testing and training accuracy of SLIM scoring systems, and present a new data reduction technique that can improve scalability by eliminating a portion of the training data beforehand. Our paper includes results from a collaboration with the Massachusetts General Hospital Sleep Laboratory, where SLIM is being used to create a highly tailored scoring system for sleep apnea screening. Electronic supplementary material The online version of this article
These authors contributed equally to this work. Summary.We investigate a long-debated question, which is how to create predictive models of recidivism that are sufficiently accurate, transparent, and interpretable to use for decision-making. This question is complicated as these models are used to support different decisions, from sentencing, to determining release on probation, to allocating preventative social services. Each case might have an objective other than classification accuracy, such as a desired true positive rate (TPR) or false positive rate (FPR). Each (TPR, FPR) pair is a point on the receiver operator characteristic (ROC) curve. We use popular machine learning methods to create models along the full ROC curve on a wide range of recidivism prediction problems. We show that many methods (SVM, SGB, Ridge Regression) produce equally accurate models along the full ROC curve. However, methods that designed for interpretability (CART, C5.0) cannot be tuned to produce models that are accurate and/or interpretable. To handle this shortcoming, we use a recent method called Supersparse Linear Integer Models (SLIM) to produce accurate, transparent, and interpretable scoring systems along the full ROC curve. These scoring systems can be used for decision-making for many different use cases, since they are just as accurate as the most powerful black-box machine learning models for many applications, but completely transparent, and highly interpretable.
IMPORTANCE Continuous electroencephalography (EEG) use in critically ill patients is expanding. There is no validated method to combine risk factors and guide clinicians in assessing seizure risk.OBJECTIVE To use seizure risk factors from EEG and clinical history to create a simple scoring system associated with the probability of seizures in patients with acute illness.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations鈥揷itations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright 漏 2024 scite LLC. All rights reserved.
Made with 馃挋 for researchers
Part of the Research Solutions Family.