Beau Coker scite author profile

Beau Coker

5 Publications

89 Citation Statements Received

58 Citation Statements Given

Affiliations

Harvard University

Publications

The Age of Secrecy and Unfairness in Recidivism Prediction

Rudin¹, Wang², Coker³

2020

Harvard Data Science Review

and ultimately stagnating because there is no clear definition of fairness and competing definitions are largely incompatible. We argue that the focus on the question of fairness is misplaced, as these algorithms fail to meet a more important and yet readily obtainable goal: transparency. As a result, creators of secret algorithms can provide incomplete or misleading descriptions about how their models work, and various other kinds of errors can easily go unnoticed. By trying to partially reconstruct the COMPAS model, a recidivism-risk scoring model used throughout the criminal justice system, we show that it does not seem to depend linearly on the defendant's age, despite statements to the contrary by the model's creator. This observation has not been made before despite many recently published papers on this algorithm. Furthermore, by subtracting from COMPAS its (hypothesized) nonlinear age component, we show that COMPAS does not necessarily depend on race, contradicting ProPublica's analysis, which assumed linearity in age. In other words, faulty assumptions about a proprietary algorithm lead to faulty conclusions that go unchecked. Were the algorithm transparent in the first place, this likely would

The age of secrecy and unfairness in recidivism prediction

Rudin¹, Wang², Coker³

2018

Preprint

A Theory of Statistical Inference for Ensuring the Robustness of Scientific Results

2021

Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the resulting uncertainty. Any one theory of inference is neither right nor wrong but merely an axiom that may or may not be useful. Each of the many diverse theories of inference can be valuable for certain applications. However, no existing theory of inference addresses the tendency to choose, from the range of plausible data analysis specifications consistent with prior evidence, those that inadvertently favor one’s own hypotheses. Because the biases from these choices are a growing concern across scientific fields, and in a sense the reason the scientific community was invented in the first place, we introduce a new theory of inference designed to address this critical problem. We introduce hacking intervals, which are the range of a summary statistic one may obtain given a class of possible endogenous manipulations of the data. Hacking intervals require no appeal to hypothetical data sets drawn from imaginary superpopulations. A scientific result with a small hacking interval is more robust to researcher manipulation than one with a larger interval and is often easier to interpret than a classical confidence interval. Some versions of hacking intervals turn out to be equivalent to classical confidence intervals, which means they may also provide a more intuitive and potentially more useful interpretation of classical confidence intervals. This paper was accepted by J. George Shanthikumar, big data analytics.
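As a rough illustration of the hacking-interval idea described in this abstract, the following is a minimal sketch, not the paper's method. It assumes one hypothetical class of manipulations (dropping up to `max_dropped` observations) and one summary statistic (the sample mean), and finds the range by brute-force enumeration. The function name `hacking_interval` and the sample data are made up for illustration.

```python
from itertools import combinations
from statistics import mean


def hacking_interval(data, max_dropped):
    """Return (lowest, highest) sample mean reachable by dropping
    at most `max_dropped` observations from `data`.

    Illustrative assumption: the "class of manipulations" is simply
    the set of ways to discard up to `max_dropped` points.
    """
    n = len(data)
    if not 0 <= max_dropped < n:
        raise ValueError("max_dropped must be in [0, len(data) - 1]")

    reachable = []
    for k in range(max_dropped + 1):
        # Enumerate every set of k indices that could be dropped.
        for dropped in combinations(range(n), k):
            dropped_set = set(dropped)
            kept = [x for i, x in enumerate(data) if i not in dropped_set]
            reachable.append(mean(kept))
    return min(reachable), max(reachable)


if __name__ == "__main__":
    sample = [2.0, 4.0, 6.0, 8.0, 30.0]  # made-up data
    low, high = hacking_interval(sample, max_dropped=1)
    print(f"mean can range from {low} to {high}")
```

A narrow interval under this assumed class would suggest the mean is hard to move by selectively discarding points; a wide one suggests the result is sensitive to that choice. Brute force is only practical for small samples and small `max_dropped`.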

Learning a Latent Space of Highly Multidimensional Cancer Data

Kompa, Coker

2019

We introduce a Unified Disentanglement Network (UFDN) trained on The Cancer Genome Atlas (TCGA), which we refer to as UFDN-TCGA. We demonstrate that UFDN-TCGA learns a biologically relevant, low-dimensional latent space of high-dimensional gene expression data by applying our network to two classification tasks of cancer status and cancer type. UFDN-TCGA performs comparably to random forest methods. The UFDN allows for continuous, partial interpolation between distinct cancer types. Furthermore, we perform an analysis of differentially expressed genes between skin cutaneous melanoma (SKCM) samples and the same samples interpolated into glioblastoma (GBM). We demonstrate that our interpolations consist of relevant metagenes that recapitulate known glioblastoma mechanisms.

Broader Issues Surrounding Model Transparency in Criminal Justice Risk Scoring

Rudin, Wang², Coker³

2020

Harvard Data Science Review
