We present RegressionExplorer, a Visual Analytics tool for the interactive exploration of logistic regression models. Our application domain is Clinical Biostatistics, where models are derived from patient data with the aim to obtain clinically meaningful insights and consequences. Development and interpretation of a proper model requires domain expertise and insight into model characteristics. Because of time constraints, often a limited number of candidate models is evaluated. RegressionExplorer enables experts to quickly generate, evaluate, and compare many different models, taking the workflow for model development as starting point. Global patterns in parameter values of candidate models can be explored effectively. In addition, experts are enabled to compare candidate models across multiple subpopulations. The insights obtained can be used to formulate new hypotheses or to steer model development. The effectiveness of the tool is demonstrated for two uses cases: prediction of a cardiac conduction disorder in patients after receiving a heart valve implant and prediction of hypernatremia in critically ill patients.
Businesses in high-risk environments have been reluctant to adopt modern machine learning approaches due to their complex and uninterpretable nature. Most current solutions provide local, instance-level explanations, but this is insufficient for understanding the model as a whole. In this work, we show that strategy clusters (i.e., groups of data instances that are treated distinctly by the model) can be used to understand the global behavior of a complex ML model. To support effective exploration and understanding of these clusters, we introduce STRATEGYATLAS, a system designed to analyze and explain model strategies. Furthermore, it supports multiple ways to utilize these strategies for simplifying and improving the reference model. In collaboration with a large insurance company, we present a use case in automatic insurance acceptance, and show how professional data scientists were enabled to understand a complex model and improve the production model based on these insights.
Fig. 1: A screenshot of our STBins prototype that shows Flickr user activities in 7 hours during the anti-austerity movement in Spain named 15M: (a) the main view that presents Flickr user behavior, (b) the control panel that allows control of time flow and visual design, and (c) the data view that presents details of selection in the main view.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.