2019 | Preprint
DOI: 10.48550/arxiv.1901.04592
Interpretable machine learning: definitions, methods, and applications

W. James Murdoch, Chandan Singh, Karl Kumbier, et al.

Abstract: Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related, and what common concepts can be…


Cited by 90 publications (126 citation statements)
References 66 publications
“…Explainability and interpretability are topics of growing interest in the machine learning community [Ribeiro et al, 2016, Lundberg and Lee, 2017, Adadi and Berrada, 2018, Rudin, 2019, Murdoch et al, 2019, Molnar, 2020]. While there has been some focus on what Dasgupta et al [2020] calls post-modeling explainability, or the ability to explain the output of a black-box model [Ribeiro et al, 2016, Lundberg and Lee, 2017, Kauffmann et al, 2019], the practice has also been criticized in contrast with pre-modelling explainability, or the use of interpretable models to begin with [Rudin, 2019].…”
Section: Preliminaries and Problem Definition
Citation type: mentioning, confidence: 99%
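The post-modeling explainability described in the statement above is usually realized by probing a fitted black-box model from the outside. A minimal sketch of one such post-hoc probe, permutation importance; the dataset, model, and all parameter choices are illustrative assumptions, not taken from the cited papers:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Illustrative data and model; any fitted estimator works here.
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0
)

# The "black box": an ensemble whose internals are hard to read directly.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Post-hoc probe: permute each feature on held-out data and measure the
# drop in accuracy; large drops flag features the model relies on.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
for i in result.importances_mean.argsort()[::-1][:5]:
    print(f"{data.feature_names[i]}: {result.importances_mean[i]:.3f}")
```

Note that such probes explain the model's behavior, not the underlying data-generating process, which is precisely the limitation Rudin [2019] raises against post-modeling approaches.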
“…Deep learning systems have been adopted in many areas from medicine to autonomous driving (Ahmad et al, 2018; Claybrook and Kildare, 2018) and as these algorithms are incorporated, the need for explainable and transparent models becomes more urgent. One approach researchers use to overcome the inherent ambiguity of these black-box methods is to develop additional models to learn and explain the decisions of existing models, analyze when these models fail, and introduce a human-in-the-loop component to improve performance (Ribeiro et al, 2016; Murdoch et al, 2019; Poursabzi-Sangdeh et al, 2018). Another way to tackle this challenge is to design models with a goal of interpretability in place when development starts (Ridgeway et al, 1998; Rudin, 2018; Gilpin et al, 2018; Lahav et al, 2018; Hooker et al, 2019).…”
Section: Related Work
Citation type: mentioning, confidence: 99%
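The "additional models to learn and explain the decisions of existing models" idea in the statement above can be illustrated with a global surrogate: an interpretable tree fit to the black box's own predictions. A minimal sketch, assuming scikit-learn; the dataset, the choice of black box, and the tree depth are illustrative assumptions:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()

# An existing model treated as a black box.
black_box = GradientBoostingClassifier(random_state=0)
black_box.fit(data.data, data.target)

# Surrogate: train a shallow tree on the black box's *predictions* rather
# than the true labels, so it approximates the model's decision function.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(data.data, black_box.predict(data.data))

# Fidelity: how often the surrogate agrees with the black box.
fidelity = (surrogate.predict(data.data) == black_box.predict(data.data)).mean()
print(f"surrogate fidelity: {fidelity:.2f}")
print(export_text(surrogate, feature_names=list(data.feature_names)))
```

A high-fidelity surrogate yields human-readable rules, but the explanation is only as trustworthy as the surrogate's agreement with the black box on the data of interest.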
“…While post-modeling explainability focuses on giving reasoning behind decisions made by black box models, pre-modeling explainability deals with ML systems that are inherently understandable or perceivable by humans. One of the canonical approaches to pre-modelling explainability builds on decision trees [35, 37]. In fact, a significant amount of work on explainable clustering is based on unsupervised decision trees [3, 17, 20, 21, 29, 36].…”
Section: Introduction
Citation type: mentioning, confidence: 99%
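For the tree-based explainable clustering mentioned above, one simple pattern is to cluster first and then fit a tree constrained to one leaf per cluster, so each cluster is described by a short axis-aligned rule path. A minimal sketch, assuming scikit-learn and its Iris toy dataset; this supervised-tree construction is only an analogy to the unsupervised decision trees in the cited work, not a reimplementation of it:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()

# Step 1: an ordinary (opaque) clustering.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(data.data)

# Step 2: a tree capped at k leaves gives one threshold-rule path per
# cluster, in the spirit of the threshold trees of Dasgupta et al [2020].
tree = DecisionTreeClassifier(max_leaf_nodes=3, random_state=0)
tree.fit(data.data, labels)
print(export_text(tree, feature_names=list(data.feature_names)))
```

The printed tree reads as a flat list of threshold rules, which is exactly the "inherently understandable" form of explanation that pre-modeling approaches aim for.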