Abstract: Explainable Artificial Intelligence (XAI) is an emergent research field that tries to cope with the lack of transparency of AI systems by providing human-understandable explanations for the underlying Machine Learning models. This work presents a new explanation extraction method called LEAFAGE. Explanations are provided both in terms of feature importance and of similar classification examples; the latter is a well-known strategy for problem solving and justification in social science. LEAFAGE leverages on …
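The abstract's two explanation ingredients, similar training examples and feature importance, can be illustrated with a minimal sketch. The code below is not the LEAFAGE algorithm itself; it merely combines nearest-neighbour retrieval with a proximity-weighted linear surrogate (the dataset, function, and variable names are illustrative assumptions) to show how both kinds of output could be produced for a single query.

```python
# Minimal sketch of an example-based explanation in the spirit of LEAFAGE:
# (1) retrieve similar training examples from the same and the contrasting class,
# (2) fit a proximity-weighted linear surrogate to rank feature importance.
# Names and dataset are illustrative, not from the LEAFAGE paper.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

def explain(x, k=5):
    pred = black_box.predict(x.reshape(1, -1))[0]
    dists = np.linalg.norm(X - x, axis=1)
    same = np.argsort(np.where(y == pred, dists, np.inf))[:k]      # similar, same label
    contrast = np.argsort(np.where(y != pred, dists, np.inf))[:k]  # similar, other label
    # Local surrogate: weight training points by proximity to x, read off coefficients.
    weights = np.exp(-dists / (dists.std() + 1e-12))
    surrogate = LogisticRegression(max_iter=5000).fit(
        X, black_box.predict(X), sample_weight=weights)
    importance = surrogate.coef_[0] * x                             # rough local contribution
    return pred, same, contrast, importance

pred, same, contrast, importance = explain(X[0])
print("prediction:", pred)
print("most similar same-class examples:", same)
print("most similar contrasting examples:", contrast)
print("top features:", np.argsort(-np.abs(importance))[:3])
```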
“…Others try to produce explanations more grounded in the features of the data, such as a ranking of features important for the prediction or a decision-tree approximating the model's logic [16,33,46]. However, a growing body of work that has tried to empirically measure the efficacy of many of these methods has shown that they often do not actually affect or improve human decision-making [2,8,30,43], and in practice are primarily used for internal model debugging [6].…”
Section: Related Work, 2.1 Interpretability Methods for Human Understanding
Figure 1: An example of the proposed interface for an electrocardiogram (ECG) case study. The output of the machine learning model consists of raw and aggregate information about the input's nearest neighbors. With the editor in the top right, the user can apply meaningful manipulations to the input and see how the output changes.
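A rough sketch of the backend logic the caption describes, namely returning the raw nearest neighbours and an aggregate label summary for an input and re-computing them after a user edit, might look as follows. The data, feature semantics, and function names are synthetic placeholders, not the ECG case study's actual pipeline.

```python
# Hedged sketch of the caption's nearest-neighbour output: raw neighbours plus an
# aggregate label summary, recomputed after a "meaningful manipulation" of the input.
# Data and feature meanings are synthetic placeholders.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4))             # e.g. heart rate, QRS width, QT, PR (hypothetical)
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

nn = NearestNeighbors(n_neighbors=5).fit(X_train)

def neighbor_view(x):
    _, idx = nn.kneighbors(x.reshape(1, -1))
    idx = idx[0]
    raw = list(zip(idx.tolist(), y_train[idx].tolist()))            # raw neighbour info
    aggregate = {label: int(np.sum(y_train[idx] == label)) for label in (0, 1)}
    return raw, aggregate

x = rng.normal(size=4)
print("before edit:", neighbor_view(x))
x[0] += 1.0                                      # a user edit to one input feature
print("after edit: ", neighbor_view(x))
```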
“…2) CF model-agnostic interpretability methods: Among other model-agnostic approaches that were tested with reported results on tree ensemble models, we can mention the LORE [23], LEAFAGE [24], and CLEAR [25] approaches. The LORE approach uses local interpretable surrogates in order to derive sets of CF rules.…”
Section: B. CF Example-Based Interpretability Methods
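As a hedged illustration of the local-surrogate idea attributed to LORE in the quote above (perturb around the query, label the perturbations with the black box, fit a small interpretable tree, and read off a counterfactual rule), the following simplified sketch could be used. It is not the cited authors' implementation; the dataset, perturbation scheme, and hyperparameters are arbitrary choices made only for illustration.

```python
# Rough sketch of the local-surrogate idea behind CF rule methods such as LORE:
# perturb around the query, label the perturbations with the black box, fit a small
# decision tree, and read a counterfactual rule from a differently labelled region.
# This is a simplification, not the cited authors' algorithm.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

x = X[0]
pred = black_box.predict(x.reshape(1, -1))[0]

# Local neighbourhood around x, labelled by the black box.
Z = x + np.random.default_rng(0).normal(scale=0.5, size=(1000, X.shape[1]))
zy = black_box.predict(Z)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(Z, zy)

# Take the closest perturbed point with a different label and turn its surrogate
# path into a counterfactual rule.
cf_candidates = Z[zy != pred]
if len(cf_candidates):
    cf = cf_candidates[np.argmin(np.linalg.norm(cf_candidates - x, axis=1))]
    node_path = surrogate.decision_path(cf.reshape(1, -1)).indices
    tree = surrogate.tree_
    rule = [
        f"x[{tree.feature[n]}] {'<=' if cf[tree.feature[n]] <= tree.threshold[n] else '>'} {tree.threshold[n]:.2f}"
        for n in node_path if tree.children_left[n] != -1   # skip the leaf node
    ]
    print("counterfactual rule:", " AND ".join(rule))
```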
Explaining the decisions of complex machine learning models is becoming a necessity in many areas where trust in an ML model's decisions is essential to its accreditation and adoption by domain experts. The ability to explain model decisions also adds value, since it makes it possible to provide a diagnosis alongside the decision itself, which is highly valuable in scenarios such as fault or abnormality detection. Unfortunately, high-performance models do not exhibit the transparency needed to make their decisions fully understandable, and the black-box approaches used to explain such decisions lack the accuracy to trace back the exact cause of a decision for a given input: they cannot explicitly describe the decision regions of the model around that input, which would be necessary to say exactly what pushes the model towards one decision or the other. We therefore asked ourselves: is there a category of commonly used high-performance models for which the decision regions in the input feature space can be characterised explicitly and exactly in geometrical terms? Perhaps surprisingly, the answer is positive for any model in the category of tree ensembles, which encompasses a wide range of models dedicated to massive heterogeneous industrial data processing, such as XGBoost, CatBoost, LightGBM, and random forests. For these models, we derive an exact geometrical characterisation of the decision regions in the form of a collection of multidimensional intervals. This characterisation makes it straightforward to compute the optimal counterfactual (CF) example associated with a query point, as well as the geometrical characterisation of the entire decision region of the model containing the optimal CF example. We also demonstrate further possibilities of the approach, such as computing the CF example from only a subset of features and fixing the values of variables over which the user has no control; in general this yields more plausible explanations by integrating prior knowledge about the problem. A straightforward adaptation of the method to counterfactual reasoning on regression problems is also envisaged.
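The abstract's central object, decision regions expressed as collections of multidimensional intervals, is easiest to see for a single decision tree, where each leaf corresponds to one axis-aligned box and the nearest box with a different label yields a counterfactual example. The sketch below shows only this single-tree intuition with illustrative code (dataset and helper names are assumptions); the paper's actual contribution is the exact characterisation for tree ensembles, which this sketch does not reproduce.

```python
# Single-tree intuition for the abstract's idea: each leaf's decision region is a
# multidimensional interval (axis-aligned box), and projecting the query onto the
# nearest box with a different predicted label gives a counterfactual example.
# Illustration only; the cited work handles full tree ensembles.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
t = clf.tree_

def leaf_boxes(node=0, lo=None, hi=None):
    """Enumerate (lower bounds, upper bounds, label) boxes, one per leaf."""
    d = X.shape[1]
    lo = np.full(d, -np.inf) if lo is None else lo
    hi = np.full(d, np.inf) if hi is None else hi
    if t.children_left[node] == -1:                       # leaf node
        yield lo, hi, int(np.argmax(t.value[node]))
        return
    f, thr = t.feature[node], t.threshold[node]
    left_hi = hi.copy(); left_hi[f] = min(hi[f], thr)     # left child: x[f] <= thr
    right_lo = lo.copy(); right_lo[f] = max(lo[f], thr)   # right child: x[f] > thr
    yield from leaf_boxes(t.children_left[node], lo, left_hi)
    yield from leaf_boxes(t.children_right[node], right_lo, hi)

def nearest_counterfactual(x):
    """Project x onto the closest box whose label differs from the prediction."""
    pred = clf.predict(x.reshape(1, -1))[0]
    best, best_d = None, np.inf
    for lo, hi, label in leaf_boxes():
        if label == pred:
            continue
        cf = np.clip(x, lo + 1e-6, hi)                    # closest point inside the box
        d = np.linalg.norm(cf - x)
        if d < best_d:
            best, best_d = cf, d
    return best, best_d

cf, dist = nearest_counterfactual(X[0])
print("query:", X[0], "-> counterfactual:", cf, "at distance", round(dist, 3))
```

The small epsilon when clipping keeps the projected point strictly inside boxes whose lower bounds are open (the "greater than threshold" side of a split); how ties on region boundaries are handled is an implementation detail this sketch glosses over.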
“…It was important to evaluate the visualization from the perspective of the end-users, an aspect often brushed over in XAI studies [56,59]. The explanation was specifically designed for DNA experts within the context of NOC estimation, so we invited DNA experts from the NFI to participate in a user study.…”