As AI systems see increasing use, the need for explanation systems grows. Building an explanation system requires a definition of explanation, yet the natural-language term "explanation" is difficult to define formally because it spans multiple perspectives from domains such as psychology, philosophy, and cognitive science. We study these perspectives and the aspects of explainability of recommendations and predictions made by AI systems, and provide a generic definition of explanation. The proposed definition is ambitious and challenging to apply. To bridge the gap between theory and application, we also propose a possible architecture for an automated explanation system based on our definition.
Generating explanations for the predictions of machine learning models is a difficult task, especially for black-box models. One way to explain an individual decision or recommendation for a given instance is to build an interpretable local surrogate of the underlying black-box model in the vicinity of that instance. This approach has been adopted by many algorithms, for example LIME and LEAFAGE. These algorithms suffer from shortcomings, including strict assumptions and prerequisites, which not only limit their applicability but also reduce the fidelity of their local approximations to the black box. We present ways to overcome these shortcomings, including a revised definition of the neighborhood, the removal of prerequisites, and dropping the assumption of linearity in the local model. The main contribution of this paper is a novel algorithm (LEMP) that explains a given instance by building a surrogate model trained on perturbations generated in the neighborhood of that instance. Experiments show that our approach is more widely applicable and generates interpretable models with better fidelity to the underlying black-box model than previous algorithms.
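To make the general perturbation-based surrogate approach concrete, the sketch below fits an interpretable local model around a single instance. It is a minimal illustration of the family of methods the abstract describes, not LEMP's actual procedure: the Gaussian perturbation scheme, the proximity kernel, the `black_box_predict` callable, and the choice of a shallow decision tree as the surrogate are all assumptions made here for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def local_surrogate(black_box_predict, instance, n_samples=1000,
                    scale=0.3, kernel_width=1.0, seed=0):
    """Fit an interpretable local surrogate around `instance`.

    `black_box_predict` is assumed to map an (n, d) array to n
    predictions. The perturbation scheme, kernel, and surrogate
    choice here are illustrative, not LEMP's exact procedure.
    """
    rng = np.random.default_rng(seed)
    # Generate perturbations in the neighborhood of the instance
    # (hypothetical choice: isotropic Gaussian noise around it).
    neighborhood = instance + rng.normal(0.0, scale,
                                         size=(n_samples, instance.size))
    # Label the perturbations with the black-box model.
    labels = black_box_predict(neighborhood)
    # Weight samples by proximity to the instance (Gaussian kernel),
    # so the surrogate prioritizes local fidelity.
    dists = np.linalg.norm(neighborhood - instance, axis=1)
    weights = np.exp(-(dists ** 2) / (kernel_width ** 2))
    # Fit a non-linear yet interpretable surrogate: a shallow tree
    # avoids the linearity assumption of a LIME-style linear model.
    surrogate = DecisionTreeRegressor(max_depth=3)
    surrogate.fit(neighborhood, labels, sample_weight=weights)
    return surrogate
```

Using a shallow tree rather than a weighted linear model is one way to drop the linearity assumption while keeping the surrogate readable; the tree's split rules near the instance can then be presented as the explanation.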