Graph Neural Networks (GNNs) are a popular approach for predicting graph-structured data. As GNNs tightly entangle the input graph into the neural network structure, common explainable AI approaches are not applicable. To a large extent, GNNs have remained black boxes for the user so far. In this paper, we show that GNNs can in fact be naturally explained using higher-order expansions, i.e., by identifying groups of edges that jointly contribute to the prediction. Practically, we find that such explanations can be extracted using a nested attribution scheme, where existing techniques such as layer-wise relevance propagation (LRP) can be applied at each step. The output is a collection of walks into the input graph that are relevant for the prediction. Our novel explanation method, which we denote by GNN-LRP, is applicable to a broad range of graph neural networks and lets us extract practically relevant insights on sentiment analysis of text data, structure-property relationships in quantum chemistry, and image classification.
Index Terms: graph neural networks, higher-order explanations, layer-wise relevance propagation, explainable machine learning.
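To make the nested attribution scheme described above concrete, the following is a minimal sketch, not the authors' implementation: it scores the walks of a toy linear message-passing GNN, for which the decomposition of the prediction into walk contributions is exact, whereas GNN-LRP extends this idea to nonlinear GNNs by applying LRP rules at each layer. All names (walk_relevance, A, H0, Ws, out_w) are illustrative assumptions.

# Minimal illustrative sketch (not the authors' code): walk-wise decomposition
# of a toy *linear* message-passing GNN prediction; names are hypothetical.
import itertools
import numpy as np

def walk_relevance(A, H0, Ws, out_w, cls):
    """Relevance of every walk (v_0, ..., v_T) for class `cls`, for a linear
    GNN H_t = A H_{t-1} W_t with readout sum_i H_T[i] @ out_w[:, cls]."""
    n, T = A.shape[0], len(Ws)
    scores = {}
    for walk in itertools.product(range(n), repeat=T + 1):
        h = H0[walk[0]]                        # features entering at the walk's first node
        r = 1.0
        for t, W in enumerate(Ws):
            r *= A[walk[t + 1], walk[t]]       # edge traversed by the walk at layer t
            h = h @ W
        scores[walk] = r * float(h @ out_w[:, cls])
    return scores

# Tiny example: 3-node path graph, two message-passing layers, two classes.
rng = np.random.default_rng(0)
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
H0 = rng.normal(size=(3, 4))
Ws = [rng.normal(size=(4, 4)), rng.normal(size=(4, 4))]
out_w = rng.normal(size=(4, 2))

scores = walk_relevance(A, H0, Ws, out_w, cls=0)

# Sanity check: in the linear case, walk relevances sum to the model output.
H = H0
for W in Ws:
    H = A @ H @ W
assert np.isclose(sum(scores.values()), float(H.sum(axis=0) @ out_w[:, 0]))

print(sorted(scores.items(), key=lambda kv: -abs(kv[1]))[:3])  # most relevant walks

In this simplified linear setting, summing all walk scores recovers the model output exactly, mirroring the conservation property that LRP aims to maintain layer by layer in the general nonlinear case.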
INTRODUCTION

Many interesting structures found in scientific and industrial applications can be expressed as graphs. Examples are lattices in fluid modeling, molecular geometry, biological interaction networks, or social and historical networks. Graph neural networks (GNNs) [1], [2] have been proposed as a method to learn from observations on general graph structures and have found use in an ever-growing number of applications [3]-[8]. While GNNs make useful predictions, they typically act as black boxes, and it has neither been directly possible (1) to extract novel insight from the learned model nor (2) to verify that the model has made the intended use of the graph structure, e.g., that it has avoided Clever Hans phenomena [9].

Explainable AI (XAI) is an emerging research area that aims to extract interpretable insights from trained ML models [10], [11]. So far, research has focused, for example, on full black-box models [12], [13], self-explainable models [14], [15], or deep neural networks [16], where in all cases the prediction can be attributed to the input features. For a GNN, however, the graph being received as input is deeply