In recent years, graph neural networks (GNNs) have emerged as a powerful neural architecture for learning vector representations of nodes and graphs in a supervised, end-to-end fashion. Up to now, GNNs have only been evaluated empirically, showing promising results. The following work investigates GNNs from a theoretical point of view and relates them to the 1-dimensional Weisfeiler-Leman graph isomorphism heuristic (1-WL). We show that GNNs have the same expressiveness as the 1-WL in terms of distinguishing non-isomorphic (sub-)graphs. Hence, both algorithms also have the same shortcomings. Based on this, we propose a generalization of GNNs, so-called k-dimensional GNNs (k-GNNs), which can take higher-order graph structures at multiple scales into account. These higher-order structures play an essential role in the characterization of social networks and molecule graphs. Our experimental evaluation confirms our theoretical findings and shows that higher-order information is useful for graph classification and regression.

... completing the equivalence. Since the power of the 1-WL has been completely characterized, see, e.g., (Arvind et al. 2015; Kiefer, Schweitzer, and Selman 2015), we can transfer these results to the case of GNNs, showing that both approaches have the same shortcomings.

Going further, we leverage these theoretical relationships to propose a generalization of GNNs, called k-GNNs: neural architectures based on the k-dimensional WL algorithm (k-WL) that are strictly more powerful than GNNs. The key insight in these higher-dimensional variants is that they perform message passing directly between subgraph structures, rather than individual nodes. This higher-order form of message passing can capture structural information that is not visible at the node level.

Graph kernels based on the k-WL have been proposed in the past (Morris, Kersting, and Mutzel 2017).
However, a key advantage of implementing higher-order message passing in GNNs, which we demonstrate here, is that we can design hierarchical variants of k-GNNs, which combine graph representations learned at different granularities in an end-to-end trainable framework. Concretely, in the presented hierarchical approach the initial messages in a k-GNN are based on the output of a lower-dimensional k'-GNN (with k' < k), which allows the model to effectively capture graph structures of varying granularity. Many real-world graphs exhibit a hierarchical structure (e.g., in a social network we must model both the ego-networks around individual nodes and the coarse-grained relationships between entire communities; see, e.g., Newman 2003), and our experimental results demonstrate that these hierarchical k-GNNs consistently outperform traditional GNNs on a variety of graph classification and regression tasks. Across twelve graph regression tasks from the QM9 benchmark, we find that our hierarchical model reduces the mean absolute error by 54.45% on average. For graph classification, we find that our hierarchical models...
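
The shared shortcoming of the 1-WL and GNNs mentioned above can be illustrated with a small Python sketch. It is not the implementation used in this work: the function names `wl_histogram` and `triple_types`, the fixed number of rounds, and the two example graphs are assumptions chosen for illustration. The first function runs plain 1-WL color refinement on an adjacency-list graph. The second only counts the isomorphism types of induced 3-node subgraphs, which loosely corresponds to an initial coloring over 3-element node sets rather than to a full k-GNN.

```python
from collections import Counter
from itertools import combinations


def wl_histogram(adj, rounds=3):
    """Run 1-WL color refinement and return the histogram of final colors.

    Colors are nested tuples, so histograms of different graphs are
    directly comparable (only practical for tiny graphs and few rounds).
    """
    colors = {v: () for v in adj}  # uniform initial coloring
    for _ in range(rounds):
        # New color = (old color, sorted multiset of neighbor colors).
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())


def triple_types(adj):
    """Histogram of edge counts over all induced 3-node subgraphs.

    On three nodes, the edge count determines the isomorphism type.
    """
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    return Counter(
        sum(frozenset(pair) in edges for pair in combinations(triple, 2))
        for triple in combinations(sorted(adj), 3)
    )


# Two disjoint triangles versus a single 6-cycle: both graphs are 2-regular.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

same_under_1wl = wl_histogram(two_triangles) == wl_histogram(hexagon)
same_triples = triple_types(two_triangles) == triple_types(hexagon)
```

Because every node in both graphs has exactly two neighbors, node-level refinement never separates them, so `same_under_1wl` is true even though the graphs are not isomorphic. The triple-level histograms differ (only the first graph contains triangles), which is the kind of structural information that is not visible at the node level.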