Cyberattacks represent an ever-growing threat that has become a real priority for most organizations. Attackers use sophisticated attack scenarios to deceive defense systems in order to access private data or cause harm. Machine Learning (ML) and Deep Learning (DL) have demonstrate impressive results for detecting cyberattacks due to their ability to learn generalizable patterns from flat data. However, flat data fail to capture the structural behavior of attacks, which is essential for effective detection. Contrarily, graph structures provide a more robust and abstract view of a system that is difficult for attackers to evade. Recently, Graph Neural Networks (GNNs) have become successful in learning useful representations from the semantic provided by graph-structured data. Intrusions have been detected for years using graphs such as network flow graphs or provenance graphs, and learning representations from these structures can help models understand the structural patterns of attacks, in addition to traditional features. In this survey, we focus on the applications of graph representation learning to the detection of network-based and hostbased intrusions, with special attention to GNN methods. For both network and host levels, we present the graph data structures that can be leveraged and we comprehensively review the state-of-the-art papers along with the used datasets. Our analysis reveals that GNNs are particularly efficient in cybersecurity, since they can learn effective representations without requiring any external domain knowledge. We also evaluate the robustness of these techniques based on adversarial attacks. Finally, we discuss the strengths and weaknesses of GNN-based intrusion detection and identify future research directions.INDEX TERMS Cyberattacks, cybersecurity, deep learning (DL), graph neural networks (GNNs), intrusion detection (IDS), machine learning (ML).
Malware detection has become a major concern due to the increasing number and complexity of malware. Traditional detection methods based on signatures and heuristics are used for malware detection, but unfortunately, they suffer from poor generalization to unknown attacks and can be easily circumvented using obfuscation techniques. In recent years, Machine Learning (ML) and notably Deep Learning (DL) achieved impressive results in malware detection by learning useful representations from data and have become a solution preferred over traditional methods. More recently, the application of such techniques on graph-structured data has achieved state-of-the-art performance in various domains and demonstrates promising results in learning more robust representations from malware. Yet, no literature review focusing on graph-based deep learning for malware detection exists. In this survey, we provide an in-depth literature review to summarize and unify existing works under the common approaches and architectures. We notably demonstrate that Graph Neural Networks (GNNs) reach competitive results in learning robust embeddings from malware represented as expressive graph structures, leading to an efficient detection by downstream classifiers. This paper also reviews adversarial attacks that are utilized to fool graph-based detection methods. Challenges and future research directions are discussed at the end of the paper. CCS Concepts: • General and reference → Surveys and overviews; • Security and privacy → Malware and its mitigation; • Computing methodologies → Learning latent representations; Neural networks; Spectral methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.