Vulnerability Prediction From Source Code Using Machine Learning

Bilgin, Zeki; Ersoy, Mehmet Akif; Soykan, Elif Üstündağ; Tomur, Emrah; Çomak, Pınar; Karaçay, Leyli

doi:10.1109/access.2020.3016774

Cited by 56 publications

(41 citation statements)

References 21 publications


“…The application of data-driven techniques in security assurance may provide a promising solution in automated and intelligent security analysis, including vulnerability identification, code classification, vulnerability prediction, code summarizing, and clone detection. In the literature, limited research work can be found in this area [9,61,91].…”

Section: Data Driven Security Assurance Methods (mentioning)

confidence: 99%

“…In the literature, machine learning methods are also applied to code complexities, code churn, token frequency, developer activities, etc., to detect vulnerabilities towards enhancing security assurance. Bilgin et al [9] used the machine learning technique to predict the vulnerability of the software from source code before its release. This work also includes developing a source code representation method, intelligently analyzing the abstract syntax tree (AST) form of the source code, and then verifying whether ML can be applied to distinguish between vulnerable and non-vulnerable code fragments.…”

Section: Vulnerability Detection/prediction Approach (mentioning)

confidence: 99%
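
The statement above describes deriving a representation of source code from its AST before applying ML. As a rough illustration only, and not the representation developed by Bilgin et al. (the excerpt does not specify it), the sketch below uses Python's standard `ast` module to flatten a code fragment into a pre-order sequence of node labels that a classifier could later consume. The function name `preorder_tokens` and the two sample fragments are hypothetical.

```python
import ast


def preorder_tokens(source: str) -> list[str]:
    """Flatten the AST of `source` into a pre-order list of node labels.

    Name nodes keep their identifier so that calls to different
    functions produce different tokens.
    """
    def visit(node: ast.AST):
        label = type(node).__name__
        if isinstance(node, ast.Name):
            label += ":" + node.id
        yield label
        for child in ast.iter_child_nodes(node):
            yield from visit(child)

    return list(visit(ast.parse(source)))


# Hypothetical fragments, chosen only to show that the sequences differ.
fragment_a = "def f(x):\n    return eval(x)\n"
fragment_b = "def g(x):\n    return int(x)\n"

tokens_a = preorder_tokens(fragment_a)
tokens_b = preorder_tokens(fragment_b)
```

A sequence like this would still need to be mapped to fixed-length numeric vectors and paired with vulnerable/non-vulnerable labels before any model could be trained on it.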

“…Another future direction is identifying the exact location of detected vulnerability in the functional level code and what it is the reason behind its detection as a vulnerability. In other words, considering and improving the localization and interpretation aspects of the vulnerability [9].…”

Section: Vulnerability Assessment (mentioning)

confidence: 99%


System Security Assurance: A Systematic Literature Review

Shukla, Katt, Nweke et al., 2021 (preprint)


Security assurance provides the confidence that security features, practices, procedures, and architecture of software systems mediate and enforce the security policy and are resilient against security failure and attacks. Alongside the significant benefits of security assurance, the evolution of new information and communication technology (ICT) introduces new challenges regarding information protection. Security assurance methods based on the traditional tools, techniques, and procedures may fail to account for new challenges due to poor requirement specifications, static nature, and poor development processes. The common criteria (CC) commonly used for security evaluation and certification process also comes with many limitations and challenges. In this paper, extensive efforts have been made to study the state-of-the-art, limitations and future research directions for security assurance of the ICT and cyber-physical systems (CPS) in a wide range of domains. We systematically review the requirements, processes, and activities involved in system security assurance including security requirements, security metrics, system and environments and assurance methods. We shed light on the challenges and gaps that have been identified by the existing literature related to system security assurance and corresponding solutions. Finally, we discussed the limitations of the present methods and future research directions. CCS Concepts: • General and reference → Surveys and overviews; • Security and privacy → Systems security.


“…Dataset preparation: Authors used existing labeled datasets as well as created their own datasets to train ml models. Specifically, a set of studies [48,156,219,243,254,263,298] used available labeled datasets for php, Java, C, C++, and Android applications to train vulnerability detection models. In other cases, Russell et al [261] extended an existing dataset with millions of C and C++ functions and then labeled it based on the output of three static analyzers (i.e., Clang, CppCheck, and Flawfinder).…”

Section: Vulnerability Analysis (mentioning)

confidence: 99%

A Survey on Machine Learning Techniques for Source Code Analysis

Sharma, Kechagia, Georgiou et al., 2021 (preprint)


Context: The advancements in machine learning techniques have encouraged researchers to apply these techniques to a myriad of software engineering tasks that use source code analysis such as testing and vulnerabilities detection. A large number of studies poses challenges to the community to understand the current landscape. Objective: We aim to summarize the current knowledge in the area of applied machine learning for source code analysis. Method: We investigate studies belonging to twelve categories of software engineering tasks and corresponding machine learning techniques, tools, and datasets that have been applied to solve them. To do so, we carried out an extensive literature search and identified 364 primary studies published between 2002 and 2021. We summarize our observations and findings with the help of the identified studies. Results: Our findings suggest that the usage of machine learning techniques for source code analysis tasks is consistently increasing. We synthesize commonly used steps and the overall workflow for each task, and summarize the employed machine learning techniques. Additionally, we collate a comprehensive list of available datasets and tools useable in this context. Finally, we summarize the perceived challenges in this area that include availability of standard datasets, reproducibility and replicability, and hardware resources. CCS Concepts: • Software and its engineering → Software libraries and repositories; Software maintenance tools; Software post-development issues; Maintaining software; • Computing methodologies → Machine learning.


“…The study in [124] built a model to predict software vulnerabilities of codes using ML before releasing the code. After developing a source code representation using AST and intelligently analysing it, the ML models were applied.…”

Section: Applying ML To Detect Source Code Vulnerabilities (mentioning)

confidence: 99%

Android Mobile Malware Detection Using Machine Learning: A Systematic Review

2021


With the increasing use of mobile devices, malware attacks are rising, especially on Android phones, which account for 72.2% of the total market share. Hackers try to attack smartphones with various methods such as credential theft, surveillance, and malicious advertising. Among numerous countermeasures, machine learning (ML)-based methods have proven to be an effective means of detecting these attacks, as they are able to derive a classifier from a set of training examples, thus eliminating the need for an explicit definition of the signatures when developing malware detectors. This paper provides a systematic review of ML-based Android malware detection techniques. It critically evaluates 106 carefully selected articles and highlights their strengths and weaknesses as well as potential improvements. Finally, the ML-based methods for detecting source code vulnerabilities are discussed, because it might be more difficult to add security after the app is deployed. Therefore, this paper aims to enable researchers to acquire in-depth knowledge in the field and to identify potential future research and development directions.
