2020
DOI: 10.5455/jjcit.71-1597824949

Efficient Deep Features Learning for Vulnerability Detection Using Character N-Gram Embedding

Cited by 3 publications (5 citation statements, published in 2022 and 2024). References 0 publications.
“…Feature Types Under Text-based Code Representation. In the context of text-based code representation, several feature types have been identified that can influence the model's behavior when processing source code, including token type [31,48], token length [50], token frequency [49], token n-grams [51], token lexical patterns [52], and token attention values [53]. Token types could be categorized as comments and code.…”
Section: Taxonomy of Related Work
Confidence: 99%
“…Zeng et al. [49] concluded that better model performance is achieved when code frequency information is preserved. Token n-grams are fixed-size contiguous sequences of tokens that capture local context within a fixed window [51], but their effectiveness may be limited for longer code sequences and transformer models. By representing recurring structures in the code [52], token lexical patterns can help in understanding the code's basic logic and structure.…”
Section: Taxonomy of Related Work
Confidence: 99%
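The statement above describes token n-grams as fixed-size contiguous windows over a token sequence. A minimal sketch of that idea (not code from any of the cited papers; the sample token list is an illustrative assumption):

```python
# Minimal sketch: extracting fixed-size contiguous n-grams from a token
# sequence, capturing local context within a fixed window.

def token_ngrams(tokens, n):
    """Return all contiguous n-grams over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical tokenized C statement, for illustration only.
tokens = ["strcpy", "(", "buf", ",", "input", ")", ";"]
trigrams = token_ngrams(tokens, 3)
# First window: ("strcpy", "(", "buf")
```

The same sliding-window scheme applies at the character level (as in the paper's character n-gram embedding), with individual characters in place of tokens.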
“…The code gadget is in turn generated using data flow analysis of the source code [68]. The system uses deep learning both to extract features and to predict vulnerabilities. The work uses C/C++ source code, which is tokenized at the slice level.…”
Section: Review of Related Work
Confidence: 99%
“…The two most common vulnerabilities considered from the NVD and SARD datasets in the research works [34] and [65] are buffer overflow (CWE-119) and resource management error (CWE-399). The works [68], [33] and [59] use collected and labelled C/C++ program source code from [45], [43], which is publicly available in the SySeVR dataset [54]. The collected instances are labelled with vulnerability types such as Function Call (FC), Array Usage (AU), Pointer Usage (PU) and Arithmetic Expression (AE).…”
Section: Review of Datasets Considered for Targeted Vulnerabilities
Confidence: 99%
“…The application of ECS can effectively complete relevant online operations and improve the convenience of people's daily lives. Due to its special computing mode and its impact, the operation of ECS involves potential risk elements [4][5][6][7] that affect user information security and hinder the optimization and upgrade of ECS. Therefore, vulnerability detection needs to be conducted effectively to reduce the risk of dual-end operation of ECS.…”
Section: Introduction
Confidence: 99%