2020
DOI: 10.1016/j.cola.2020.100979

PathPair2Vec: An AST path pair-based code representation method for defect prediction

Cited by 36 publications
(33 citation statements)
References 11 publications
“…Then, a bi-directional network with LSTM is used as model. In [22], the authors propose a model for defect prediction on the base of AST path pair representation. To process the code, the path in the AST is extracted as combination of symbol sequence and control sequence.…”
Section: Long Short Term Memory (mentioning)
confidence: 99%
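
The AST path idea mentioned in the statement above can be illustrated with a minimal sketch. It uses Python's standard `ast` module and a code2vec-style leaf-to-leaf path through the lowest common ancestor. This is an assumption-laden illustration only: it does not reproduce the symbol sequence / control sequence split attributed to PathPair2Vec, and the helper names `leaf_trails` and `path_context` are hypothetical.

```python
import ast


def leaf_trails(tree):
    """Collect the root-to-leaf node trail for every Name/Constant leaf."""
    trails = []

    def visit(node, trail):
        trail = trail + [node]
        if isinstance(node, (ast.Name, ast.Constant)):
            trails.append(trail)
        for child in ast.iter_child_nodes(node):
            visit(child, trail)

    visit(tree, [])
    return trails


def leaf_label(node):
    """Identifier for Name leaves, literal repr for Constant leaves."""
    return node.id if isinstance(node, ast.Name) else repr(node.value)


def path_context(trail_a, trail_b):
    """Return (start leaf, node-type path via the common ancestor, end leaf)."""
    # Length of the shared prefix; its last node is the lowest common ancestor.
    i = 0
    while i < min(len(trail_a), len(trail_b)) and trail_a[i] is trail_b[i]:
        i += 1
    up = [type(n).__name__ for n in reversed(trail_a[i:])]
    top = [type(trail_a[i - 1]).__name__]
    down = [type(n).__name__ for n in trail_b[i:]]
    return leaf_label(trail_a[-1]), up + top + down, leaf_label(trail_b[-1])


trails = leaf_trails(ast.parse("x = y + 1"))
ctx_xy = path_context(trails[0], trails[1])   # leaf x to leaf y
ctx_y1 = path_context(trails[1], trails[2])   # leaf y to literal 1
print(ctx_xy)
print(ctx_y1)
```

Each triple pairs two terminal tokens with the sequence of node types connecting them, which is the kind of unit a path-based model would embed.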
“…It is based on the Synthetic Minority Over-Sampling Technique (SMOTE and SMOTUNED) for preparing the datasets and ensemble approaches for classifying the defective and correct code. In [22], the authors takes into account the proportion of the correct and defective code in each project in the dataset. To balance the classes, they duplicate the elements of the smaller class.…”
Section: Lack Of Data (mentioning)
confidence: 99%
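
The class-balancing step described in the statement above (duplicating elements of the smaller class) can be sketched as plain random oversampling. This is a minimal sketch under stated assumptions: the function name `oversample_by_duplication`, the file names, and the choice to sample duplicates at random with a fixed seed are all hypothetical and are not taken from the cited work.

```python
import random
from collections import Counter


def oversample_by_duplication(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    target = max(len(xs) for xs in by_class.values())
    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw the missing number of duplicates from the class itself.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Hypothetical toy data: four clean files (label 0), one defective file (label 1).
samples = ["a.java", "b.java", "c.java", "d.java", "e.java"]
labels = [0, 0, 0, 0, 1]
balanced_samples, balanced_labels = oversample_by_duplication(samples, labels)
print(Counter(balanced_labels))
```

Duplication of this kind is normally applied to the training split only, so that copies of the same sample do not appear in both training and test data.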
“…The approach improved the baselines by 3.00, 17.54, 8.77, 14.76 and 8.97%, respectively, on average AUC. Shi et al (2020) built their work based on code2vec, they proposed the PathPair2Vec framework based on Attention Mechanism. The different parts of the terminal node were encoded.…”
Section: E Framework (mentioning)
confidence: 99%
“…To this end, we extend the code2vec model [20] to include paths extracted from various graph representations, mainly AST, CFG, and PDG. We choose code2vec as it is still used for various tasks and most of the recent approaches are built upon it [25], [26]. By extending the code2vec model with CFG and PDG, we can better capture the semantics of the code, which AST alone cannot leverage.…”
Section: Introduction (mentioning)
confidence: 99%
“…As a first step, we demonstrate that "combining syntactic and semantic paths shows an improvement of 11% over code2vec for the task of METHODNAMING". Moreover, many works are built upon code2vec, such as pathpair2vec [26] and code2seq [25], which outperform code2vec in various tasks. Thus, we believe considering a mocktail of the tree and graph-based structures (AST, CFG, and PDG) can lead to a new direction in representing source code while also improving existing works that rely either solely on ASTs or CFGs and PDGs.…”
Section: Introduction (mentioning)
confidence: 99%