2020
DOI: 10.1002/smr.2330

MPT‐embedding: An unsupervised representation learning of code for software defect prediction

Abstract: Software project defect prediction can help developers allocate debugging resources. Existing software defect prediction models are usually based on machine learning methods, especially deep learning. Deep learning‐based methods tend to build end‐to‐end models that directly use source code‐based abstract syntax trees (ASTs) as input. They do not pay enough attention to the front‐end data representation. In this paper, we propose a new framework to represent source code called multiperspective tree embedding (M…

Cited by 11 publications (18 citation statements)
References 48 publications
“…• MPT [6]: This approach to defect prediction uses multiperspective tree embedding to learn the representation of an AST in an unsupervised manner.…”
Section: Baseline Methods (mentioning)
confidence: 99%
“…The data set used in this work consists of source code from seven Java Apache projects collected from PROMISE, a publicly accessible repository of SDP research data collected by Jureczko and Madeyski [34]. Specifically, each project version from PROMISE is represented by a list of classes it consists of, and each class is described by 20 traditional code features, such as lines of code, and the defect label.…”
Section: A Data Set (mentioning)
confidence: 99%
“…The proposed framework achieved a better F-measure by 0.532 on average, compared with the selected baselines. Besides, Shi et al. (2021) represented the code by different representations. The most crucial information of nodes was coded.…”
Section: E Framework (mentioning)
confidence: 99%