2018
DOI: 10.1049/iet-sen.2017.0111

Multiple‐components weights model for cross‐project software defect prediction

Abstract: Software defect prediction (SDP) technology is receiving wide attention, and most SDP models are trained on data from the same project. However, at an early phase of the software lifecycle, there is little to no within-project training data from which to learn a usable supervised defect-prediction model. Thus, cross-project defect prediction (CPDP), which learns a defect predictor for a target project using labelled data from a source project, has shown promising value in SDP. To better perform the CPDP,…

Cited by 20 publications (16 citation statements) | References 33 publications
“…Chen et al [24] first initialized the weights of source project data by the data gravitation method and adjusted them with a limited amount of labelled data in the target project by building a prediction model named TrAdaboost [25]. Qiu et al [26] constructed a novel multiple-components weights learning model with the kernel mean matching (KMM) algorithm. It divides the source project data into several components, and KMM is applied to adjust the source-instance weights in each component.…”
Section: Related Work
confidence: 99%
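The multiple-components weights idea quoted above can be pictured in a few lines of code. The following is a minimal sketch, not the authors' implementation: it assumes KMeans as the component splitter, an RBF kernel, and cvxpy to solve the standard KMM quadratic program; the component count, bandwidth gamma, and bound B are illustrative choices.

```python
# Minimal sketch (not the authors' code) of component-wise KMM:
# split the source project into components, then reweight the source
# instances of each component toward the target distribution.
import numpy as np
import cvxpy as cp
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel

def kmm_weights(X_src, X_tgt, gamma=1.0, B=10.0):
    """Standard KMM quadratic program for one component."""
    n_s, n_t = len(X_src), len(X_tgt)
    eps = B / np.sqrt(n_s)
    # Small ridge keeps the Gram matrix numerically positive semidefinite.
    K = rbf_kernel(X_src, X_src, gamma=gamma) + 1e-6 * np.eye(n_s)
    kappa = (n_s / n_t) * rbf_kernel(X_src, X_tgt, gamma=gamma).sum(axis=1)
    beta = cp.Variable(n_s)
    problem = cp.Problem(
        cp.Minimize(0.5 * cp.quad_form(beta, K) - kappa @ beta),
        [beta >= 0, beta <= B, cp.abs(cp.sum(beta) - n_s) <= n_s * eps],
    )
    problem.solve()
    return np.asarray(beta.value).ravel()

def multi_component_weights(X_src, X_tgt, n_components=3, gamma=1.0):
    """Cluster source data into components, then KMM-reweight each one."""
    labels = KMeans(n_clusters=n_components, n_init=10).fit_predict(X_src)
    w = np.zeros(len(X_src))
    for c in range(n_components):
        idx = np.flatnonzero(labels == c)
        w[idx] = kmm_weights(X_src[idx], X_tgt, gamma=gamma)
    return w  # per-instance source weights for a weighted classifier
```

The returned weights would then be passed to any classifier that accepts instance weights (e.g., via `sample_weight` in scikit-learn estimators).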
“…To verify the validity of the TCNN method, we selected 10 open-source projects as our evaluation datasets. The source code and corresponding PROMISE data for all 10 projects are public and have been widely used in SDP research [21, 25–27]. In our experiments, we extracted DL-generated features from the Java source code and adopted the static code metrics and data labels from the PROMISE repository.…”
Section: Evaluated Datasets
confidence: 99%
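For concreteness, here is a small loading sketch for a PROMISE-style dataset as used in statements like the one above. The "bug" defect-count column and the "name"/"version" identifier columns follow the common PROMISE CSV layout and are an assumption here, not something specified by the quoted paper.

```python
# Sketch: prepare a PROMISE-style CSV for defect prediction. Assumes the
# common layout with a defect-count column "bug" and identifier columns
# "name"/"version"; adapt the names to the actual files.
import pandas as pd

def load_promise_csv(path):
    df = pd.read_csv(path)
    y = (df["bug"] > 0).astype(int)                    # defective vs. clean
    X = df.drop(columns=["name", "version", "bug"], errors="ignore")
    return X.select_dtypes("number").values, y.values  # static code metrics
```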
“…There are other measures (e.g., AUC and G-measure) that can be used for performance evaluation of dichotomous classifiers. In fact, the F-measure as a comprehensive measurement is a commonly-used evaluation metric in SDP tasks [21, 25, 26, 35–37].…”
Section: The F-measure Might Not Be the Only Appropriate Measures
confidence: 99%
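The three measures named in this statement are straightforward to compute. The sketch below uses scikit-learn for the F-measure and AUC, and the common SDP definition of the G-measure as the harmonic mean of the detection rate pd and 1 − pf.

```python
# Sketch: common SDP evaluation measures for a dichotomous classifier.
from sklearn.metrics import f1_score, roc_auc_score, confusion_matrix

def sdp_metrics(y_true, y_pred, y_score):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    pd_rate = tp / (tp + fn)   # probability of detection (recall)
    pf_rate = fp / (fp + tn)   # probability of false alarm
    g = 2 * pd_rate * (1 - pf_rate) / (pd_rate + (1 - pf_rate))  # G-measure
    return {
        "F-measure": f1_score(y_true, y_pred),
        "AUC": roc_auc_score(y_true, y_score),  # needs scores, not labels
        "G-measure": g,
    }
```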
“…Depending on whether the source project and the target project are from the same project, SDP can be divided into Within-Project Defect Prediction (WPDP) [19], [34], [35], [44] and Cross-Project Defect Prediction (CPDP) [32]. In the early stages of a project, it is difficult for WPDP to build a feasible predictive model due to the lack of labeled file information.…”
Section: Introduction
confidence: 99%
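The WPDP/CPDP distinction described here is easy to picture in code. A minimal sketch, assuming scikit-learn and a plain logistic-regression baseline: in CPDP the model is fit on a labelled source project and scored on a different target project, whereas WPDP would draw both sets from the same project.

```python
# Sketch of the CPDP setting: train on the labelled source project,
# predict defect probabilities for the unlabelled target project.
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def cpdp_scores(X_source, y_source, X_target):
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_source, y_source)                 # labelled source project
    return model.predict_proba(X_target)[:, 1]    # defect scores for target
```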