2019
DOI: 10.1587/transinf.2018edp7289
Unsupervised Deep Domain Adaptation for Heterogeneous Defect Prediction

Abstract: Heterogeneous defect prediction (HDP) aims to detect as many defective software modules as possible in one project by using historical data collected from other projects that use different metric sets. However, such data cannot be used directly because the metric sets differ across projects. Meanwhile, software data contain far more non-defective instances than defective ones, which can bias a learned classifier against the defective (minority) class. To address these two restrictions, we propose unsupervised deep domain…
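The abstract is truncated at the ellipsis above. For orientation only, the sketch below shows one common ingredient of unsupervised deep domain adaptation: a maximum mean discrepancy (MMD) penalty that measures how far apart the source and target feature distributions are. The feature sizes, batch sizes, and kernel bandwidth are illustrative assumptions and are not taken from the paper.

```python
# Illustrative sketch only: an RBF-kernel MMD penalty, one common building block of
# unsupervised deep domain adaptation. Sizes and bandwidth are assumptions, not
# values from the paper.
import torch

def rbf_mmd(x_src, x_tgt, bandwidth=1.0):
    """Biased estimate of the squared MMD between two feature batches."""
    def gram(a, b):
        d2 = torch.cdist(a, b) ** 2                  # pairwise squared distances
        return torch.exp(-d2 / (2 * bandwidth ** 2)) # RBF kernel values
    k_ss = gram(x_src, x_src).mean()
    k_tt = gram(x_tgt, x_tgt).mean()
    k_st = gram(x_src, x_tgt).mean()
    return k_ss + k_tt - 2 * k_st

# Toy usage: hidden representations of source and target modules.
src_feat = torch.randn(32, 16)                       # 32 source instances, 16-d features
tgt_feat = torch.randn(40, 16)                       # 40 target instances, same feature size
print(float(rbf_mmd(src_feat, tgt_feat)))            # small value -> similar distributions
```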

Cited by 11 publications (10 citation statements); references 42 publications (64 reference statements).
“…The fact that a large variety of different projects, versions and features is used in SDP leads to highly heterogeneous data, in particular when using different source and target projects for prediction. Such data degrade the performance of the classifier (Albahli, 2019; Gong et al., 2019; Qiu et al., 2019a; Qiu et al., 2019b; Sheng et al., 2020; Sun et al., 2020a; Huang et al., 2021; Sun et al., 2021; Wu et al., 2021). Some researchers have tackled this challenge using different DL architectures which take this difference into account, while others have introduced normalization and transformation steps in data preprocessing as well as in feature extraction.…”
Section: Data Engineering (mentioning)
confidence: 99%
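The normalization and transformation steps mentioned in the statement above are often realized as a simple per-project z-score rescaling of each metric. The sketch below is a generic illustration of that idea; the data shapes are assumptions, and this is not the exact preprocessing of any cited study.

```python
# Generic per-project z-score normalization, a common preprocessing step for
# heterogeneous defect data (illustrative; not the exact procedure of the cited work).
import numpy as np

def zscore(metrics):
    """Normalize each metric column to zero mean and unit variance."""
    mu = metrics.mean(axis=0)
    sigma = metrics.std(axis=0) + 1e-12       # avoid division by zero
    return (metrics - mu) / sigma

source = np.random.rand(100, 20)              # 100 modules, 20 source metrics (assumed)
target = np.random.rand(80, 15)               # 80 modules, 15 different target metrics (assumed)
source_n, target_n = zscore(source), zscore(target)   # each project normalized independently
```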
“…Hence, they proposed to learn a high-level feature representation from a bug dataset consisting of 19 mobile applications for JIT defect prediction. Zeng et al. (2021) used DL to build models for identifying defective commits in both WPDP and CPDP settings. Three studies addressed HDP. Gong et al. (2019) designed a neural network to deal with heterogeneous metric sets for defect prediction. Sun et al. (2021)…”
mentioning
confidence: 99%
“…A simple neural network was proposed for CPDP in 2019 to tackle HDP, in which a cross-entropy function was applied for classifying errors. 21 In 2021, researchers presented work on semi-supervised learning for tackling heterogeneous defect prediction. Open-source projects were used for the analysis.…”
Section: State Of Art (mentioning)
confidence: 99%
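As a concrete illustration of the statement above, the sketch below shows a minimal feed-forward defect classifier trained with a cross-entropy loss; the layer sizes, optimizer, and learning rate are assumptions for illustration, not the configuration of the cited 2019 approach.

```python
# Minimal feed-forward defect classifier trained with cross entropy
# (illustrative layer sizes and optimizer settings, not the cited configuration).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(20, 32), nn.ReLU(),              # 20 input metrics (assumed)
    nn.Linear(32, 2),                          # two classes: defective vs. non-defective
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(64, 20)                        # toy batch of module metrics
y = torch.randint(0, 2, (64,))                 # toy defect labels
for _ in range(10):                            # a few training steps
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```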
“…The number of neurons in the generator's output layer equals the number of metrics in the target project, while the output layers of the discriminator and the classifier each have one neuron. In addition, following the method suggested in reference [54], we tune the parameters of the CDAA networks. Table 5 displays the hyper-parameter values set in our CDAA networks.…”
Section: Parameter Settings (mentioning)
confidence: 99%
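The output-layer sizing described in the statement above can be made concrete with a small sketch: the generator ends in as many units as the target project has metrics, while the discriminator and the classifier each end in a single unit. The hidden sizes, activations, and metric counts below are illustrative assumptions, not the tuned hyper-parameters of the CDAA networks.

```python
# Sketch of the output-layer sizes described above: the generator emits one value per
# target-project metric; the discriminator and classifier each emit a single value.
# Hidden sizes, activations, and metric counts are illustrative assumptions.
import torch
import torch.nn as nn

n_source_metrics = 20                          # assumed source metric count
n_target_metrics = 15                          # assumed target metric count

generator = nn.Sequential(
    nn.Linear(n_source_metrics, 64), nn.ReLU(),
    nn.Linear(64, n_target_metrics),           # output size = number of target metrics
)
discriminator = nn.Sequential(
    nn.Linear(n_target_metrics, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Sigmoid(),            # single output: real vs. generated
)
classifier = nn.Sequential(
    nn.Linear(n_target_metrics, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Sigmoid(),            # single output: defect probability
)

fake_target = generator(torch.randn(8, n_source_metrics))              # 8 generated instances
print(discriminator(fake_target).shape, classifier(fake_target).shape) # both torch.Size([8, 1])
```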