2022
DOI: 10.21203/rs.3.rs-2305042/v1
Preprint

A Novel Approach for Software Defect Prediction using CNN and GRU Based on SMOTE Tomek Method

Abstract: Software defect prediction (SDP) plays an important role in enhancing the quality of software projects and reducing maintenance-based risks through the ability to detect defective software components. SDP refers to the methods that use historical defect data to build the relationship between software metrics and software defects. Several prediction models, such as machine learning (ML) and deep learning (DL) models, have been developed and adopted to recognize defects in software modules, and many methodologies and framewor…

Cited by 3 publications (3 citation statements)
References 36 publications
“…32 Thus, to take the advantages of both imbalance techniques, in this research we utilized SMOTETomek. 33 SMOTETomek is a hybrid sampling technique that combines the oversampling (SMOTE) and undersampling (Tomek Links) method and has widely been acknowledged in many domains such as software defect prediction, 32 medical data (diabetes), 34 for balancing the skewed data. In other words, the key concept of using this algorithm is to combine SMOTE method as data sampling and Tomek link as data cleaning method proposed by Tomek 35 to address the issue of imbalance data set.…”
Section: Materials and Methods (mentioning)
confidence: 99%
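The hybrid idea described in this statement (SMOTE as the sampling step, Tomek links as the cleaning step) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the cited authors' implementation: the function names (`smote`, `tomek_links`, `smote_tomek`), the toy data, the Euclidean distance, and the choice to drop both members of every Tomek link are all hypothetical. It also assumes at least two minority samples.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: euclidean(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

def tomek_links(points, labels):
    """Return index pairs (i, j) that are mutual nearest neighbours
    and carry different labels."""
    def nearest(i):
        return min((j for j in range(len(points)) if j != i),
                   key=lambda j: euclidean(points[i], points[j]))
    nn = [nearest(i) for i in range(len(points))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and labels[i] != labels[j]]

def smote_tomek(points, labels, minority_label=1, k=3, seed=0):
    """Oversample the minority class up to the majority count, then
    remove both samples of every Tomek link (an assumption of this sketch)."""
    rng = random.Random(seed)
    minority = [p for p, y in zip(points, labels) if y == minority_label]
    n_majority = len(points) - len(minority)
    new = smote(minority, n_majority - len(minority), k, rng)
    points = list(points) + new
    labels = list(labels) + [minority_label] * len(new)
    drop = {i for pair in tomek_links(points, labels) for i in pair}
    kept = [(p, y) for i, (p, y) in enumerate(zip(points, labels))
            if i not in drop]
    return [p for p, _ in kept], [y for _, y in kept]

# Toy data: six majority points (label 0) and two minority points (label 1).
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4),
          (0.4, 0.1), (1.0, 1.0), (1.1, 0.9)]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
X, y = smote_tomek(points, labels)
```

In this toy set the two classes are well separated, so the cleaning step finds no Tomek links and the result is simply a balanced set. On overlapping classes, the cleaning step would also discard borderline pairs.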
“… 30 Synthetic minority oversampling (SMOTE) 31 considers the minority class while in contrast random under-sampling considers the majority class to equalize the class distribution. 32 Thus, to take the advantages of both imbalance techniques, in this research we utilized SMOTETomek. 33 SMOTETomek is a hybrid sampling technique that combines the oversampling (SMOTE) and undersampling (Tomek Links) method and has widely been acknowledged in many domains such as software defect prediction, 32 medical data (diabetes), 34 for balancing the skewed data.…”
Section: Methods (mentioning)
confidence: 99%
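For contrast with oversampling, random under-sampling equalizes the class distribution by discarding majority samples. A minimal sketch follows, assuming a hypothetical helper named `random_undersample` and toy data that do not come from the cited paper:

```python
import random
from collections import Counter

def random_undersample(points, labels, seed=0):
    """Randomly keep as many samples of each class as the smallest class has."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    kept = []
    for label in counts:
        idx = [i for i, y in enumerate(labels) if y == label]
        kept.extend(rng.sample(idx, target))
    kept.sort()  # preserve the original sample order
    return [points[i] for i in kept], [labels[i] for i in kept]

# Toy data: six majority samples (label 0) and two minority samples (label 1).
points = [(0.0,), (0.1,), (0.2,), (0.3,), (0.4,), (0.5,), (1.0,), (1.1,)]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
X_us, y_us = random_undersample(points, labels)
```

The trade-off is that under-sampling throws away real majority data, whereas SMOTE adds synthetic minority data.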
“…Thirdly, compared to other methods, evolutionary features achieve best results, but the corresponding feature dimensions are also relatively higher, especially for the M495 dataset. Fourthly, the predictive accuracy of New-All-2 is not only best but also has lower feature dimensions, outperforming the combined methods of Group1, Group2, and Group3: To deal with the problem of sample imbalance, four different sampling methods were employed for comparison-SMOTE [24], SMOTEENN [25], ADASYN [28], and SMOTETOMEK [29]. The results are presented in Table 7.…”
Section: Effect Of Various Feature Extraction Methods (mentioning)
confidence: 99%