2020
DOI: 10.3390/app10238324

LIMCR: Less-Informative Majorities Cleaning Rule Based on Naïve Bayes for Imbalance Learning in Software Defect Prediction

Abstract: Software defect prediction (SDP) is an effective technique to lower software module testing costs. However, an imbalanced class distribution exists in almost all SDP datasets and restricts the accuracy of defect prediction. To balance the data distribution reasonably, we propose LIMCR, a novel resampling method based on Naïve Bayes, to optimize and improve SDP performance. The main idea of LIMCR is to remove less-informative majority instances to rebalance the data distribution after evaluating the deg…
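The abstract describes scoring majority-class instances with a Naïve Bayes model and removing the less informative ones. The snippet below is a minimal sketch of that general idea, not the authors' exact cleaning rule: it assumes scikit-learn's GaussianNB, a hypothetical keep_ratio parameter, and a confidence-based notion of "less informative" (majority samples the model classifies with the highest certainty are dropped first).

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def remove_less_informative_majorities(X, y, majority_label=0, keep_ratio=0.6):
    """Hypothetical sketch of a LIMCR-style cleaning step.

    Fits a Naive Bayes model, scores each majority-class sample by the
    posterior probability of the majority class, and keeps only the
    fraction of majority samples the model is least certain about
    (those closest to the decision boundary are treated as informative).
    """
    nb = GaussianNB().fit(X, y)
    maj_idx = np.where(y == majority_label)[0]
    # Posterior probability of the majority class for each majority sample.
    maj_conf = nb.predict_proba(X[maj_idx])[:, nb.classes_ == majority_label].ravel()
    # Keep the least confidently classified majority samples (keep_ratio is illustrative).
    n_keep = int(len(maj_idx) * keep_ratio)
    keep_maj = maj_idx[np.argsort(maj_conf)[:n_keep]]
    keep = np.concatenate([keep_maj, np.where(y != majority_label)[0]])
    return X[keep], y[keep]
```

On an imbalanced defect dataset, calling remove_less_informative_majorities(X, y, majority_label=0) would thin out the non-defective class before training a classifier; the thresholding scheme here is an assumption for illustration only.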

Cited by 8 publications (9 citation statements)
References 51 publications (63 reference statements)

“…The optimum values were searched in the range of 1 and 30 (Kang and Ryu, 2019) and 100 and 1,200 (Baker et al., 2020a), respectively. Also, variance smoothing (10^-x, x between 3 and 9) was the only parameter to be considered in the determination of the optimum model configuration for NB (Soni et al., 2020; Wu et al., 2020). In the identification of the KNN structure, three hyperparameters of the algorithm were scanned: the number of neighbors and the leaf size with the ranges of 1-30 for both parameters (Bykov et al., 2019; Zhang et al., 2020a, b) and four different distance metrics, i.e.…”
Section: Classification Results
confidence: 99%
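The quoted statement lists the hyperparameter ranges the citing study scanned for Naïve Bayes and KNN. The sketch below is a hedged reconstruction of such a grid search with scikit-learn; the four distance metric names and the 5-fold cross-validation are placeholders, since the quote truncates before listing the metrics and does not state the validation scheme.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# NB: variance smoothing scanned as 10^-x for x between 3 and 9.
nb_grid = {"var_smoothing": [10 ** -x for x in range(3, 10)]}
nb_search = GridSearchCV(GaussianNB(), nb_grid, cv=5)

# KNN: neighbors and leaf size scanned over 1-30, plus four distance metrics.
# The metric names below are assumptions; the quoted statement cuts off before naming them.
knn_grid = {
    "n_neighbors": list(range(1, 31)),
    "leaf_size": list(range(1, 31)),
    "metric": ["euclidean", "manhattan", "chebyshev", "minkowski"],
}
knn_search = GridSearchCV(KNeighborsClassifier(), knn_grid, cv=5)

# knn_search.fit(X_train, y_train) would then report the best configuration
# via knn_search.best_params_.
```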
“…Also, variance smoothing (10^-x, x between 3 and 9) was the only parameter to be considered in the determination of the optimum model configuration for NB (Soni et al., 2020; Wu et al., 2020).…”
Section: Results
confidence: 99%
“…A set of conditions for ant lion hunting can be presented as follows and can be seen in Figure 7 [15]. The ants (the hunted prey) move randomly in the search space.…”
Section: Ant Lion Optimization Algorithm
confidence: 99%
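The quoted passage refers to the random movement of ants in the Ant Lion Optimizer. The sketch below shows the standard cumulative-sum random walk used in ALO, rescaled into the search bounds; the function name and the NumPy implementation are illustrative and not taken from the cited paper.

```python
import numpy as np

def alo_random_walk(n_steps, lower, upper, rng=None):
    """Minimal sketch of the cumulative-sum random walk that ants perform
    in the Ant Lion Optimizer, rescaled into one variable's search bounds."""
    rng = rng or np.random.default_rng()
    # Each step is +1 or -1 with equal probability (i.e. 2*r(t) - 1 with r in {0, 1}).
    steps = np.where(rng.random(n_steps) > 0.5, 1.0, -1.0)
    walk = np.concatenate(([0.0], np.cumsum(steps)))
    # Min-max normalise the walk into the [lower, upper] interval.
    w_min, w_max = walk.min(), walk.max()
    return lower + (walk - w_min) * (upper - lower) / (w_max - w_min + 1e-12)
```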
“…Figure (15): Behavioral state of a sailfish group [25] 4.5 Plant-inspired metaheuristic algorithms…”
Section: Wall Optimization Algorithm
confidence: 99%
“…An extensive body of research on software defect prediction based on ML models exists. The approaches in the literature explore defect prediction models from many perspectives [28][29][30][31][32][33][34][35][36][37]. One of the most well-known datasets used in many of those studies is the NASA MDP open datasets [38,39].…”
Section: Introduction
confidence: 99%