Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering 2014
DOI: 10.1145/2601248.2601294

Preliminary comparison of techniques for dealing with imbalance in software defect prediction

Abstract: Imbalanced data is a common problem in data mining when dealing with classification problems, where samples of a class vastly outnumber other classes. In this situation, many data mining algorithms generate poor models as they try to optimize the overall accuracy and perform badly in classes with very few samples. Software Engineering data in general and defect prediction datasets are not an exception and in this paper, we compare different approaches, namely sampling, cost-sensitive, ensemble and hybrid appro…
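
The abstract's point that optimizing overall accuracy hides poor performance on the rare class can be illustrated with a minimal sketch. The 95/5 class split, the label encoding (0 = non-defective, 1 = defective), and the helper names `accuracy` and `recall` are assumptions made for illustration only; they are not taken from the paper or its datasets.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive samples that were predicted positive."""
    true_positives = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p == positive
    )
    actual_positives = sum(1 for t in y_true if t == positive)
    return true_positives / actual_positives if actual_positives else 0.0


# Hypothetical imbalanced dataset: 95 non-defective modules (0), 5 defective (1).
y_true = [0] * 95 + [1] * 5

# A trivial "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)

majority_accuracy = accuracy(y_true, y_pred)
defect_recall = recall(y_true, y_pred)

print(f"accuracy: {majority_accuracy:.2f}")
print(f"recall on defective class: {defect_recall:.2f}")
```

Under these assumed numbers, the always-majority predictor looks accurate overall yet finds none of the defective modules, which is why the approaches compared in the paper target the minority class rather than overall accuracy.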

Cited by 118 publications (78 citation statements)
References 46 publications (39 reference statements)
“…Recent defect prediction studies have compared the impact of class rebalancing techniques on the performance of defect prediction models [34,45,63,65,68,78,90]. For example, Kamei et al [34] show the performance improvement of class rebalancing techniques on 2 defect datasets of proprietary systems.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…The different approaches to deal with imbalanced can be classified into data as sampling, cost-sensitive, ensemble approaches or hybrid approaches [11]. In this work, we compare only rule or tree classifiers taking into account the imbalance nature of software defect datasets.…”
Section: Previous Work (citation type: mentioning)
confidence: 99%
“…2014 IEEE Xplore
Table 3. Continue
Felderer et al (2014) | On the role of defect taxonomy types for testing requirements: Results of a controlled experiment | 2014 | IEEE Xplore
Rodriguez et al (2014) | Preliminary comparison of techniques for dealing with imbalance in software defect prediction | 2014 | ACM Digital Library
Femmer et al (2014) | Rapid requirements checks with requirements smells: Two case studies | 2014 | ACM Digital Library
Yusop et al (2016) | Reporting usability defects: Do reporters report what software developers need? | 2016 | ACM Digital Library
Cavezza et al (2014) | Reproducibility of environment-dependent software failures: An experience report | 2014 | IEEE Xplore
Langenfeld et al (2016) | Requirements defects over a project lifetime: An empirical analysis of defect data from a 5-year automotive project at Bosch | 2016 | Compendex SpringerLink
Saito et al (2014) | RISDM: A requirements inspection systems design methodology - perspective-based design of the pragmatic quality model and question set to SRS | 2014 | IEEE Xplore
Travassos (2014) | Software defects: Stay away from them.…”
Section: Data Sources X Rqs X Research Types (citation type: mentioning)
confidence: 99%