Improving performance with hybrid feature selection and ensemble machine learning techniques for code smell detection (2021)
DOI: 10.1016/j.scico.2021.102713

Cited by 40 publications (17 citation statements). References 85 publications.
“…As shown in Figure 4 and Figure 5, SMOTE does achieve significant improvement over the None technique on Data Class, God Class, and Long Method across our data sets, and obtains non-significant improvement on Feature Envy. Therefore, researchers and practitioners may still consider using SMOTE as a preprocessing method in line with previous studies Akhter et al (2021); Alkharabsheh et al (2021); Gupta et al (2021); Jain and Saha (2021); Stefano et al (2021); Khleel and Nehéz (2022); Kovačević et al (2022); Nanda and Chhabra (2022); Yedida and Menzies (2022), but should also consider exploring other techniques that may be more effective. Our results in Section 5.3 demonstrate that SMOTE does not consistently achieve the best performance on all four data sets, and the top-performing data resampling technique outperforms SMOTE by 2.63%-17.73% in terms of MCC.…”
Section: Discussion
confidence: 67%
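To make the preprocessing step this statement describes concrete, here is a minimal sketch of SMOTE resampling ahead of a code smell classifier, evaluated with MCC. It assumes the scikit-learn and imbalanced-learn libraries; the CSV path, the "is_smelly" label column, and the random forest classifier are hypothetical placeholders, not the cited authors' actual setup.

```python
# Minimal sketch: SMOTE as a preprocessing step before training a code smell
# classifier, with MCC as the evaluation metric. The file name and column
# name below are hypothetical placeholders, not the study's actual data.
import pandas as pd
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import train_test_split

df = pd.read_csv("god_class.csv")                     # hypothetical data set
X, y = df.drop(columns=["is_smelly"]), df["is_smelly"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Resample only the training split so the test distribution stays untouched.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

clf = RandomForestClassifier(random_state=42).fit(X_res, y_res)
print("MCC:", matthews_corrcoef(y_test, clf.predict(X_test)))
```

Restricting SMOTE to the training split is the standard precaution: oversampling before the split would leak synthetic neighbors of test points into training and inflate the reported MCC.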
“…Our findings are based on data sets provided by Fontana et al (2016), which are derived from 74 systems in the Qualitas corpus. While these code smell data sets are widely used in recent studies Nucci et al (2018); Jain and Saha (2021), we cannot guarantee that our conclusions will hold true for other data sets. Current research Azeem et al (2019); Pecorelli et al (2020); Alkharabsheh et al (2022) in CSD tends to treat code smells as a binary classification problem, meaning that a code block is either classified as having a particular smell or not having that smell.…”
Section: Threats To Validity
confidence: 86%
“…These methods are very computationally expensive and often unrealistic if the feature space is vast. (iii) Embedded methods: in these methods, feature selection is part of building the ML algorithm itself. These methods select the best possible feature subset according to the ML model to be implemented [41]. In this study, we applied embedded methods because they are faster and less computationally expensive than the other methods and fit within the ML models; a feature scaling technique was also applied to bring the features to the same standard.…”
Section: 3. Data Pre-processing and Features Selection
confidence: 99%
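As an illustration of the embedded approach this statement contrasts with wrapper methods, below is a minimal sketch assuming scikit-learn: scaling feeds an L1-penalized model whose own coefficients select the surviving features during fitting. The estimator, its hyperparameters, and the synthetic data are illustrative assumptions, not the study's exact configuration.

```python
# Minimal sketch of embedded feature selection combined with feature scaling.
# The estimator and the synthetic data are illustrative; the study's exact
# configuration is not specified here.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=30, n_informative=8,
                           random_state=42)   # stand-in for a code smell data set

pipe = Pipeline([
    ("scale", StandardScaler()),              # put all features on one scale
    # Embedded selection: the L1-penalized model's own coefficients decide
    # which features survive, as a side effect of fitting the model itself.
    ("select", SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=0.5))),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipe.fit(X, y)
kept = pipe.named_steps["select"].get_support().sum()
print(f"features kept by embedded selection: {kept} of {X.shape[1]}")
```

Unlike a wrapper method, which retrains the model for every candidate feature subset, the selection here costs a single fit, which matches the statement's point about embedded methods being faster and cheaper.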