Application of Naïve Bayes classifiers for refactoring Prediction at the method level

Panigrahi, Rasmita; Kuanar, Sanjay K.; Kumar, Lov

doi:10.1109/iccsea49143.2020.9132849

Cited by 9 publications

(3 citation statements)

References 10 publications

“…The author's results suggest that leveraging confirmation messages significantly improved the accuracy of recommending refactorings. Panigrahi et al. (2020) conducted a study in which they proposed models based on Naive Bayes classifiers (Gaussian, Multinomial and Bernoulli) to predict method-level software refactorings. In addition, the authors used techniques such as SMOTE, UPSAMPLE and RUSBoost for data balancing.…”

Section: Related Work (mentioning)

confidence: 99%

On the Effectiveness of Trivial Refactorings in Predicting Non-trivial Refactorings

Pinheiro, Bezerra, Uchôa. 2024. JSERD

Refactoring is the process of restructuring source code without changing the external behavior of the software. Refactoring can bring many benefits, such as removing code with poor structural quality, avoiding or reducing technical debt, and improving maintainability, reuse, or code readability. Although there is research on how to predict refactorings, there is still a clear lack of studies that assess the impact of operations considered less complex (trivial) on more complex (non-trivial) ones. In addition, the literature suggests conducting studies that invest in improving automated solutions through detecting and correcting refactoring. This study aims to accurately identify refactoring activity in non-trivial operations through trivial operations. For this, we use supervised learning classifier models, considering the influence of trivial refactorings and evaluating performance in other data domains. To achieve this goal, we assembled 3 datasets totaling 1,291 open-source projects, extracted approximately 1.9M refactoring operations, collected 45 attributes and code metrics from each file involved in the refactoring and used the supervised learning algorithms Decision Tree, Random Forest, Logistic Regression, Naive Bayes and Neural Network to investigate the impact of trivial refactorings on the prediction of non-trivial refactorings. For this study, we contextualize the data and call each experiment configuration that combines trivial and non-trivial refactorings a context. Our results indicate that: (i) Tree-based models such as Random Forest and Decision Tree, as well as Neural Networks, performed very well when trained with code metrics to detect refactoring opportunities. However, only the first two were able to demonstrate good generalization in other data domain contexts of refactoring; (ii) Separating trivial and non-trivial refactorings into different classes resulted in a more efficient model. This approach still resulted in a more efficient model even when tested on different datasets; (iii) Using balancing techniques that increase or decrease samples may not be the best strategy to improve models trained on datasets composed of code metrics and configured according to our study.
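
Both the citation statement and the abstract above refer to balancing techniques that increase or decrease the number of samples. As a rough illustration of the "increase" side only, the sketch below performs plain random oversampling: minority-class rows are duplicated at random until every class matches the majority count. This is an assumption about what such a step could look like, not the procedure used in either paper; it is not SMOTE (which synthesizes new interpolated samples) and not RUSBoost. The function name `random_upsample` and the toy metric values are hypothetical.

```python
import random
from collections import Counter


def random_upsample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equally frequent."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Candidate rows to duplicate for this class
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: (lines of code, cyclomatic complexity) per method;
# label 1 = method was refactored, label 0 = it was not.
rows = [(12, 1), (40, 6), (8, 1), (95, 14), (15, 2), (10, 1)]
labels = [0, 0, 0, 1, 0, 0]

bal_rows, bal_labels = random_upsample(rows, labels)
print(Counter(bal_labels))
```

Because the duplicated rows carry no new information, a classifier trained on the balanced set can overfit the repeated minority samples, which is one reason such balancing does not always help.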

“…Panigrahi, Kuanar, and Kumar [44] proposed a model for predicting the opportunities of using refactoring at the method level by using three Naïve Bayes classifiers (Gaussian (GNB), Multinomial (MNB), and Bernoulli (BNB)). The results of the experiment on the performance of the three Naïve Bayes classifiers demonstrated that the Bernoulli Naïve Bayes classifier outperforms the other two classifiers in terms of accuracy.…”

Section: A Machine Learning-based Refactoring Predictions (mentioning)

confidence: 99%
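
For readers unfamiliar with the Bernoulli variant named in the statement above, the following is a minimal pure-Python sketch of a Bernoulli Naïve Bayes classifier over binary features with Laplace smoothing. It is an illustration under stated assumptions, not the cited authors' implementation: the class name `BernoulliNB`, the smoothing value `alpha=1.0`, and the toy binarized features (long method, high complexity, many parameters) are all hypothetical.

```python
import math


class BernoulliNB:
    """Minimal Bernoulli Naive Bayes for 0/1 features (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing; keeps probabilities away from 0 and 1

    def fit(self, X, y):
        n_samples, n_features = len(y), len(X[0])
        self.classes = sorted(set(y))
        self.log_prior = {}
        self.p = {}  # p[c][j] = smoothed estimate of P(feature j = 1 | class c)
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / n_samples)
            self.p[c] = [
                (sum(r[j] for r in rows) + self.alpha) / (len(rows) + 2 * self.alpha)
                for j in range(n_features)
            ]
        return self

    def _log_score(self, x, c):
        # log P(c) + sum over features of log P(x_j | c), assuming independence
        score = self.log_prior[c]
        for xj, pj in zip(x, self.p[c]):
            score += math.log(pj if xj else 1.0 - pj)
        return score

    def predict(self, X):
        return [max(self.classes, key=lambda c: self._log_score(x, c)) for x in X]


# Hypothetical binarized method-level features:
# [long method, high complexity, many parameters]; label 1 = refactored.
X = [[1, 1, 0], [1, 1, 1], [0, 1, 1], [0, 0, 0], [0, 0, 1], [1, 0, 0]]
y = [1, 1, 1, 0, 0, 0]

model = BernoulliNB().fit(X, y)
predictions = model.predict(X)
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(predictions, accuracy)
```

The Gaussian and Multinomial variants differ only in how the per-feature likelihood is modeled (a normal density over continuous values, or count-based frequencies), so the same skeleton applies with a different `_log_score`.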

Revisiting Scenarios of Using Refactoring Techniques to Improve Software Systems Quality

et al. 2023

Refactoring is one of the most widely used techniques in practice to improve the quality of existing software. However, it is observed that refactoring does not always improve all software quality attributes. Recent studies indicated that different refactoring techniques have significantly different, sometimes opposite, and conflicting effects on software quality attributes. In other words, there is contradictory evidence on the refactoring benefit. As a result, developers face challenges in selecting appropriate refactoring techniques when they use them to improve software quality. To the best of our knowledge, no study has investigated factors that may explain inconsistent or diverging results concerning the effect of refactoring techniques on software quality. Therefore, in this study, the factor "scenarios of using refactoring techniques" has been identified, investigated, and thoroughly analyzed. Ten of the most commonly used refactoring techniques in practice have been chosen and individually applied in seven case studies of varying sizes (small, medium, and large). The Quality Model for Object-Oriented Design (QMOOD) is used to assess how refactoring techniques affect quality attributes. The findings provide strong evidence that this factor plays a significant role in producing the various effects of refactoring techniques on quality attributes. These findings can help software developers understand how to use refactoring techniques to improve software quality while taking this factor into account. The best scenario for using each refactoring technique to improve software system quality has been identified. The findings can provide guidelines for software developers to use refactoring techniques to improve the quality of software systems based on the best scenarios of using the refactoring techniques.

“…Classification Logistic regression [34] is used to solve multiclass problems when dependent variables are nominal by using logistic regression analysis.…”

Section: Multinomial Logistic Regression (mentioning)

confidence: 99%

Severity classification of software code smells using machine learning techniques: A comparative study

Abdou, Darwish. 2022. J Software Evolu Process

A code smell is a software characteristic that indicates bad symptoms in code design, which cause problems related to software quality. The severity of code smells must be measured because it helps developers prioritize refactoring efforts. Recently, several studies focused on the prediction of design pattern errors using different detection tools. However, there is a lack of empirical studies regarding how to measure the severity of code smells and which learning model is best at detecting that severity. To close this gap, this paper focuses on severity classification of code smells using several machine learning models such as regression models, multinomial models, and ordinal classification models. The Local Interpretable Model Agnostic Explanations (LIME) algorithm was further used to explain the machine learning model's predictions and improve interpretability. In addition, we extract the prediction rules generated by the Projective Adaptive Resonance Theory (PART) algorithm in order to study the effectiveness of using software metrics to predict code smells. The results of the experiments show that the accuracy of the severity classification model is higher than the baseline, and the rank correlation between predicted and actual severity reaches 0.92–0.97 under Spearman's correlation measure.
