2021
DOI: 10.1002/smr.2367

An empirical comparison of validation methods for software prediction models

Abstract: Model validation methods (e.g., k-fold cross-validation) use historical data to predict how well an estimation technique (e.g., random forest) performs on the current (or future) data. Studies in the contexts of software development effort estimation (SDEE) and software fault prediction (SFP) have used and investigated different model validation methods. However, no conclusive indications to suggest which model validation method has a major impact on the prediction accuracy and stability of estimation techniqu…


Cited by 7 publications (4 citation statements)
References 62 publications
“…Examples of hyperparameters include the number of epochs, batch size, and learning rate. Adjusting these hyperparameters may significantly impact the performance of a machine learning model, and finding optimal values for a given problem is typically a critical step in the model development procedure [45]. Several tests were conducted before the current study to determine the appropriate value range.…”
Section: RNN-based Deep Learning (RNNBDL) Approach (mentioning)
confidence: 99%
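The tuning step the citing authors describe (searching for good values of learning rate, batch size, and epochs) can be sketched as a plain grid search. The `evaluate` function below is a toy stand-in for training and scoring a model; it is not part of the cited study:

```python
from itertools import product

# Hypothetical scoring function: in practice this would train the model
# with the given hyperparameters and return a validation score.
def evaluate(learning_rate, batch_size, epochs):
    # Toy stand-in that peaks at lr=0.01, batch_size=32, epochs=20.
    return -abs(learning_rate - 0.01) - abs(batch_size - 32) / 100 - abs(epochs - 20) / 100

grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64],
    "epochs": [10, 20, 50],
}

best_score, best_params = float("-inf"), None
for lr, bs, ep in product(grid["learning_rate"], grid["batch_size"], grid["epochs"]):
    score = evaluate(lr, bs, ep)
    if score > best_score:
        best_score, best_params = score, (lr, bs, ep)

print(best_params)  # (0.01, 32, 20)
```

In a real setting each grid point would be scored with a validation method such as k-fold cross-validation rather than a single evaluation.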
“…The selection of appropriate parameters is a critical and complicated aspect of network training; due to constraints such as memory limitations, trade-offs are inherently present in parameter selection [3], [47]. Throughout the examination, several hyper-parameters were varied to measure their influence on accuracy.…”
Section: Hyper Parameter (mentioning)
confidence: 99%
“…Nested k-fold cross-validation is another way to tune the parameters of an algorithm. The data are divided into k folds, and one fold is reserved for testing [3]. The remaining k-1 folds are used for training and validation.…”
Section: Cross Validation (mentioning)
confidence: 99%
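The nested scheme quoted above can be sketched in a few lines: an outer loop holds out one fold for testing, and an inner loop over the remaining folds picks the best parameter. The `fit_score` callback is an assumed user-supplied function (train on `train_idx`, score on `test_idx` under `param`), not a real library API:

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def nested_cv(data, k, candidate_params, fit_score):
    """Nested k-fold CV: inner folds select a parameter, outer fold tests it."""
    folds = k_fold_indices(len(data), k)
    outer_scores = []
    for i, test_fold in enumerate(folds):
        inner_folds = [f for j, f in enumerate(folds) if j != i]
        # Inner loop: pick the parameter with the best mean validation score.
        best_param, best_mean = None, float("-inf")
        for param in candidate_params:
            scores = []
            for v, val_fold in enumerate(inner_folds):
                train = [x for j, f in enumerate(inner_folds) if j != v for x in f]
                scores.append(fit_score(train, val_fold, param))
            mean = sum(scores) / len(scores)
            if mean > best_mean:
                best_param, best_mean = param, mean
        # Outer evaluation: score the chosen parameter on the held-out fold.
        train = [x for f in inner_folds for x in f]
        outer_scores.append(fit_score(train, test_fold, best_param))
    return sum(outer_scores) / len(outer_scores)
```

Because the test fold never participates in parameter selection, the returned mean score is an unbiased estimate of performance with tuned parameters.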
“…Model validation is the process in which the trained model is evaluated with a testing data set to foresee how well the estimation method performs [59]. The testing data set is a separate portion of the same data set from which the training set is derived.…”
Section: Model Validation (mentioning)
confidence: 99%
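The holdout arrangement described above (a testing portion carved out of the same data set as the training portion) can be sketched with a simple random split; the fractions and seed here are illustrative choices, not values from the cited work:

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    """Hold out a random test portion of the data set; the rest is training."""
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    n_test = int(len(data) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [data[i] for i in train_idx], [data[i] for i in test_idx]

train, test = train_test_split(list(range(100)))
print(len(train), len(test))  # 80 20
```

Since the two portions are disjoint, the score on the test set estimates performance on data the model has not seen during training.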