Software Defect Prediction Using Dictionary Learning

Wan, Hongyan; Wu, Guoqing; Cheng, Ming; Huang, Qing; Wang, Rui; Yuan, Mengting

doi:10.18293/seke2017-188

Cited by 7 publications

(2 citation statements)

References 19 publications

“…As a result, software defects have become a significant challenge in software development, as they can lead to significant losses, including financial losses, damage to the reputation of the company, and even loss of life in extreme cases [2]. Software defects are expensive to fix, and they can cause project delays, leading to increased costs and lost productivity [3,4]. Therefore, software defect prediction has become an essential aspect of software engineering, as it helps to identify potential defects in advance before they cause any significant issues.…”

Section: Introduction (mentioning)

confidence: 99%

Software Defect Prediction System Based on Decision Tree Algorithm

Chinenye, Anyachebelu, Abdullahi

2023

AJRCoS

Software defect prediction plays a crucial role in ensuring software quality and minimizing the potential risks associated with defects. This study aims to develop a comprehensive software defect prediction system that utilizes tree-based algorithms to enhance accuracy, feature selection, and evaluation metrics. The study addresses the limitations of previous research by considering a broader range of datasets, comparing computational efficiency with other ensemble techniques, and examining the impact of hyperparameters on model performance. The implemented system consists of three stages: dataset loading, processing, and result presentation. The dataset loading page allows users to upload their datasets in CSV format, simplifying the prediction process. The processing page performs essential tasks such as feature engineering, normalization using min-max normalization, and training the model with the decision tree algorithm. These steps ensure the extraction of relevant features, transformation of data, and learning of patterns and correlations for accurate software defect prediction. The study emphasizes the practical implementation of the developed system, going beyond model evaluation. By providing a fully functional and integrated system, this study bridges the gap between research and real-world application. The findings of this study contribute to the field of software defect prediction by offering an improved system that enhances accuracy, feature selection, and evaluation metrics. This has implications for software development and quality assurance processes, ultimately leading to higher software quality and increased productivity.
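The processing stage described in this abstract (CSV loading, normalization, decision tree training) can be illustrated with a minimal sketch. It is not the authors' system: the column names `loc`, `complexity`, and `defective`, the toy data, and the depth-1 tree (a decision stump chosen by Gini impurity) are all assumptions made for illustration, and only the Python standard library is used.

```python
import csv
import io

# Hypothetical toy dataset; the column names are illustrative assumptions.
CSV_TEXT = """loc,complexity,defective
10,1,0
50,3,0
200,12,1
400,20,1
"""

def load_csv(text):
    """Parse CSV text into a feature matrix X and a 0/1 label list y."""
    reader = csv.DictReader(io.StringIO(text))
    X, y = [], []
    for r in reader:
        X.append([float(r["loc"]), float(r["complexity"])])
        y.append(int(r["defective"]))
    return X, y

def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def majority(labels):
    """Most common 0/1 label (ties resolve to 1)."""
    return int(sum(labels) * 2 >= len(labels))

def fit_stump(X, y):
    """Depth-1 decision tree: pick the split with lowest weighted Gini."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue  # a split must send samples to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, majority(left), majority(right))
    if best is None:
        raise ValueError("no valid split found")
    return best[1:]  # (feature index, threshold, left label, right label)

def predict(stump, row):
    """Classify one normalized row with a fitted stump."""
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label

X, y = load_csv(CSV_TEXT)
X_norm = min_max_normalize(X)
stump = fit_stump(X_norm, y)
predictions = [predict(stump, row) for row in X_norm]
```

A full decision tree would apply the same split search recursively to each side. In practice the normalization bounds should be computed on training data only and reused for new data.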

“…While solving the software defect prediction problem, incorporation of both labeled and unlabeled data in the machine learning process may lead to best possible classification results. To this end, many researchers have used Graph based learning with application of sparse theory on the dataset for pairwise relationship [8]; collaborative representation by the authors [9]; metrics-based [10]; class imbalance [11]; Dictionary learning [12], traditional methods like: Support Vector Machine (SVM) [13] , Naive Bayesian (NB) [14], Neural Network [15] and the list goes on. It is observed that performance of the traditional methods severely limited with respect to lack of common feature representation and selection of a good feature selection algorithm in order to deal with sparse nature of the software prediction dataset.…”

Section: Introduction (mentioning)

confidence: 99%

Software Defect Prediction Using Hybrid Distribution Base Balance Instance Selection and Radial Basis Function Classifier

Panda

2019

International Journal of System Dynamics Applications

Software is becoming an integral part of human life, and the rapid development of software engineering demands that software be highly reliable. Reliability can be checked by efficient software testing methods that use historical software prediction data to develop a quality software system. Machine learning plays a vital role in optimizing the prediction of defect-prone modules in real-life software. Software defect prediction data has a class imbalance problem, with a low ratio of defective to non-defective classes, which degrades classification performance and calls for an efficient machine learning classification technique. To alleviate this problem, this paper introduces a novel hybrid instance-based classification that combines distribution base balance based instance selection with a radial basis function neural network classifier model (DBBRBF) to obtain better prediction than existing research. Class-imbalanced data sets from NASA, Promise and Softlab were used for the experimental analysis. The experimental results in terms of Accuracy, F-measure, AUC, Recall, Precision and Balance show the effectiveness of the proposed approach. Finally, statistical significance tests are carried out to understand the suitability of the proposed model.

… with possible threats to validity of our approach provided in Section 6. Finally, conclusion and future scope are presented in Section 7.

Related work

The authors of [18] propose a novel machine learning approach using a multiple linear regression model to predict bug proneness in Eclipse JDT Core software defect prediction data. Considering software defect prediction as a classification task, the authors of [19] propose a SMOTE (Synthetic Minority Over-sampling Technique) ensemble-based approach to deal effectively with the class imbalance problem of the datasets used and to achieve high accuracy.
In [20], the authors propose to help software developers by identifying software defects based on existing software metrics with various classification techniques. The work in [21] evaluates software defect prediction via the Maximal Information Coefficient with Hierarchical Agglomerative Clustering (MICHAC) method on 11 widely studied NASA projects using three different classifiers (Naive Bayes, RIPPER and Random Forest) with four performance metrics (precision, recall, F-measure, and AUC), and reports its effectiveness in comparison to others. The authors of [22] discuss the application of data mining in software defect prediction for static and dynamic defects, clone defects, etc., and highlight its importance in assisting software engineering tasks. A good overview of the data quality of the NASA MDP datasets [24] is presented in [23], as reported in [25], where comprehensive rules for data cleansing are used for software defect prediction. Six state-of-the-art within-project defect prediction approaches such as: naive Bayes, Decision tree, Logistic regression, K-nearest neighbor, random forest and Bayesian n...
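Most of the evaluation measures named in the abstract above (Accuracy, F-measure, Recall, Precision, Balance) can be derived from a confusion matrix. The sketch below shows how. The abstract gives no formulas, so the Balance definition used here (one minus the normalized distance from the ideal point of zero false-alarm rate and full recall) is an assumption based on the form commonly used in defect prediction studies. AUC is omitted because it needs ranked scores rather than counts. The function name and the example counts are hypothetical.

```python
import math

def defect_metrics(tp, fp, tn, fn):
    """Compute common defect-prediction metrics from confusion-matrix counts.

    tp/fn count defective modules predicted defective/clean;
    fp/tn count clean modules predicted defective/clean.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # probability of detection
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0           # false-alarm rate
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # Assumed definition: distance from the ideal point (pf = 0, recall = 1),
    # normalized by sqrt(2) and subtracted from 1 so that higher is better.
    balance = 1 - math.sqrt((0 - pf) ** 2 + (1 - recall) ** 2) / math.sqrt(2)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
        "balance": balance,
    }

# Hypothetical imbalanced data: 10 defective and 90 clean modules.
# A classifier that labels every module as clean:
always_clean = defect_metrics(tp=0, fp=0, tn=90, fn=10)
# A classifier that finds 8 of the 10 defects with 2 false alarms:
useful = defect_metrics(tp=8, fp=2, tn=88, fn=2)
```

With these counts the always-clean classifier still reaches 90% accuracy while its recall is zero, which is why imbalance-aware measures such as F-measure and Balance are reported alongside accuracy.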
