The software engineering community is working to develop reliable metrics to improve software quality. It is estimated that understanding the source code accounts for 60% of the software maintenance effort. Cognitive informatics plays an important role in quantifying the degree of difficulty, or the effort developers expend, in understanding source code. Several empirical studies were conducted in 2003 to assign cognitive weights to each possible basic control structure of software, and several researchers have used these cognitive weights to evaluate the cognitive complexity of software systems. In this paper, we categorize the nodes of Control Flow Graphs (CFGs) according to their node features. We extracted seven unique features from each program and assigned each feature an integer value, which we evaluated through Cognitive Complexity Measures (CCMs). We then incorporated the CCM results as node feature values in the CFGs and generated each graph from its node connectivity. To obtain a feature representation of the graph, a node vector matrix is created for the graph and passed to a Graph Convolutional Network (GCN). We prepared our data sets from the GCN output and then built Deep Neural Network Defect Prediction (DNN-DP) and Convolutional Neural Network Defect Prediction (CNN-DP) models to predict software defects. The models are implemented in Python with Keras and TensorFlow. Three hundred twenty Python programs were written by our undergraduate and postgraduate students, and all experiments were carried out during laboratory classes. Together with three skilled lab programmers, the students compiled and ran each program, labeled it as defective or non-defective, and then categorized the programs into three classes: Simple, Medium, and Complex.
The approaches are evaluated using Accuracy, Receiver Operating Characteristic (ROC) curves, Area Under the Curve (AUC), F-measure, and Precision, together with hyperparameter tuning procedures. The experimental results show that the proposed models outperform state-of-the-art methods such as Naïve Bayes (NB), Decision Tree (DT), Support Vector Machine (SVM), and Random Forest (RF) on all evaluation criteria.
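The graph step described above (node features on a CFG, a node vector matrix, and a graph convolution) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the node categories and integer weights in `WEIGHTS`, the toy four-node CFG, the fixed layer weights `w`, the undirected treatment of CFG edges, and the mean pooling used to obtain one vector per graph. It uses the widely known propagation rule ReLU(D^-1/2 (A + I) D^-1/2 X W) and plain Python rather than Keras or TensorFlow, so it should not be read as the authors' implementation.

```python
import math

# Hypothetical node categories and integer weights (illustrative only;
# the paper's seven features and their values are not reproduced here).
WEIGHTS = {"sequence": 1, "branch": 2, "loop": 3, "call": 2}


def normalized_adjacency(edges, n):
    """Return D^-1/2 (A + I) D^-1/2, treating CFG edges as undirected."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0  # self-loop so each node keeps its own feature
    for u, v in edges:
        a[u][v] = 1.0
        a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_hat, x, w):
    """One graph-convolution step: ReLU(A_hat @ X @ W)."""
    z = matmul(matmul(a_hat, x), w)
    return [[max(0.0, v) for v in row] for row in z]


def graph_embedding(h):
    """Mean-pool node representations into one fixed-length graph vector."""
    n = len(h)
    return [sum(col) / n for col in zip(*h)]


# Toy CFG: node 0 (sequence) -> 1 (branch) -> {2 (loop), 3 (call)}, 2 -> 3.
node_kinds = ["sequence", "branch", "loop", "call"]
edges = [(0, 1), (1, 2), (1, 3), (2, 3)]

x = [[float(WEIGHTS[kind])] for kind in node_kinds]  # node vector matrix, 4 x 1
w = [[0.5, -0.25]]                                   # fixed 1 x 2 weights for the demo
a_hat = normalized_adjacency(edges, len(node_kinds))
h = gcn_layer(a_hat, x, w)
embedding = graph_embedding(h)
print(embedding)
```

In a trained model the matrix `w` would be learned, and the pooled vector would be one row of the data set passed to a downstream classifier.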
A healthy life is essential for a happy society; however, seemingly invisible diseases plague our families and cause suffering. Thyroid disease falls into this category. Thyroid disorders are long-term illnesses, and with careful management, people with thyroid disorders can live stable and normal lives. Thyroid diagnosis is a difficult proposition, particularly for an inexperienced clinician. Many researchers have established methods for diagnosing the disease, and several models for disease prediction have been developed. As in several other domains, machine learning approaches to modelling health care problems are gaining popularity. This study aims to provide solutions for thyroid disease prediction. Dimension reduction techniques are applied, and the reduced-dimension data are input to classifiers. Data augmentation is also applied to generate sufficient data for a deep neural network model. Classifier predictions are compared with those of similar studies. A real-life thyroid disease dataset is used, and the experiments are conducted in a distributed environment. Our proposed two-stage approach gives a maximum accuracy of 99.95%, which compares favorably with existing techniques. We show that dimension reduction and data augmentation can be used effectively to achieve high accuracy in disease prediction.
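The two-stage idea (reduce the feature dimension, then augment the data before training) can be sketched as follows. The abstract does not name the specific techniques, so both choices here are stand-ins chosen for brevity: variance-based feature selection for the reduction stage and Gaussian jitter for the augmentation stage. The function names, the toy rows, and the parameters `keep`, `copies`, and `scale` are all hypothetical.

```python
import random
import statistics


def select_high_variance(rows, keep):
    """Keep the `keep` columns with the largest variance.

    Returns the kept column indices and the reduced rows.
    """
    cols = list(zip(*rows))
    ranked = sorted(range(len(cols)),
                    key=lambda j: statistics.pvariance(cols[j]),
                    reverse=True)
    kept = sorted(ranked[:keep])
    return kept, [[row[j] for j in kept] for row in rows]


def augment_with_jitter(rows, labels, copies, scale, seed=0):
    """Append `copies` noisy duplicates of every row, keeping its label."""
    rng = random.Random(seed)  # seeded so the augmentation is repeatable
    out_rows, out_labels = list(rows), list(labels)
    for row, label in zip(rows, labels):
        for _ in range(copies):
            out_rows.append([v + rng.gauss(0.0, scale) for v in row])
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (made up): three features, the middle one constant.
rows = [[1.0, 5.0, 0.2], [2.0, 5.0, 0.9], [3.0, 5.0, 0.1], [4.0, 5.0, 0.8]]
labels = [0, 1, 0, 1]

kept, reduced = select_high_variance(rows, keep=2)
aug_rows, aug_labels = augment_with_jitter(reduced, labels, copies=2, scale=0.05)
print(kept, len(aug_rows))
```

Reducing first and augmenting second means the synthetic rows are generated only in the retained feature space, which is one plausible ordering for such a pipeline, not necessarily the one used in the study.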
Deep neural network models built with appropriate design decisions are crucial to obtaining the desired classifier performance. This is especially important when predicting the fault proneness of software modules. Correctly identifying fault-prone modules can help reduce testing cost by directing more effort towards them. To build an efficient deep neural network model, parameters such as the number of hidden layers and the number of nodes in each layer, as well as training details such as the learning rate and regularization methods, must be investigated in detail. The objective of this paper is to show the importance of hyperparameter tuning in developing efficient deep neural network models for predicting the fault proneness of software modules, and to compare the results with other machine learning algorithms. The proposed model is shown to outperform the other algorithms in most cases.
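As an illustration of the tuning procedure described above, the following is a minimal sketch of an exhaustive grid search over the parameters the abstract lists. The value ranges in `SEARCH_SPACE`, the use of dropout as the regularization method, and the `placeholder_score` function are assumptions; in practice the scoring function would train a network with the given configuration and return a validation metric.

```python
from itertools import product

# Hypothetical search space; the values are illustrative, not from the paper.
SEARCH_SPACE = {
    "hidden_layers": [1, 2, 3],
    "nodes_per_layer": [16, 32],
    "learning_rate": [0.01, 0.001],
    "dropout": [0.0, 0.2],
}


def grid(space):
    """Yield every combination of hyperparameter values as a dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def grid_search(space, evaluate):
    """Return the configuration with the highest score under `evaluate`."""
    best_cfg, best_score = None, float("-inf")
    for cfg in grid(space):
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def placeholder_score(cfg):
    """Synthetic stand-in for a validation score; not a real result.

    Replace with code that trains a model using `cfg` and returns, for
    example, cross-validated AUC.
    """
    return (-abs(cfg["hidden_layers"] - 2)
            - abs(cfg["nodes_per_layer"] - 32) / 32
            - abs(cfg["learning_rate"] - 0.001) * 100
            - cfg["dropout"])


best_cfg, best_score = grid_search(SEARCH_SPACE, placeholder_score)
print(best_cfg, best_score)
```

Grid search is the simplest option and its cost grows multiplicatively with each added parameter; random or Bayesian search are common alternatives when the space is larger.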