Objectives: Coronary heart disease is the leading cause of death worldwide, and it is important to diagnose the level of the disease. Intelligent systems have been shown to support diagnosis of the disease. Unfortunately, most of the available data are imbalanced across the levels/types of coronary heart disease, and as a result system performance is low. Methods: This paper proposes an intelligent system for diagnosing the level of coronary heart disease that takes the problem of data imbalance into account. The first stage of this research was preprocessing, which included resampling with non-stratified random sampling (R), the synthetic minority over-sampling technique (SMOTE), cleaning data with out-of-range attribute values (COR), and removing duplicates (RD). The second stage was splitting the data for training and testing using k-fold cross-validation and training a multiclass classifier with the K-star algorithm. The third stage was performance evaluation. The proposed system was evaluated using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the curve (AUC), and F-measure. Results: The proposed system achieved an average sensitivity of 80.1%, specificity of 95%, PPV of 80.1%, NPV of 95%, AUC of 87.5%, and F-measure of 80.1%. Without consideration of data imbalance, the system showed sensitivity of 53.1%, specificity of 88.3%, PPV of 53.1%, NPV of 88.3%, AUC of 70.7%, and F-measure of 53.1%. Conclusions: Based on these results, it can be concluded that the proposed system delivers classification performance in the good category.
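The SMOTE step named in the Methods above can be illustrated with a minimal sketch of its core idea: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, the toy data, and the parameters `k` and `seed` are assumptions made for illustration; this is not the authors' implementation and omits the other preprocessing steps (R, COR, RD).

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of feature vectors (lists of floats) from the minority class.
    k:        number of nearest minority neighbours to choose from (assumed value).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # random position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority class with four samples.
minority = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]]
new_points = smote_sketch(minority, n_new=4)
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the region already occupied by that class rather than duplicating existing records.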
Improving the performance of diagnosis systems for coronary heart disease has been an important research topic for several decades. One improvement is feature selection, so that only the influential attributes are used by a diagnosis system built on data mining algorithms. Unfortunately, most feature selection is done on the assumption that all necessary attributes are already available, regardless of the stage at which each attribute is obtained and the cost required. This research proposes a hybrid model for the diagnosis of coronary heart disease. Diagnosis is preceded by a feature selection process using tiered multivariate analysis, with logistic regression as the analytical method. The next stage is classification using a multi-layer perceptron neural network. Based on test results, the proposed system achieved accuracy of 86.3%, sensitivity of 84.80%, specificity of 88.20%, positive predictive value (PPV) of 90.03%, negative predictive value (NPV) of 81.80%, and area under the curve (AUC) of 92.1%. This performance was obtained using a combination of attributes from risk factors, symptoms, and exercise ECG. It can be concluded that the proposed diagnosis system delivers performance in the very good category with a small number of examination attributes and at relatively low cost.
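The accuracy, sensitivity, specificity, PPV, and NPV reported above follow the standard confusion-matrix definitions for a binary diagnosis. A minimal sketch is given here; the function name `diagnostic_metrics` and the counts in the example are hypothetical and are not taken from the paper.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases correctly detected
        "specificity": tn / (tn + fp),  # healthy cases correctly cleared
        "ppv": tp / (tp + fp),          # positive calls that are truly diseased
        "npv": tn / (tn + fn),          # negative calls that are truly healthy
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for 100 patients, for illustration only.
m = diagnostic_metrics(tp=40, fp=5, tn=45, fn=10)
```

With these illustrative counts the sensitivity is 40/50 and the specificity is 45/50, which shows how the two can differ even when overall accuracy looks reasonable.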
Keywords: Coronary heart disease; Diagnosis; Logistic regression; Multi-layer neural network; Multivariate
Objectives: The interpretation of clinical data for the diagnosis of coronary heart disease can be done using data mining algorithms. Most clinical data interpretation systems for diagnosis are developed using data mining algorithms with a black-box approach, which cannot reveal the relationships between examination attributes and the incidence of coronary heart disease. Methods: This study proposes a system to interpret clinical examination results for the diagnosis of coronary heart disease based on the decision tree algorithm. The system comprises several stages: oversampling with the synthetic minority oversampling technique (SMOTE), feature selection, and classification with the C4.5 algorithm. System testing is done using k-fold cross-validation. The performance parameters are sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the area under the curve (AUC). Results: The system achieved a sensitivity of 74.7%, a specificity of 93.7%, a PPV of 74.2%, an NPV of 93.7%, and an AUC of 84.2%. Conclusions: This study demonstrated that, by using the C4.5 algorithm, data can be interpreted in the form of a decision tree to aid the clinician's understanding. In addition, the proposed system can provide better performance by category.
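C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the entropy of the split itself. A minimal sketch for a categorical attribute follows; the function names and the toy attribute values are assumptions for illustration, and the sketch leaves out C4.5's handling of continuous attributes, missing values, and pruning.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to class labels."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected label entropy after splitting on the attribute.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # penalises attributes with many values
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one informative and one uninformative attribute.
labels = ["sick", "sick", "healthy", "healthy"]
informative = ["typical", "typical", "atypical", "atypical"]
uninformative = ["a", "b", "a", "b"]

ratio_informative = gain_ratio(informative, labels)
ratio_uninformative = gain_ratio(uninformative, labels)
```

The attribute with the highest gain ratio becomes a node in the tree, which is what makes the resulting model readable as a sequence of examination-based decisions.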
Coronary heart disease is the disease with the highest mortality rate in the world. This makes the development of diagnostic systems a very interesting topic in biomedical informatics, with the aim of detecting whether a heart is normal or not. The literature contains diagnostic system models that combine dimension reduction and data mining techniques. Unfortunately, no review papers to date discuss and analyze these themes. This study reviews articles from the period 2009-2016, with a focus on dimension reduction methods and data mining techniques validated using datasets from the UCI repository. Dimension reduction methods comprise feature selection and feature extraction techniques, while data mining techniques include classification, prediction, clustering, and association rules.